Is there any good open-source or freely available Chinese segmentation algorithm?
As phrased in the question, I'm looking for a free and/or open-source text-segmentation algorithm for Chinese. I understand it is a difficult task to solve, since there are many ambiguities involved. I know there's Google's API, but it is rather a black box, i.e. not much information about what it is doing gets through.
The keyword for text segmentation of Chinese should be 中文分词 (Chinese word segmentation) in Chinese.
Good, active open-source text-segmentation algorithms:
- 盘古分词 (Pan Gu Segment): C#, snapshot
- ik-analyzer: Java
- ictclas: C/C++, Java, C#, demo
- nlpbamboo: C, PHP, PostgreSQL
- httpcws: based on ictclas, demo
- mmseg4j: Java
- fudannlp: Java, demo
- smallseg: Python, Java, demo
- nseg: Node.js
- mini-segmenter: Python
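Other sample:
- Google Chrome (Chromium): src, cc_cedict.txt (73,145 Chinese words/phrases)

In a text field or textarea of Google Chrome containing Chinese sentences, press Ctrl+← or Ctrl+→, or double-click on a sentence such as 中文分词指的是将一个汉字序列切分成一个一个单独的词 ("Chinese word segmentation means splitting a sequence of Chinese characters into individual words"), to see where the browser places the word boundaries.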
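To give a feel for how a word list such as cc_cedict.txt can drive segmentation, here is a minimal sketch of forward maximum matching, a simple greedy dictionary technique. It is not claimed to be what Chrome or any of the libraries listed above actually implement; the function name `forward_max_match`, the `max_len` parameter, and the tiny `TOY_DICT` are all hypothetical and chosen only for illustration.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Greedy left-to-right segmentation.

    At each position, take the longest substring (up to max_len characters)
    found in the dictionary; fall back to a single character otherwise.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink it.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted so the loop advances.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy dictionary; a real one would be loaded from a word list.
TOY_DICT = {"中文", "分词", "汉字", "序列", "切分", "单独", "一个"}

sample = "中文分词指的是将一个汉字序列切分成一个一个单独的词"
segments = forward_max_match(sample, TOY_DICT)
print(" / ".join(segments))
```

Greedy matching like this is fast but cannot resolve the ambiguities mentioned in the question, which is why the libraries above are worth evaluating for real use.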