[ Source: ucto ]

Package: ucto (0.5.2-2)

Unicode Tokenizer

Ucto can tokenize UTF-8 encoded text files (i.e. separate words from punctuation, split sentences, generate n-grams), and offers several other basic preprocessing steps (change case, count words/characters and reverse lines) that make your text suited for further processing such as indexing, part-of-speech tagging, or machine translation.

Ucto is a product of the ILK Research Group, Tilburg University (The Netherlands).

If you are interested in machine parsing of UTF-8 encoded text files, e.g. to do scientific research in natural language processing, ucto will likely be of use to you.
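The preprocessing steps named in the description (separating words from punctuation, splitting sentences, generating n-grams) can be illustrated with a minimal Python sketch. This is not ucto's implementation or API: the function names `tokenize`, `split_sentences`, and `ngrams` are hypothetical, and the rules are deliberately naive (a regular expression for tokens, and `.`, `!`, `?` as the only sentence boundaries), whereas a real tokenizer handles abbreviations, URLs, and language-specific rules.

```python
import re
from typing import List, Tuple

# Hypothetical illustration only; NOT ucto's code or API.
# A token is either a run of word characters or a single
# non-space, non-word character (i.e. one punctuation mark).
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize(text: str) -> List[str]:
    """Separate words from punctuation in a Unicode string."""
    return TOKEN_RE.findall(text)


def split_sentences(tokens: List[str]) -> List[List[str]]:
    """Group tokens into sentences, closing one at '.', '!' or '?'."""
    sentences: List[List[str]] = []
    current: List[str] = []
    for tok in tokens:
        current.append(tok)
        if tok in {".", "!", "?"}:
            sentences.append(current)
            current = []
    if current:  # trailing tokens without a final sentence mark
        sentences.append(current)
    return sentences


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-token windows over the token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
```

For instance, `tokenize("Hello, world!")` separates the comma and the exclamation mark from the two words, and `ngrams(tokens, 2)` then yields each adjacent token pair.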

Tags: Implemented in: C++, Role: Program

Other Packages Related to ucto

  • depends
  • recommends
  • suggests
  • enhances

Download ucto

Download for all available architectures
Architecture  Package Size  Installed Size  Files
amd64         36.7 kB       153.0 kB        [list of files]