Package: ucto (0.5.2-2)
Links for ucto
Maintainers:
- Debian Science Team
- Joost van Baal-Ilić
- Ko van der Sloot
External Resources:
- Homepage [ilk.uvt.nl]
Unicode Tokenizer
Ucto tokenizes UTF-8 encoded text files: it separates words from punctuation, splits sentences, and generates n-grams. It also offers several basic preprocessing steps (changing case, counting words and characters, and reversing lines) that prepare text for further processing such as indexing, part-of-speech tagging, or machine translation.
Ucto is a product of the ILK Research Group, Tilburg University (The Netherlands).
If you are interested in machine parsing of UTF-8 encoded text files, e.g. to do scientific research in natural language processing, ucto will likely be of use to you.
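To make the terms in the description concrete, here is a minimal Python sketch of what separating words from punctuation and generating n-grams means. It is only an illustration under simple assumptions: it is not ucto's algorithm, rule set, or API, and the names `TOKEN_RE`, `tokenize`, and `ngrams` are hypothetical.

```python
import re

# Illustrative only: this is NOT how ucto is implemented.
# A token is either a run of word characters or one punctuation character.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Split text into word tokens and single-character punctuation tokens."""
    return TOKEN_RE.findall(text)


def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = tokenize("Hallo, wereld!")
bigrams = ngrams(tokens, 2)
print(tokens)
print(bigrams)
```

A real tokenizer such as ucto handles cases this sketch ignores, for example abbreviations, URLs, and sentence boundaries.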
Other Packages Related to ucto
- dep: libc6 (>= 2.6)
  - Embedded GNU C Library: Shared libraries
  - also a virtual package provided by libc6-udeb
- dep: libfolia1
  - implementation of the FoLiA document format
- dep: libgcc1 (>= 1:4.1.1)
  - GCC support library
- dep: libicu48 (>= 4.8-1)
  - International Components for Unicode
- dep: libstdc++6 (>= 4.4.0)
  - GNU Standard C++ Library v3
- dep: libucto1
  - Unicode Tokenizer - runtime
- dep: libxml2 (>= 2.6.27)
  - GNOME XML library
Download ucto
| Architecture | Package Size | Installed Size | Files |
|---|---|---|---|
| sparc | 35.2 kB | 121.0 kB | [list of files] |
