Package: frog (0.12.15-3)
- Debian Science Team
- Joost van Baal-Ilić
- Ko van der Sloot
- Homepage [ilk.uvt.nl]
tagger and parser for the Dutch language
Memory-Based Learning (MBL) is a machine-learning method applicable to a wide range of tasks in Natural Language Processing (NLP).
Frog is a modular system integrating a morphosyntactic tagger, lemmatizer, morphological analyzer, and dependency parser for the Dutch language. It is based upon its predecessor TADPOLE (TAgger, Dependency Parser, and mOrphoLogical analyzEr). Using Memory-Based Learning techniques, Tadpole tokenizes, tags, lemmatizes, and morphologically segments word tokens in incoming Dutch UTF-8 text files, and assigns a dependency graph to each sentence. Tadpole is particularly targeted at the increasing need for fast, automatic NLP systems applicable to very large (multi-million to billion word) document collections that are becoming available due to the progressive digitization of both new and old textual data.
NB: Frog can be considered alpha software, and is in a fair state of flux.
Frog is a product of the ILK Research Group (Tilburg University, The Netherlands) and the CLiPS Research Centre (University of Antwerp, Belgium).
If you do scientific research in NLP, Frog will likely be of use to you.
Other packages related to frog
- dep: libfolia1
- implementation of the FoLiA document format
- dep: libgcc1 (>= 1:4.1.1)
- GCC support library
- dep: libgomp1 (>= 4.2.1)
- GCC OpenMP (GOMP) support library
- dep: libicu48 (>= 4.8-1)
- International Components for Unicode
- dep: libmbt0
- memory-based tagger-generator and tagger - runtime
- dep: libpython2.7 (>= 2.7)
- Shared Python runtime library (version 2.7)
- dep: libstdc++6 (>= 4.4.0)
- GNU Standard C++ Library v3
- dep: libtimbl3
- Tilburg Memory Based Learner - runtime
- dep: libtimblserver2
- Server extensions for Timbl - runtime
- dep: libucto1
- Unicode Tokenizer - runtime
- dep: python (>= 2.5)
- interactive high-level object-oriented language (default version)
- dep: python-support (>= 0.90.0)
- automated rebuilding support for Python modules
- dep: ucto
- Unicode Tokenizer