Package: frog (0.15-1) [debports]
Links for frog
- Homepage [languagemachines.github.io]
tagger and parser for natural languages (runtime)
Memory-Based Learning (MBL) is a machine-learning method applicable to a wide range of tasks in Natural Language Processing (NLP).
Frog is a modular system integrating a morphosyntactic tagger, lemmatizer, morphological analyzer, and dependency parser for natural languages. It is based on its predecessor TADPOLE (TAgger, Dependency Parser, and mOrphoLogical analyzEr). Using Memory-Based Learning techniques, Frog tokenizes, tags, lemmatizes, and morphologically segments word tokens in incoming UTF-8 text files, and assigns a dependency graph to each sentence. Frog is aimed in particular at the growing need for fast, automatic NLP systems that can handle very large (multi-million to billion word) document collections, which are becoming available through the progressive digitization of both new and old textual data. So far, Frog has been tested and used only on Dutch-language corpora (see the frogdata package for samples).
Frog is a product of the Centre of Language and Speech Technology at Radboud University Nijmegen. It subsumes previous work by the ILK Research Group (Tilburg University, The Netherlands) and the CLiPS Research Centre (University of Antwerp, Belgium).
If you do scientific research in NLP, Frog will likely be of use to you.
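The Memory-Based Learning idea mentioned in the description can be illustrated with a small sketch: training examples are stored as-is, and a new instance receives the majority label of its most similar stored examples. This is a simplified nearest-neighbour illustration only, not the code used by Frog or its libraries; the feature layout (previous word, focus word, next word), the toy Dutch data, and the function names are all assumptions made for the example.

```python
from collections import Counter


def overlap_distance(a, b):
    """Count the feature positions at which two instances differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def classify(memory, instance, k=1):
    """Return the majority label among the k stored examples nearest to instance."""
    # sorted() is stable, so ties keep the order in which examples were stored.
    nearest = sorted(memory, key=lambda ex: overlap_distance(ex[0], instance))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy memory: (previous word, focus word, next word) -> coarse tag.
memory = [
    (("de", "kat", "zit"), "N"),
    (("de", "hond", "loopt"), "N"),
    (("kat", "zit", "op"), "V"),
    (("hond", "loopt", "naar"), "V"),
]

# An unseen focus word is tagged by analogy with the stored contexts.
print(classify(memory, ("de", "vogel", "zit"), k=3))
```

No model is fitted in advance here; all the work happens at classification time by comparison against stored examples, which is the defining trait of memory-based approaches.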
Other packages related to frog
- dep: libfolia9
- Implementation of the FoLiA document format
- dep: libfrog1
- tagger and parser for Dutch language (library)
- dep: libgcc1 (>= 1:3.0)
- GCC support library
- dep: libgomp1 (>= 4.2.1)
- GCC OpenMP (GOMP) support library
- dep: libicu63 (>= 63.1-1~)
- International Components for Unicode
- dep: libmbt1
- memory-based tagger-generator and tagger - runtime
- dep: libstdc++6
- GNU Standard C++ Library v3
- dep: libticcutils5
- utility functions used in the context of Natural Language Processing (library)
- dep: libtimbl4
- Tilburg Memory Based Learner - runtime
- dep: libucto3
- Unicode Tokenizer - runtime
- dep: libxml2 (>= 2.6.27)
- GNOME XML library
- rec: ucto
- Unicode Tokenizer
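
The relation entries above pair a kind (`dep` for a dependency, `rec` for a recommendation) with a package name and, optionally, a version constraint in parentheses. The following sketch shows one way such lines could be split into their parts. The regular expression and the `parse_relation` helper are illustrative assumptions for the notation as it appears on this page, not part of any Debian tool.

```python
import re

# Matches entries such as "- dep: libgcc1 (>= 1:3.0)" or "- rec: ucto".
# The operator list covers the Debian version relations <<, <=, =, >=, >>.
RELATION = re.compile(
    r"^-?\s*(?P<kind>dep|rec):\s+(?P<name>\S+)"
    r"(?:\s+\((?P<op><<|<=|=|>=|>>)\s*(?P<version>[^)]+)\))?\s*$"
)


def parse_relation(line):
    """Return a dict with kind, name, op, and version, or None for other lines."""
    match = RELATION.match(line.strip())
    if match is None:
        return None
    return match.groupdict()


for entry in ["- dep: libgcc1 (>= 1:3.0)", "- rec: ucto", "- Unicode Tokenizer"]:
    print(parse_relation(entry))
```

Description lines such as "- Unicode Tokenizer" do not start with a relation kind, so the helper returns `None` for them, and entries without a constraint leave `op` and `version` unset.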