Apache Stanbol - Stanbol Enhancer Natural Language Processing Support
Integrated NLP frameworks
-
OpenNLP: Apache OpenNLP is the default NLP processing framework used by Stanbol. OpenNLP supports Sentence Detection, Tokenization, Part of Speech tagging, Chunking and Named Entity Recognition for several languages. Users can extend support to additional languages by providing their own statistical models.
-
Smartcn: The Lucene Smartcn Analyzer integration provides basic language support for Chinese by providing Sentence Detection and Tokenization engines.
-
Paoding: The Paoding Analyzer is an alternative to Smartcn for basic Chinese language support. Paoding only supports Tokenization and is therefore best used in combination with the Smartcn Sentence Detection engine.
-
CELI / linguagrid.org: CELI contributed Stanbol EnhancementEngines based on their NLP processing framework. It supports Named Entity Recognition for French and Italian as well as Lemmatization and lexical analysis for Italian, Danish, Russian, Romanian and Swedish. In addition, CELI also provides a Language Identification service.
NOTE: This Engine will send the processed content to the CELI server. Users are required to create an account for the CELI service.
-
Gosen: Lucene-Gosen is an LGPL licensed Analyzer for Japanese. The Apache Stanbol Integration supports Sentence Detection, Tokenization, Part of Speech tagging as well as Named Entity Recognition.
NOTE: As the license of Lucene-Gosen is not compatible with the ASL this project is hosted on https://github.com/westei/stanbol-gosen and is NOT a part of Apache Stanbol. Users that want to use it will need to download it themselves.
-
Freeling: Freeling is a GPL-licensed NLP processing framework implemented in C++. It supports Sentence Detection, Tokenization, Part of Speech tagging, Chunking and Named Entity Recognition for several languages including English, Spanish, Italian, Russian and Portuguese.
The integration is based on the RESTful NLP analysis service specification. That means that users will need to install and configure Freeling and then run the Stanbol Freeling Server. After that they can use this server by configuring the RESTful NLP Analysis Engine with the /analysis endpoint, and the RESTful NLP Language Identification Engine with the /langident endpoint, of their Stanbol Freeling Server.
NOTE: As the license of Freeling is not compatible with the ASL, this project is hosted on https://github.com/insideout10/stanbol-freeling and is NOT a part of Apache Stanbol. Users that want to use it will need to download and install it themselves.
-
Talismane: Talismane is an AGPL-licensed NLP processing framework implemented in Java. It supports Sentence Detection, Tokenization and Part of Speech tagging for French.
The integration is based on the RESTful NLP analysis service specification. That means that users will need to download and build the Stanbol-Talismane project and then run the Stanbol Talismane Server. After that they can use this server by configuring the RESTful NLP Analysis Engine with the /analysis endpoint of their Stanbol-Talismane server.
NOTE: As the license of Talismane is not compatible with the ASL, this project is hosted on https://github.com/westei/stanbol-talismane and is NOT a part of Apache Stanbol. Users that want to use it will need to download and install it themselves.
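For the Freeling and Talismane integrations, the value configured in the RESTful engines is the server's base URL followed by /analysis or /langident. The following is a minimal sketch of deriving those endpoint URLs. The class name NlpEndpoints, the helper endpoint() and the base URL http://localhost:8080 are hypothetical and chosen only for illustration; the actual host, port and path depend on how the Stanbol Freeling or Talismane server was started.

```java
import java.net.URI;

class NlpEndpoints {

    /**
     * Resolves a service name such as "analysis" or "langident" against
     * the base URL of a RESTful NLP analysis server.
     * Hypothetical helper, not part of Apache Stanbol.
     */
    static URI endpoint(String baseUrl, String service) {
        // Make sure the base ends with '/' so resolve() appends the
        // service name instead of replacing the last path segment.
        String base = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
        return URI.create(base).resolve(service);
    }

    public static void main(String[] args) {
        // Assumed base URL of a locally running Stanbol Freeling Server.
        String freelingServer = "http://localhost:8080";

        // Value for the RESTful NLP Analysis Engine.
        System.out.println(endpoint(freelingServer, "analysis"));
        // Value for the RESTful NLP Language Identification Engine.
        System.out.println(endpoint(freelingServer, "langident"));
    }
}
```

A Talismane server only needs the analysis endpoint, since the text above lists no language identification service for it.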