Apache Stanbol - Stanbol Enhancer Natural Language Processing Support




Integrated NLP frameworks

  • OpenNLP: Apache OpenNLP is the default NLP processing framework used by Stanbol. OpenNLP supports Sentence Detection, Tokenization, Part of Speech tagging, Chunking and Named Entity Recognition for several languages. Users can extend support to additional languages by providing their own statistical models.

  • Smartcn: The Lucene Smartcn Analyzer integration offers basic language support for Chinese through Sentence Detection and Tokenization engines.

  • Paoding: The Paoding Analyzer is an alternative to Smartcn for basic Chinese language support. Paoding only supports Tokenization and is therefore best used in combination with the Smartcn Sentence Detection engine.

  • CELI / linguagrid.org: CELI contributed Stanbol EnhancementEngines based on its NLP framework. They support Named Entity Recognition for French and Italian as well as Lemmatization and lexical analysis for Italian, Danish, Russian, Romanian and Swedish. In addition, CELI provides a Language Identification service.

    NOTE: This engine sends the processed content to the CELI server. Users are required to create an account for the CELI service.

  • Gosen: Lucene-Gosen is an LGPL licensed Analyzer for Japanese. The Apache Stanbol Integration supports Sentence Detection, Tokenization, Part of Speech tagging as well as Named Entity Recognition.

    NOTE: As the license of Lucene-Gosen is not compatible with the Apache Software License (ASL), this project is hosted on https://github.com/westei/stanbol-gosen and is NOT a part of Apache Stanbol. Users who want to use it will need to download it themselves.

  • Freeling: Freeling is a GPL licensed NLP framework implemented in C++. It supports Sentence Detection, Tokenization, Part of Speech tagging, Chunking and Named Entity Recognition for several languages including English, Spanish, Italian, Russian and Portuguese.

    The integration is based on the RESTful NLP analysis service specification. Users therefore need to install and configure Freeling and then run the Stanbol Freeling Server. After that, they can use this server by configuring the RESTful NLP Analysis Engine with the /analysis endpoint and the RESTful NLP Language Identification Engine with the /langident endpoint of their Stanbol Freeling Server.

    NOTE: As the license of Freeling is not compatible with the ASL, this project is hosted on https://github.com/insideout10/stanbol-freeling and is NOT a part of Apache Stanbol. Users who want to use it will need to download and install it themselves.

  • Talismane: Talismane is an AGPL licensed NLP framework implemented in Java. It supports Sentence Detection, Tokenization and Part of Speech tagging for French.

    The integration is based on the RESTful NLP analysis service specification. Users therefore need to download and build the Stanbol-Talismane project and then run the Stanbol Talismane Server. After that, they can use this server by configuring the RESTful NLP Analysis Engine with the /analysis endpoint of their Stanbol-Talismane server.

    NOTE: As the license of Talismane is not compatible with the ASL, this project is hosted on https://github.com/westei/stanbol-talismane and is NOT a part of Apache Stanbol. Users who want to use it will need to download and install it themselves.
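
For the two server-based integrations (Freeling and Talismane), the values entered into the engine configuration are just the server's base URL plus the endpoint path named above. The following is a minimal sketch of that derivation, not part of Stanbol itself: the helper name nlp_service_endpoints, the hosts and the ports are assumptions chosen for illustration, and only the /analysis and /langident paths come from the descriptions above.

```python
from urllib.parse import urljoin


def nlp_service_endpoints(base_url: str, with_langident: bool = True) -> dict:
    """Derive the endpoint URLs to enter into the engine configuration.

    Hypothetical helper: base_url is wherever your own server is running.
    """
    # urljoin replaces the last path segment unless the base ends with "/",
    # so normalise the base URL first.
    base = base_url if base_url.endswith("/") else base_url + "/"

    # Value for the RESTful NLP Analysis Engine.
    endpoints = {"analysis": urljoin(base, "analysis")}

    # Value for the RESTful NLP Language Identification Engine
    # (described above for the Freeling server only).
    if with_langident:
        endpoints["langident"] = urljoin(base, "langident")
    return endpoints


# Assumed local deployments; replace host and port with your own.
freeling = nlp_service_endpoints("http://localhost:8080")
talismane = nlp_service_endpoints("http://localhost:8081", with_langident=False)

for name, urls in (("Freeling", freeling), ("Talismane", talismane)):
    for engine, url in urls.items():
        print(f"{name} {engine}: {url}")
```

The Talismane call disables the language identification entry because only an /analysis endpoint is described for that server.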



