ANNIE is the only free open source rule-based NER system in Java I could find. It's written for news articles and I guess tuned for the MUC 6 task. It's good for proof of concepts, but getting a bit outdated. Main advantage is that you can start improving it without any knowledge in machine learning, nlp, well maybe a little java. Just study JAPE and give it a shot.
OpenNLP, Stanford NLP, etc. come by default with models for news articles and perform (just looking at results, never tested them on a big corpus) better than ANNIE. I liked the Stanford parser better than OpenNLP, again just looking at documents, mostly news articles.
Without knowing what your documents look like I really can't say much more. You should decide if your data is suitable for rules or you go the machine learning way and use OpenNLP or Stanford parser or Illinois tagger or anything. The Stanford parser seems more appropriate for just pouring your data, training and producing results, while OpenNLP seems more appropriate for trying different algorithms, playing with parameters, etc.
Read full article from Accuracy: ANNIE vs Stanford NLP vs OpenNLP with UIMA - Stack Overflow
No comments:
Post a Comment