Posted by Juliet Kemp on Mon, Sep 23, 2013

Apache Tika is a content-mining library that lets you pull both metadata and text content out of many different types of document. Instead of turning to a variety of parser libraries, each offering slightly different options, you can learn Tika's API once and apply it to any format that Tika supports, including RTF, Microsoft Office formats,
Read full article from Content mining with Apache Tika