Leveraging UIMA in Spark | Spark Summit
Spark Summit 2014 brought the Apache Spark community together on June 30 - July 2, 2014 at The Westin St. Francis in San Francisco. It featured production users of Spark, Shark, Spark Streaming and related projects.

Philip Ogren (Oracle)

Much of the Big Data that Spark welders tackle is unstructured text that requires text processing techniques. For example, named entity extraction on tweets and sentiment analysis on customer reviews are common tasks. The Unstructured Information Management Architecture (UIMA) framework is an Apache project that provides APIs and infrastructure for building complex and robust text analytics systems. A typical system built on UIMA defines a collection of analysis engines (such as a tokenizer, part-of-speech tagger, and named entity recognizer) which are executed according to arbitrarily complex flow control definitions.
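The idea of analysis engines executed in sequence over a shared document can be sketched in a few lines of plain Python. This is only an illustration of the pattern and does not use the UIMA API: the Document and Annotation classes, the two toy engines, and run_pipeline are all hypothetical names invented for this sketch, and the flow control here is a simple fixed order rather than anything configurable.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    kind: str   # label for the span, e.g. "token" or "entity"
    begin: int  # character offset where the span starts
    end: int    # character offset just past the end of the span


@dataclass
class Document:
    # Hypothetical shared container: raw text plus annotations added by engines.
    text: str
    annotations: list = field(default_factory=list)

    def covered_text(self, ann):
        # Return the slice of text that an annotation spans.
        return self.text[ann.begin:ann.end]

    def select(self, kind):
        # Return all annotations of one kind, in the order they were added.
        return [a for a in self.annotations if a.kind == kind]


def tokenizer(doc):
    # Toy engine: mark each run of non-whitespace characters as a token.
    for m in re.finditer(r"\S+", doc.text):
        doc.annotations.append(Annotation("token", m.start(), m.end()))


def capitalized_entity_tagger(doc):
    # Toy stand-in for a named entity recognizer: tag every capitalized
    # token except the first one. It depends on the tokenizer's output.
    for tok in doc.select("token")[1:]:
        if doc.covered_text(tok)[0].isupper():
            doc.annotations.append(Annotation("entity", tok.begin, tok.end))


def run_pipeline(text, engines):
    # Fixed-order flow control: run each engine once over the same document.
    doc = Document(text)
    for engine in engines:
        engine(doc)
    return doc


doc = run_pipeline("we saw Philip in Francisco",
                   [tokenizer, capitalized_entity_tagger])
entities = [doc.covered_text(a) for a in doc.select("entity")]
```

Because each engine only reads and appends annotations on the document it is given, a pipeline like this could in principle be applied to each record of a distributed collection independently, which is the kind of fit with Spark that the talk title suggests.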