[Update] Accessing words around a positional match in Lucene 4



[Update] Accessing words around a positional match in Lucene 4
the key differences from the previous examples have to do with using an AtomicReader (see Uwe Schindler’s excellent talk on the subject for more details) instance, as well as some new parameters to pass in to the SpanTermQuery.getSpans() method I used and finally, how to access the term vectors. 

In the getSpans() method, the first parameter is essentially providing access to the Reader, the second parameter can be used to filter out documents and the third, the termContexts, can be used to enable better performance when looking up terms.
The other big change is accessing Term Vectors in the span lookup loop. You no longer need to use TermVectorMapper instances, but instead simply use instances of Terms, TermsEnum and DocsAndPositionsEnum, as in:
it spits out some info about the actual query, and then shows some context about the matches.  Note, of course, you could also extend this to access things like payloads and more.
Please read full article from [Update] Accessing words around a positional match in Lucene 4

No comments:

Post a Comment

Labels

Algorithm (219) Lucene (130) LeetCode (97) Database (36) Data Structure (33) text mining (28) Solr (27) java (27) Mathematical Algorithm (26) Difficult Algorithm (25) Logic Thinking (23) Puzzles (23) Bit Algorithms (22) Math (21) List (20) Dynamic Programming (19) Linux (19) Tree (18) Machine Learning (15) EPI (11) Queue (11) Smart Algorithm (11) Operating System (9) Java Basic (8) Recursive Algorithm (8) Stack (8) Eclipse (7) Scala (7) Tika (7) J2EE (6) Monitoring (6) Trie (6) Concurrency (5) Geometry Algorithm (5) Greedy Algorithm (5) Mahout (5) MySQL (5) xpost (5) C (4) Interview (4) Vi (4) regular expression (4) to-do (4) C++ (3) Chrome (3) Divide and Conquer (3) Graph Algorithm (3) Permutation (3) Powershell (3) Random (3) Segment Tree (3) UIMA (3) Union-Find (3) Video (3) Virtualization (3) Windows (3) XML (3) Advanced Data Structure (2) Android (2) Bash (2) Classic Algorithm (2) Debugging (2) Design Pattern (2) Google (2) Hadoop (2) Java Collections (2) Markov Chains (2) Probabilities (2) Shell (2) Site (2) Web Development (2) Workplace (2) angularjs (2) .Net (1) Amazon Interview (1) Android Studio (1) Array (1) Boilerpipe (1) Book Notes (1) ChromeOS (1) Chromebook (1) Codility (1) Desgin (1) Design (1) Divide and Conqure (1) GAE (1) Google Interview (1) Great Stuff (1) Hash (1) High Tech Companies (1) Improving (1) LifeTips (1) Maven (1) Network (1) Performance (1) Programming (1) Resources (1) Sampling (1) Sed (1) Smart Thinking (1) Sort (1) Spark (1) Stanford NLP (1) System Design (1) Trove (1) VIP (1) tools (1)

Popular Posts