Lucene Collector Usage Example



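Lucene's Collector is an advanced feature that exposes the individual steps of the search process. By writing a custom Collector you can change the default search behavior, and you can also gather information about the matching documents at this stage.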

The example below shows how to define a custom Collector and how to collect data inside it.
    import java.io.IOException;
    import java.util.Map;
    import java.util.TreeMap;

    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.search.Collector;
    import org.apache.lucene.search.Scorer;

    class MyCollector extends Collector {
        // Current (sub-)reader, used to load the stored fields of each hit
        IndexReader reader = null;
        // Collected data: value of the "id" field -> value of the "count" field
        public Map<Integer, Integer> map = new TreeMap<Integer, Integer>();

        @Override
        public boolean acceptsDocsOutOfOrder() {
            // Hits only go into a map, so the order of collection does not matter
            return true;
        }

        /**
         * Collects one hit; doc is relative to the reader set in setNextReader.
         */
        @Override
        public void collect(int doc) throws IOException {
            System.out.println("doc:" + doc);
            Document document = reader.document(doc);
            int id = Integer.parseInt(document.get("id"));
            int count = Integer.parseInt(document.get("count"));
            map.put(id, count);
            System.out.println("put:" + id + " " + count);
        }

        @Override
        public void setNextReader(IndexReader reader, int docBase) throws IOException {
            // If the reader consists of several sub-readers, this method is
            // called once for each of them
            this.reader = reader;
            System.out.println("set reader");
        }

        @Override
        public void setScorer(Scorer scorer) throws IOException {
            // Scores are not needed here, so do nothing
        }
    }
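In the Collector API used above, the doc value passed to collect is relative to the reader most recently handed to setNextReader, which is why the example loads the document through that reader rather than through the top-level searcher. The following is a minimal stand-alone sketch of that bookkeeping with no Lucene dependency. The class DocBaseSketch, the nested GlobalIdCollector and the run helper are hypothetical names invented for illustration; they only mimic the order of calls, assuming each segment's docBase equals the total size of the segments before it.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-alone simulation (no Lucene classes) of segment-relative doc IDs. */
public class DocBaseSketch {

    /** Hypothetical stand-in for a collector that remembers the current docBase. */
    static class GlobalIdCollector {
        private int docBase = 0;
        final List<Integer> globalIds = new ArrayList<>();

        // Mimics setNextReader: called once per sub-reader (segment).
        void setNextReader(int docBase) {
            this.docBase = docBase;
        }

        // Mimics collect: 'doc' is relative to the current sub-reader,
        // so the index-wide ID is docBase + doc.
        void collect(int doc) {
            globalIds.add(docBase + doc);
        }
    }

    /** Feeds segment-relative hits to the collector, one segment at a time. */
    static List<Integer> run(int[] segmentSizes, int[][] hitsPerSegment) {
        GlobalIdCollector collector = new GlobalIdCollector();
        int docBase = 0;
        for (int s = 0; s < segmentSizes.length; s++) {
            collector.setNextReader(docBase);
            for (int doc : hitsPerSegment[s]) {
                collector.collect(doc);
            }
            docBase += segmentSizes[s]; // the next segment starts after this one
        }
        return collector.globalIds;
    }

    public static void main(String[] args) {
        // Two segments holding 3 and 2 documents; hits are segment-relative.
        List<Integer> ids = run(new int[] {3, 2}, new int[][] {{0, 2}, {1}});
        System.out.println(ids); // prints [0, 2, 4]
    }
}
```

A collector that needs index-wide document IDs, for example to look documents up later through the top-level searcher, would store the docBase argument in setNextReader and add it in collect in the same way.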

    // Search with MatchAllDocsQuery plus a filter, and gather the data
    // with the custom Collector
    Searcher searcher = new IndexSearcher(directory);
    Searcher searcher2 = new IndexSearcher(directory2);
    Searcher searcher3 = new ParallelMultiSearcher(new Searcher[]{searcher, searcher2});
    MyCollector collector = new MyCollector();
    RangeFilter filter = new RangeFilter("range", "0", "4", true, true);
    searcher3.search(new MatchAllDocsQuery(), filter, collector);
    searcher3.close();
    directory.close();
    directory2.close(); // the second directory should be released as well
    // Read back the collected data
    Map<Integer, Integer> map = collector.map;
    Set<Integer> keySet = map.keySet();
    for (int i : keySet) {
        System.out.println("<" + i + "," + map.get(i) + ">");
    }
Please read the full article from Lucene Collector Usage Example
