Lucene – using the HitCollector
I’m running a webshop where products are assigned to categories. Soon there was the requirement to count the search hits in specific categories.
First, I extended TopFieldDocCollector, which collects the top-sorting documents and returns them as TopFieldDocs. I overrode the collect method so that, in addition to collecting the hit, it counts every hit per category for the search result. Note that the collect method gets called once for every hit, so loading the stored document there adds a cost for each match. The next question is how to call the search with the custom HitCollector. The collector class and the search call look like this:
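import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.TopFieldDocCollector;

public class CountCollector extends TopFieldDocCollector {
    private final Searcher searcher;
    // Category name -> number of hits in that category.
    private final Map<String, Long> countMap = new HashMap<String, Long>();

    public CountCollector(Searcher searcher, IndexReader reader, Sort sorter, int maxSearchResults) throws IOException {
        super(reader, sorter, maxSearchResults);
        this.searcher = searcher;
    }

    // Called once for every hit.
    public void collect(int doc, float score) {
        super.collect(doc, score); // keep the normal top-docs collection
        try {
            Document document = searcher.doc(doc);
            if (document != null) {
                Field[] categoriesDoc = document.getFields("categories");
                if (categoriesDoc != null) {
                    for (int i = 0; i < categoriesDoc.length; i++) {
                        String category = categoriesDoc[i].stringValue();
                        Long count = countMap.get(category);
                        // Start at 1 for a new category, otherwise increment.
                        countMap.put(category, Long.valueOf(count == null ? 1L : count.longValue() + 1L));
                    }
                }
            }
        } catch (CorruptIndexException e) {
            System.err.println("ERROR: " + e.getMessage());
        } catch (IOException e) {
            System.err.println("ERROR: " + e.getMessage());
        }
    }

    // Accessor added so the counts can be read after the search.
    public Map<String, Long> getCountMap() {
        return countMap;
    }
}

// Calling the search with the custom collector:
CountCollector collector = new CountCollector(searcher, indexReader, sorter, maxSearchResults);
searcher.search(finalQuery, collector);
ScoreDoc[] hits = collector.topDocs().scoreDocs;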
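The counting step inside collect can be tried out without an index. The following is a minimal sketch, assuming each hit is reduced to the list of its "categories" values. CategoryTally and its method names are hypothetical and not part of Lucene, and Map.merge (Java 8 or later) is swapped in for the containsKey/put pair.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a Lucene class): tallies hits per category.
class CategoryTally {
    private final Map<String, Long> counts = new HashMap<>();

    // Call once per hit, passing the values of that hit's "categories" field.
    void collect(List<String> categories) {
        for (String category : categories) {
            // merge() stores 1 for a new key, otherwise adds 1 to the stored count.
            counts.merge(category, 1L, Long::sum);
        }
    }

    // Number of hits seen for the category, 0 if it never occurred.
    long count(String category) {
        return counts.getOrDefault(category, 0L);
    }

    // Read-only view of all counts.
    Map<String, Long> counts() {
        return Collections.unmodifiableMap(counts);
    }

    public static void main(String[] args) {
        CategoryTally tally = new CategoryTally();
        tally.collect(Arrays.asList("shoes", "sale")); // first hit, two categories
        tally.collect(Arrays.asList("shoes"));         // second hit, one category
        System.out.println("shoes=" + tally.count("shoes") + ", sale=" + tally.count("sale"));
    }
}
```

A product that belongs to several categories is counted once in each of them, which matches the loop over all "categories" fields in the collector.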
The search method does not return a Hits object when it is called with a HitCollector. The result documents are held by the collector instead. Next, you can step through the ScoreDoc array, call

Document doc = searcher.doc(hits[i].doc);

for each document, and put it in a custom result object.
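The loop over the hits can be sketched as follows. To keep the sketch runnable without Lucene, plain arrays stand in for the ScoreDoc array (hits[i].doc and hits[i].score) and a function stands in for searcher.doc(int). ProductResult, ResultBuilder, and the "name" field are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical custom result object; the fields are illustrative only.
final class ProductResult {
    final int docId;
    final float score;
    final String name;

    ProductResult(int docId, float score, String name) {
        this.docId = docId;
        this.score = score;
        this.name = name;
    }
}

final class ResultBuilder {
    // docIds and scores stand in for the ScoreDoc array;
    // loadDoc stands in for searcher.doc(int) and returns stored fields by name.
    static List<ProductResult> build(int[] docIds, float[] scores,
                                     IntFunction<Map<String, String>> loadDoc) {
        List<ProductResult> results = new ArrayList<>(docIds.length);
        for (int i = 0; i < docIds.length; i++) {
            Map<String, String> fields = loadDoc.apply(docIds[i]);
            // "name" is an assumed stored field.
            results.add(new ProductResult(docIds[i], scores[i], fields.get("name")));
        }
        return results;
    }

    public static void main(String[] args) {
        Map<Integer, Map<String, String>> storedDocs = new HashMap<>();
        Map<String, String> sneaker = new HashMap<>();
        sneaker.put("name", "Red sneaker");
        storedDocs.put(7, sneaker);

        List<ProductResult> results =
                build(new int[] {7}, new float[] {1.5f}, id -> storedDocs.get(id));
        System.out.println(results.get(0).name);
    }
}
```

The results come back in the same order as the hits, so the sort order chosen for the search is preserved.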