Exploring Lucene and Solr's TrieRange Capabilities
Since Lucene treats nearly everything as a String, encoding numbers and dates and then using them in range queries has always required a little extra work to perform well. Previously, for ranges that spanned a lot of distinct values, one had to accept either less precision or slower-running queries. This is because Lucene needs to enumerate a large number of terms.
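To make the problem concrete, here is a toy sketch in Python. It is not Lucene code: the sorted list stands in for a term dictionary, and `encode` and `terms_in_range` are hypothetical helper names invented for illustration. It assumes non-negative integers that are zero-padded so that string order matches numeric order, and it shows that a classic range query has to visit every distinct term inside the range.

```python
import bisect

def encode(n, width=10):
    """Zero-pad a non-negative int so string order matches numeric order."""
    return str(n).zfill(width)

# Without padding, lexicographic order is not numeric order.
assert sorted(["9", "10", "100"]) == ["10", "100", "9"]
# With padding, the two orders agree.
assert sorted(encode(n) for n in (100, 9, 10)) == [encode(9), encode(10), encode(100)]

def terms_in_range(sorted_terms, lo, hi):
    """Return every term a naive range query would enumerate for [lo, hi]."""
    start = bisect.bisect_left(sorted_terms, encode(lo))
    end = bisect.bisect_right(sorted_terms, encode(hi))
    return sorted_terms[start:end]

# Toy "term dictionary": one term per distinct value (an assumption).
terms = [encode(n) for n in range(100_000)]
visited = terms_in_range(terms, 1_000, 50_000)
print(len(visited))  # one term per distinct value in the range
```

The number of terms visited grows with the number of distinct values in the range, which is why high-precision fields such as timestamps were expensive to query this way.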
Of course, this is only scratching the surface. The takeaway, though, is that the new Trie functionality in Lucene and Solr holds a lot of promise for speeding up range-based numeric queries and further blurs the line between search engines and databases (I’d argue it makes search all that more compelling, but…). More importantly, the cost of a range query does not depend on the index size, but instead on the precision chosen. Essentially, it formalizes what many people have done in practice through the years with various field values.
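The idea that cost follows the precision rather than the index size can be sketched in a few lines of Python. This is a simplified illustration of trie-style range splitting, not Lucene's actual implementation: `split_range` and `covered_values` are hypothetical names, the values are assumed to be non-negative integers, and each value is assumed to be indexed both at full precision and with its lowest `step`, `2 * step`, ... bits dropped. The edges of the range are matched at full precision, and the middle is matched with a few low-precision prefix terms.

```python
def split_range(lo, hi, step=4, bits=32):
    """Split [lo, hi] into (shift, start, end) prefix ranges.

    Each tuple covers the prefixes start..end of values shifted right
    by `shift` bits. Assumes 0 <= lo <= hi < 2**bits.
    """
    parts = []
    shift = 0
    while True:
        diff = 1 << (shift + step)
        mask = ((1 << step) - 1) << shift
        has_lower = (lo & mask) != 0      # ragged lower edge at this level
        has_upper = (hi & mask) != mask   # ragged upper edge at this level
        next_lo = ((lo + diff) if has_lower else lo) & ~mask
        next_hi = ((hi - diff) if has_upper else hi) & ~mask
        if shift + step >= bits or next_lo > next_hi:
            # No coarser level fits inside the range: emit the rest here.
            parts.append((shift, lo >> shift, hi >> shift))
            break
        if has_lower:
            parts.append((shift, lo >> shift, (lo | mask) >> shift))
        if has_upper:
            parts.append((shift, (hi & ~mask) >> shift, hi >> shift))
        lo, hi = next_lo, next_hi
        shift += step
    return parts

def covered_values(parts):
    """Expand prefix ranges back into the full-precision values they match."""
    values = []
    for shift, start, end in parts:
        values.extend(range(start << shift, (end + 1) << shift))
    return sorted(values)

parts = split_range(423, 642, step=4)
term_count = sum(end - start + 1 for _, start, end in parts)
print(parts, term_count)
```

For the range 423 to 642, this sketch needs 25 prefix terms instead of one term for each of the 220 values, and that count is bounded by the step size and bit width, no matter how many documents or distinct values the index holds. A smaller `step` means fewer terms per query at the price of more terms indexed per value.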
- http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//contrib-queries/org/apache/lucene/search/trie/package-summary.html
- http://www.thetaphi.de/share/Schindler-TrieRange.ppt