Salmon Run: Solr: a custom Search RequestHandler




As you know, I've been playing with Solr lately, trying to see how feasible it would be to customize it for our needs. We have been a Lucene shop for a while, and we've built our own search framework around it, which has served us well so far. The move to Solr is driven primarily by the need to expose our search tier as a service for our internal applications. While it would have been relatively simple (probably simpler) to slap an HTTP interface on top of our current search tier, we also want to use other Solr features such as incremental indexing and replication.

One of our challenges in using Solr is that the way we do search is quite different from the way Solr does search. A query string passed to the default Solr search handler is parsed into a Lucene query, and a single search call is made on the underlying index. In our case, the query string is first passed to our taxonomy, and depending on the type of query (as identified by the taxonomy), it is sent through one or more sub-handlers. Each sub-handler converts the query into a (different) Lucene query and executes the search against the underlying index. The results from each sub-handler are then layered together to present the final search result.
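To make that flow concrete, here is a minimal sketch in plain Java, with no Solr or Lucene dependency. Every name in it (LayeredSearchSketch, QueryType, SubHandler, demo) is hypothetical and chosen only for illustration, the taxonomy is reduced to a function from query string to query type, and "layering" is assumed to mean appending each sub-handler's results in handler order while dropping documents already seen. The real layering rules may well differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.EnumMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

class LayeredSearchSketch {

    /** Hypothetical query types that a taxonomy lookup might return. */
    enum QueryType { CONCEPT, FREE_TEXT }

    /** A sub-handler builds its own query and returns matching document ids. */
    interface SubHandler {
        List<String> search(String query);
    }

    private final Function<String, QueryType> taxonomy;
    private final Map<QueryType, List<SubHandler>> handlersByType;

    LayeredSearchSketch(Function<String, QueryType> taxonomy,
                        Map<QueryType, List<SubHandler>> handlersByType) {
        this.taxonomy = taxonomy;
        this.handlersByType = handlersByType;
    }

    /** Classifies the query, runs each sub-handler for that type, layers the results. */
    List<String> search(String query) {
        QueryType type = taxonomy.apply(query);
        List<SubHandler> handlers =
            handlersByType.getOrDefault(type, Collections.<SubHandler>emptyList());
        // Assumption: earlier handlers win; LinkedHashSet keeps first-seen order.
        Set<String> layered = new LinkedHashSet<>();
        for (SubHandler handler : handlers) {
            layered.addAll(handler.search(query));
        }
        return new ArrayList<>(layered);
    }

    /** Builds a toy instance with canned results instead of real index searches. */
    static LayeredSearchSketch demo() {
        Function<String, QueryType> taxonomy =
            q -> q.equals("asthma") ? QueryType.CONCEPT : QueryType.FREE_TEXT;

        SubHandler conceptHandler = q -> Arrays.asList("concept-1", "doc-7");
        SubHandler relatedHandler = q -> Arrays.asList("doc-7", "doc-9");
        SubHandler textHandler = q -> Arrays.asList("doc-3");

        Map<QueryType, List<SubHandler>> handlers = new EnumMap<>(QueryType.class);
        handlers.put(QueryType.CONCEPT, Arrays.asList(conceptHandler, relatedHandler));
        handlers.put(QueryType.FREE_TEXT, Arrays.asList(textHandler));
        return new LayeredSearchSketch(taxonomy, handlers);
    }

    public static void main(String[] args) {
        LayeredSearchSketch searcher = demo();
        System.out.println(searcher.search("asthma"));
        System.out.println(searcher.search("some other words"));
    }
}
```

In this toy setup, a query the taxonomy recognizes goes through two sub-handlers and the overlapping document appears only once, while an unrecognized query falls through to a single free-text handler. In a real handler each SubHandler would build and run a Lucene query rather than return canned ids.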

Conceptually, the customization is quite simple: create a custom subclass of RequestHandlerBase (as advised on this wiki page) and override the handleRequestBody(SolrQueryRequest, SolrQueryResponse) method. In practice, I had quite a tough time doing this, admittedly caused (at least partly) by my ignorance of Solr internals. However, I did succeed, so in this post I outline my solution, along with some advice I feel would be useful to others embarking on a similar route.


Read full article from Salmon Run: Solr: a custom Search RequestHandler

