Polishing SolrCloud Distributed Updates I've been meaning to polish SolrClou...


1. Joel Bernstein moved past my initial humble attempts at letting the Java client, CloudSolrServer, hash documents client-side and route updates directly to the correct shard. He has iterated heavily on that issue, responding to feedback and suggestions. I've put off helping him get his work in for a long time, and it has finally been too long.

2. For most of this year, there have been sporadic reports of deadlock in the DistributedUpdateProcessor. It was only recently that I finally started looking into it. While I think I know the cause, working on #1 had refreshed my memory of the code enough that I began to wonder whether fixing the current update distribution approach in DistributedUpdateProcessor was worth further time. The current approach buffered updates in small batches, while a streaming approach would be much nicer. I had held off on streaming initially because I thought there might be some tough problems to tackle, but a renewed look at the code had me thinking I could perhaps hack something together rather quickly.
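
To make the idea in #1 a bit more concrete, here is a minimal sketch of client-side hash routing. It is not the CloudSolrServer implementation: the class name ShardRouterSketch, its methods, and the use of String.hashCode() as the hash are all assumptions made for illustration. A real client would have to use exactly the same hash function and shard ranges as the cluster, or documents would land on the wrong shard.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Solr.
public class ShardRouterSketch {

    // Stand-in hash (assumption). The client must match the server's hash exactly.
    public static int hash(String docId) {
        return docId.hashCode();
    }

    // Split the 32-bit hash space into numShards equal contiguous ranges and
    // return the index of the range that contains this id's hash.
    public static int shardFor(String docId, int numShards) {
        long unsigned = hash(docId) & 0xFFFFFFFFL;   // treat the hash as 0 .. 2^32 - 1
        long rangeSize = (1L << 32) / numShards;
        int index = (int) (unsigned / rangeSize);
        return Math.min(index, numShards - 1);       // last range absorbs the remainder
    }

    // Group document ids by target shard so each group can be sent
    // straight to that shard instead of being forwarded by another node.
    public static Map<Integer, List<String>> groupByShard(List<String> docIds, int numShards) {
        Map<Integer, List<String>> groups = new HashMap<>();
        for (String id : docIds) {
            groups.computeIfAbsent(shardFor(id, numShards), k -> new ArrayList<>()).add(id);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("doc-1", "doc-2", "doc-3", "doc-4");
        System.out.println(groupByShard(ids, 2));
    }
}
```

The point of grouping on the client is that each update goes to the right shard in one hop, rather than landing on an arbitrary node that then has to forward it.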

Read full article from Polishing SolrCloud Distributed Updates I've been meaning to polish SolrClou...