WiscKey:LSM-Tree 写放大优化 - 简书



WiscKey:LSM-Tree 写放大优化 - 简书

WiscKey 的提出,主要是为了优化 LSM-Tree 的写放大问题。此前已经有不少论文讨论过这个问题,如 LSM-trie 和 PebblesDB,但是大部分优化方法都不是很彻底——简单说就是,优化效果太差,或者不够通用。WiscKey 提出的是一种比较通用、效果明显且简单易懂的方法。

其实 WiscKey 优化写放大的原理非常简单:在 LSM-Tree 上做 key 和 value 的分离存储

LSM-Tree 写放大的根本原因是,compaction 时为了保证数据有序进行大量数据(key 和 value)重写。实际上,需要保持有序的只有 key,如果将 key 和 value 分开存储,compaction 重写数据的时候,就只需要重写 key(和 value 的位置,简称 vpos)。这在 key size (简称 ksize)远小于 value size (简称 vsize)的场景(现实场景基本都是这样)降低写放大的效果十分明显。


Read full article from WiscKey:LSM-Tree 写放大优化 - 简书


No comments:

Post a Comment

Labels

Algorithm (219) Lucene (130) LeetCode (97) Database (36) Data Structure (33) text mining (28) Solr (27) java (27) Mathematical Algorithm (26) Difficult Algorithm (25) Logic Thinking (23) Puzzles (23) Bit Algorithms (22) Math (21) List (20) Dynamic Programming (19) Linux (19) Tree (18) Machine Learning (15) EPI (11) Queue (11) Smart Algorithm (11) Operating System (9) Java Basic (8) Recursive Algorithm (8) Stack (8) Eclipse (7) Scala (7) Tika (7) J2EE (6) Monitoring (6) Trie (6) Concurrency (5) Geometry Algorithm (5) Greedy Algorithm (5) Mahout (5) MySQL (5) xpost (5) C (4) Interview (4) Vi (4) regular expression (4) to-do (4) C++ (3) Chrome (3) Divide and Conquer (3) Graph Algorithm (3) Permutation (3) Powershell (3) Random (3) Segment Tree (3) UIMA (3) Union-Find (3) Video (3) Virtualization (3) Windows (3) XML (3) Advanced Data Structure (2) Android (2) Bash (2) Classic Algorithm (2) Debugging (2) Design Pattern (2) Google (2) Hadoop (2) Java Collections (2) Markov Chains (2) Probabilities (2) Shell (2) Site (2) Web Development (2) Workplace (2) angularjs (2) .Net (1) Amazon Interview (1) Android Studio (1) Array (1) Boilerpipe (1) Book Notes (1) ChromeOS (1) Chromebook (1) Codility (1) Desgin (1) Design (1) Divide and Conqure (1) GAE (1) Google Interview (1) Great Stuff (1) Hash (1) High Tech Companies (1) Improving (1) LifeTips (1) Maven (1) Network (1) Performance (1) Programming (1) Resources (1) Sampling (1) Sed (1) Smart Thinking (1) Sort (1) Spark (1) Stanford NLP (1) System Design (1) Trove (1) VIP (1) tools (1)

Popular Posts