B-Tree | Set 1 (Introduction) | GeeksforGeeks



A B-Tree is a self-balancing search tree. Most other self-balancing search trees (such as AVL and Red-Black Trees) assume that all the data fits in main memory. When the number of keys is too large for that, the data must be read from disk in blocks. Disk access time is very high compared to main memory access time, so the main idea behind B-Trees is to reduce the number of disk accesses. Most tree operations (search, insert, delete, max, min, etc.) require O(h) disk accesses, where h is the height of the tree.

A B-Tree is a "fat" tree: its height is kept low by putting as many keys as possible in each node. Generally, the node size is kept equal to the disk block size. Since h is low for a B-Tree, the total number of disk accesses for most operations is reduced significantly compared to balanced Binary Search Trees such as AVL Trees and Red-Black Trees.
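To get a feel for how much a wide node flattens the tree, the sketch below compares heights. It assumes the standard bound that a B-Tree with n keys and minimum degree t (defined in the properties that follow) has height at most log_t((n + 1) / 2), and it uses floor(log2 n) as the best possible height of a binary tree. The class name, method names and the value t = 512 are illustrative assumptions, not part of the original article.

```java
class BTreeHeightDemo {
    // Upper bound on the height of a B-Tree holding n >= 1 keys with
    // minimum degree t >= 2: the largest h such that t^h <= (n + 1) / 2.
    static int maxBTreeHeight(long n, int t) {
        long limit = (n + 1) / 2; // integer division keeps the comparison exact
        int h = 0;
        long power = t;           // holds t^(h + 1)
        while (power <= limit) {
            h++;
            power *= t;
        }
        return h;
    }

    // Height of a perfectly balanced binary tree with n >= 1 nodes: floor(log2 n).
    static int minBinaryHeight(long n) {
        return 63 - Long.numberOfLeadingZeros(n);
    }

    public static void main(String[] args) {
        long n = 1_000_000_000L;
        int t = 512; // hypothetical minimum degree for a large disk block
        System.out.println("B-Tree height bound:     " + maxBTreeHeight(n, t));
        System.out.println("Best binary tree height: " + minBinaryHeight(n));
    }
}
```

For a billion keys and t = 512, the B-Tree height is at most 3, while even a perfectly balanced binary tree has height 29, so a search touches far fewer nodes (and disk blocks).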
Properties of B-Tree
1) All leaves are at the same level.
2) A B-Tree is defined by its minimum degree ‘t’. The value of t depends on the disk block size.
3) Every node except the root must contain at least t-1 keys. The root of a non-empty tree must contain at least 1 key.
4) All nodes (including the root) may contain at most 2t – 1 keys.
5) The number of children of an internal node is equal to the number of keys in it plus 1.
6) All keys of a node are sorted in increasing order. The child between two keys k1 and k2 contains all keys in the range between k1 and k2.
7) A B-Tree grows and shrinks from the root, unlike a Binary Search Tree, which grows and shrinks at the leaves.
8) Like other balanced Binary Search Trees, the time complexity of search, insert and delete is O(log n).
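The node-level properties above (3 to 6) can be sketched as a small check plus a search routine. This is a minimal illustration, not the implementation given later in this post: the `Node` layout, the constant `T = 2`, the helper names and the assumption that keys within a node are distinct are all hypothetical choices made for the example.

```java
import java.util.ArrayList;
import java.util.List;

class BTreeNodeSketch {
    static final int T = 2; // minimum degree t (hypothetical value for the demo)

    static class Node {
        final List<Integer> keys = new ArrayList<>();
        final List<Node> children = new ArrayList<>(); // empty for a leaf
        boolean isLeaf() { return children.isEmpty(); }
    }

    // Checks properties 3 to 6 for a single node (not for its subtrees).
    static boolean nodeIsValid(Node x, boolean isRoot) {
        int n = x.keys.size();
        int minKeys = isRoot ? 1 : T - 1;
        if (n < minKeys || n > 2 * T - 1) return false;              // properties 3 and 4
        if (!x.isLeaf() && x.children.size() != n + 1) return false; // property 5
        for (int i = 1; i < n; i++) {
            if (x.keys.get(i - 1) >= x.keys.get(i)) return false;    // property 6, keys assumed distinct
        }
        return true;
    }

    // Search: find the first key >= target, then descend into the child at that index.
    static boolean contains(Node x, int key) {
        int i = 0;
        while (i < x.keys.size() && key > x.keys.get(i)) i++;
        if (i < x.keys.size() && key == x.keys.get(i)) return true;
        return !x.isLeaf() && contains(x.children.get(i), key);
    }

    static Node leaf(int... keys) {
        Node n = new Node();
        for (int k : keys) n.keys.add(k);
        return n;
    }

    // Root [10, 20] with leaves [3, 7], [12, 15, 18] and [25].
    static Node sampleTree() {
        Node root = leaf(10, 20);
        root.children.add(leaf(3, 7));
        root.children.add(leaf(12, 15, 18));
        root.children.add(leaf(25));
        return root;
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        System.out.println(nodeIsValid(root, true));
        System.out.println(contains(root, 15));
        System.out.println(contains(root, 16));
    }
}
```

Each step of `contains` reads one node, which is why the number of disk accesses for a search is proportional to the height of the tree.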
Java Implementation from http://algs4.cs.princeton.edu/62btrees/BTree.java.html
public class BTree1<Key extends Comparable<Key>, Value> {
 private static final int M = 4; // max children per B-tree node = M-1 (M must be even and greater than 2)
 private Node root; // root of the B-tree
 private int HT; // height of the B-tree
 private int N; // number of key-value pairs in the B-tree

 // helper B-tree node data type
 private static final class Node {
  private int m; // number of children
  private Entry[] children = new Entry[M]; // the array of children

  private Node(int k) {
   m = k;
  } // create a node with k children
 }
 // internal nodes: only use key and next
 // external nodes: only use key and value
 private static class Entry {
  private Comparable key;
  private Object value;
  private Node next; // helper field to iterate over array entries

  public Entry(Comparable key, Object value, Node next) {
   this.key = key;
   this.value = value;
   this.next = next;
  }
 }
 public BTree1() {
  root = new Node(0);
 }
 // return number of key-value pairs in the B-tree
 public int size() {
  return N;
 }
 // return height of B-tree
 public int height() {
  return HT;
 }
 // search for given key, return associated value; return null if no such key
 public Value get(Key key) {
  return search(root, key, HT);
 }
 private Value search(Node x, Key key, int ht) {
  Entry[] children = x.children;
  // external node
  if (ht == 0) {
   for (int j = 0; j < x.m; j++) {
    if (eq(key, children[j].key))
     return (Value) children[j].value;
   }
  }
  // internal node
  else {
   for (int j = 0; j < x.m; j++) {
    if (j + 1 == x.m || less(key, children[j + 1].key))
     return search(children[j].next, key, ht - 1);
   }
  }
  return null;
 }
 // insert key-value pair
 // add code to check for duplicate keys
 public void put(Key key, Value value) {
  Node u = insert(root, key, value, HT);
  N++;
  if (u == null)
   return;
  // need to split root
  Node t = new Node(2);
  t.children[0] = new Entry(root.children[0].key, null, root);
  t.children[1] = new Entry(u.children[0].key, null, u);
  root = t;
  HT++;
 }
 private Node insert(Node h, Key key, Value value, int ht) {
  int j;
  Entry t = new Entry(key, value, null);
  // external node
  if (ht == 0) {
   for (j = 0; j < h.m; j++) {
    if (less(key, h.children[j].key))
     break;
   }
  }
  // internal node
  else {
   for (j = 0; j < h.m; j++) {
    if ((j + 1 == h.m) || less(key, h.children[j + 1].key)) {
     Node u = insert(h.children[j++].next, key, value, ht - 1);
     if (u == null)
      return null;
     t.key = u.children[0].key;
     t.next = u;
     break;
    }
   }
  }
  for (int i = h.m; i > j; i--)
   h.children[i] = h.children[i - 1];
  h.children[j] = t;
  h.m++;
  if (h.m < M)
   return null;
  else
   return split(h);
 }
 // split node in half
 private Node split(Node h) {
  Node t = new Node(M / 2);
  h.m = M / 2;
  for (int j = 0; j < M / 2; j++)
   t.children[j] = h.children[M / 2 + j];
  return t;
 }
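 // comparison helpers used by search and insert
 private boolean less(Comparable k1, Comparable k2) {
  return k1.compareTo(k2) < 0;
 }
 private boolean eq(Comparable k1, Comparable k2) {
  return k1.compareTo(k2) == 0;
 }
}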
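The split step in the listing above can be illustrated on a plain array of keys. The sketch below is only a simplified picture of the idea, assuming M = 4 as in the listing: a node that reaches M entries keeps its lower M/2 entries, the upper M/2 entries move to a new node, and the first key of the new node is copied into the parent as the separator. The class and method names are hypothetical.

```java
import java.util.Arrays;

class SplitSketch {
    static final int M = 4; // same constant as in the listing above

    // Entries that stay in the original node after a split.
    static int[] lowerHalf(int[] fullKeys) {
        return Arrays.copyOfRange(fullKeys, 0, M / 2);
    }

    // Entries that move to the newly created node.
    static int[] upperHalf(int[] fullKeys) {
        return Arrays.copyOfRange(fullKeys, M / 2, M);
    }

    public static void main(String[] args) {
        int[] full = {10, 20, 30, 40}; // a node that has just reached M entries
        int[] left = lowerHalf(full);
        int[] right = upperHalf(full);
        System.out.println("stays:     " + Arrays.toString(left));
        System.out.println("moves:     " + Arrays.toString(right));
        System.out.println("separator: " + right[0]); // key copied into the parent entry
    }
}
```

When the root itself splits, a new root with two entries is created, which is the only way the height of the tree increases.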
Also refer to http://algs4.cs.princeton.edu/62btrees/BTree.java.html
http://www.jbixbe.com/doc/tutorial/BTree.html
Insertion and Deletion
B-Tree Insertion
B-Tree Deletion
Read full article from B-Tree | Set 1 (Introduction) | GeeksforGeeks
