Google Bigtable Introduction - Tutorial




Google uses a storage system called Bigtable. Bigtable is a distributed, persistent, multidimensional sorted map; it is not a relational database. In Bigtable you store strings under an index that consists of a row key, a column key, and a timestamp. This key points to an uninterpreted array of bytes (a string). In the original Bigtable design, row keys can be up to 64 KB in size.

(row key: type string, column key: type string, timestamp: type int64) -> string

The timestamp part of the key can be generated by Bigtable or supplied by the application.
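The mapping above can be sketched as a small in-memory sorted map. This is only an illustration of the data model, not how Bigtable is implemented: the class name `TinySortedMap` and its methods are hypothetical, and the choice to fill in a missing timestamp with the current time in microseconds is an assumption.

```python
import bisect
import time
from typing import Dict, Iterator, List, Optional, Tuple

# (row key, column key, timestamp), as in the mapping above.
Key = Tuple[str, str, int]


class TinySortedMap:
    """Hypothetical in-memory stand-in for the (row, column, timestamp) -> bytes map."""

    def __init__(self) -> None:
        self._keys: List[Key] = []          # kept sorted at all times
        self._values: Dict[Key, bytes] = {}

    def put(self, row: str, column: str, value: bytes,
            timestamp: Optional[int] = None) -> Key:
        # Assumption: if the application gives no timestamp, the store assigns one.
        if timestamp is None:
            timestamp = time.time_ns() // 1000
        key = (row, column, timestamp)
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value
        return key

    def get_latest(self, row: str, column: str) -> Optional[bytes]:
        # Find the last key that sorts at or below (row, column, +infinity).
        i = bisect.bisect_right(self._keys, (row, column, float("inf")))
        if i == 0:
            return None
        key = self._keys[i - 1]
        if key[0] != row or key[1] != column:
            return None
        return self._values[key]

    def scan_rows(self, start_row: str, end_row: str) -> Iterator[Tuple[Key, bytes]]:
        # Yield all entries with start_row <= row key < end_row, in sorted order.
        lo = bisect.bisect_left(self._keys, (start_row,))
        hi = bisect.bisect_left(self._keys, (end_row,))
        for key in self._keys[lo:hi]:
            yield key, self._values[key]
```

Because the keys stay sorted, reading a contiguous range of row keys is a single slice of the key list, which is the property a sorted map offers over a plain hash map.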

For example, in Google's Webtable (used for Google search) the reversed URL is used as the row key, the column key identifies different attributes of the web page, and the timestamp indicates when the data was fetched. The data this key points to is some content from the web page.
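A minimal sketch of how such a row key could be built is shown below. The text only says that the reverse URL is used; the exact form here (host name components reversed, path appended unchanged) and the function name `webtable_row_key` are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def webtable_row_key(url: str) -> str:
    """Build a row key by reversing the host name components of a URL.

    Hypothetical helper: scheme, port, and query string are dropped,
    which is an assumption of this sketch.
    """
    # urlsplit only recognises the host if the URL has a scheme or starts with "//".
    parts = urlsplit(url if "://" in url else "//" + url)
    host = parts.hostname or ""
    reversed_host = ".".join(reversed(host.split(".")))
    return reversed_host + parts.path


urls = ["maps.google.com/", "www.example.org/", "mail.google.com/"]
row_keys = sorted(webtable_row_key(u) for u in urls)
# Pages from the same domain now sort next to each other.
```

With reversed host names, keys for the same domain share a prefix, so in a sorted map they end up adjacent.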

Bigtable is built on the Google File System, and its data is stored in an immutable data structure called an SSTable. The application can define how many versions of an entry, ordered by timestamp, should be kept. Alternatively, the application can specify how long entries should be kept. Bigtable cleans up obsolete data by deleting the SSTables that contain only irrelevant data, using a mark-and-sweep algorithm.
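The two retention settings described above can be sketched as a filter over the versions of a single cell. This is an illustration of the policy only, not Bigtable's garbage collector; the function name `retain_versions` and its parameters `max_versions`, `max_age`, and `now` are hypothetical.

```python
from typing import List, Optional, Tuple

# One stored version of a cell: (timestamp, value).
Version = Tuple[int, bytes]


def retain_versions(versions: List[Version],
                    max_versions: Optional[int] = None,
                    max_age: Optional[int] = None,
                    now: int = 0) -> List[Version]:
    """Return the versions that survive the retention policy, newest first.

    max_versions keeps only the newest n versions.
    max_age keeps only versions whose timestamp is at most max_age older than now.
    Both are hypothetical parameter names; timestamps and ages share one unit.
    """
    newest_first = sorted(versions, key=lambda v: v[0], reverse=True)
    if max_age is not None:
        newest_first = [v for v in newest_first if now - v[0] <= max_age]
    if max_versions is not None:
        newest_first = newest_first[:max_versions]
    return newest_first


cell = [(1, b"a"), (5, b"b"), (9, b"c")]
kept_by_count = retain_versions(cell, max_versions=2)
kept_by_age = retain_versions(cell, max_age=5, now=10)
```

Versions that fail the filter are the obsolete data the paragraph refers to; a file that holds nothing but such versions no longer needs to be kept.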


Read full article from Google Bigtable Introduction - Tutorial

