Google's Bigtable Distributed Storage System, Pt. I




Google rolls out new applications to millions of users with surprising frequency, which is pretty amazing all by itself. The variety of those applications is even more startling: they range from data-sucking behemoths like web crawling to intimate apps like Personalized Search and Writely. How does the Google architecture manage the conflicting requirements of such a wide range of workloads? Bigtable, a Google-developed distributed storage system for structured data, is a big piece of the answer.

Isn’t The Google File System The Answer?
Another part, yes. And as in a fractal, we see some of the same patterns repeating, which isn’t too surprising, for two reasons. First, all Google apps need to scale way beyond what most commercial systems ever consider. Second, Sanjay Ghemawat, a Google Fellow and former DEC researcher whose long-time interests include large-scale, safe, persistent storage, is a designer not only of Bigtable but also of the Google File System and MapReduce, Google’s tool for processing large data sets. The man’s got big on the brain and he’s in the right playpen.

If It’s a Storage System, Where Are The Disks?
Don’t be so literal-minded. For these guys, “storage” means where in data space you put the data. That covers not only disks, but data structures and flows, lock management, data layout, and more. The disks are there, under GFS and the local OS on the servers.

