Google’s Bigtable Distributed Storage System, Pt. I
Google rolls out new applications to millions of users with surprising frequency, which is pretty amazing all by itself. Yet when you look at the variety of the applications, ranging from data-sucking behemoths like web crawling to intimate apps like Personalized Search and Writely, it is even more startling. How does the Google architecture manage the conflicting requirements of such a wide range of workloads? Bigtable, a Google-developed distributed storage system for structured data, is a big piece of the answer.
Isn’t The Google File System The Answer?
Another part, yes. And, as in a fractal, we see some of the same patterns repeating. Which isn’t too surprising, for two reasons. First, all Google apps need to scale way beyond what most commercial systems ever consider. Second, Sanjay Ghemawat, a Google Fellow and former DEC researcher whose long-time interests include large-scale, safe, persistent storage, is a designer not only of Bigtable but also of the Google File System and MapReduce, Google’s tool for processing large data sets. The man’s got big on the brain and he’s in the right playpen.
If It’s a Storage System, Where Are The Disks?
Don’t be so literal-minded. For these guys “storage” is where in data space you put the data. Not only disks, but data structures and flows. Lock management. Data layout. And more. The disks are there, under GFS and the local OS on the servers.