In today’s post I want to highlight the brilliance of the Google Research team. Their ability to step back and question old assumptions reminds me of the Wright brothers realizing that lift values from the 1700s, along with other widespread assumptions of their time, were the main constraints holding them back from coming up with the first airplane.
Something similar happened at Google Research when they realized that traditional data storage and processing paradigms did not fit well with their applications’ workloads. Here are some of the design assumptions for the Google File System, straight from the published research paper, with my comments:
- Failure is an expectation, not an exception
Google realized that the traditional way to address failure in the datacenter is to increase the sophistication of the hardware involved. This approach raises costs twice over: once for the highly specialized hardware, and again for the system administrators with the sophisticated skills required to run it. The main innovation here is realizing that when dealing with massive datasets (i.e. downloading a copy of the entire web) hardware failure is a fact of life rather than an exception. Once this observation is incorporated into the design, costs can be decreased by storing and processing data on very large clusters of commodity hardware, where redundancy and replication across processing nodes and racks allow for seamless recovery from hardware failure.
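The paper only describes rack-aware replica placement at a high level, but the core idea is easy to sketch. Here is a minimal illustration in Python (the function, cluster layout, and node names are my own, not GFS code): spread copies across racks first, so that losing a whole rack never loses every replica.

```python
import random

def place_replicas(nodes_by_rack, num_replicas=3):
    """Choose nodes for a chunk's replicas, spreading them across
    racks so a single rack failure never loses every copy."""
    racks = list(nodes_by_rack)
    random.shuffle(racks)
    placement = []
    # One replica per distinct rack first.
    for rack in racks:
        if len(placement) == num_replicas:
            break
        placement.append(random.choice(nodes_by_rack[rack]))
    # Fewer racks than replicas: fill in with any remaining nodes.
    remaining = [n for nodes in nodes_by_rack.values()
                 for n in nodes if n not in placement]
    placement += random.sample(
        remaining, min(num_replicas - len(placement), len(remaining)))
    return placement

# Example: a small cluster of commodity machines in three racks.
cluster = {
    "rack-a": ["node-a1", "node-a2"],
    "rack-b": ["node-b1", "node-b2"],
    "rack-c": ["node-c1", "node-c2"],
}
print(place_replicas(cluster))  # e.g. ['node-b2', 'node-c1', 'node-a1']
```

With placement like this, recovering from a failed node or rack is just a matter of re-replicating the affected chunks from the surviving copies.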
- The system stores a modest number of large data files
This observation comes from looking at the nature of the data being processed, such as HTML markup from crawling a large number of websites. This is what we would call “unstructured data”, which is cleaned and serialized by the crawler before being “batched” together into large files. Once again, by taking a step back and looking at the problem with fresh eyes, the researchers realized their design did not need to optimize for the storage of billions of small files. This is a great constraint to remove from the design: it is what allows the GFS master server to control and store the metadata for all files in a cluster in memory, and thus to make very smart load balancing, placement, and replication decisions.
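To make that concrete, here is a toy in-memory master (my own illustration, not code from the paper; only the large 64 MB chunk size comes from the paper itself). With a modest number of large files, the entire namespace fits in a couple of dictionaries, so every placement question is answered without touching disk:

```python
class Master:
    """Toy GFS-style master: all metadata lives in memory, so lookup,
    placement, and balancing decisions are simple dictionary operations."""

    CHUNK_SIZE = 64 * 1024 * 1024  # GFS used large 64 MB chunks

    def __init__(self):
        self.files = {}            # path -> ordered list of chunk ids
        self.chunk_locations = {}  # chunk id -> list of replica nodes

    def create_file(self, path):
        self.files[path] = []

    def add_chunk(self, path, chunk_id, replicas):
        self.files[path].append(chunk_id)
        self.chunk_locations[chunk_id] = replicas

    def lookup(self, path, offset):
        """Translate (file, byte offset) into a chunk and its replicas,
        entirely from memory -- no disk I/O on the metadata path."""
        chunk_id = self.files[path][offset // self.CHUNK_SIZE]
        return chunk_id, self.chunk_locations[chunk_id]

master = Master()
master.create_file("/crawl/batch-00042")  # hypothetical file name
master.add_chunk("/crawl/batch-00042", "chunk-1f2e",
                 ["node-a1", "node-b2", "node-c1"])
print(master.lookup("/crawl/batch-00042", 10 * 1024 * 1024))
```

Billions of tiny files would blow this memory budget immediately; a modest number of huge files keeps it trivially small.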
- Workloads primarily consist of large streaming reads and small random reads
By looking at actual application workloads, the researchers found that read operations generally fall into these two categories, and that successive reads from the same client often touch contiguous regions of a file. Performance-minded applications will also batch and sort their reads so that their progress through a dataset moves in one direction, from beginning to end, instead of going back and forth with random I/O operations.
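A small sketch of that batch-and-sort pattern (my own illustration; the regions and in-memory stand-in for a crawl file are made up):

```python
import io

def read_regions(f, regions):
    """Read a set of (offset, length) regions from a seekable file.
    Sorting the requests first turns scattered random reads into a
    single forward sweep from the beginning of the file to the end."""
    results = {}
    for offset, length in sorted(regions):
        f.seek(offset)
        results[(offset, length)] = f.read(length)
    return results

# Stand-in for a large data file; a real client would pass an open file.
f = io.BytesIO(bytes(10_000_000))
chunks = read_regions(f, [(9_000_000, 4096), (0, 65536), (1_000_000, 4096)])
```

Unsorted, those three requests would bounce around the file; sorted, the read position only ever moves forward, which is exactly the access pattern a streaming file system can serve at full bandwidth.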
- The workloads also have many large, sequential writes that append to data files
Notice here how “delete” and “update” operations are extremely rare, to the point of being non-existent. This frees the system design from the onerous task of maintaining the locks needed to guarantee the atomicity of those two operations.
- Atomicity with minimal synchronization is essential
The system design focuses on supporting large writes by batch processes and “append” operations from a large number of concurrent clients, freeing itself from the constraints mentioned in the previous point.
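Here is a toy model of why append-only concurrency is so cheap (my own sketch, not the actual GFS record-append protocol, which coordinates across replicas): the only shared state is the end-of-file position, so a single short critical section serializes any number of writers.

```python
import threading

class AppendOnlyFile:
    """Toy record-append: the only shared state is the end-of-file
    offset, so one short lock around 'pick the offset' makes concurrent
    appends atomic -- no per-record update/delete locking is needed."""

    def __init__(self):
        self._records = []
        self._lock = threading.Lock()

    def record_append(self, data):
        # The system, not the client, chooses the offset (as in GFS),
        # which is what lets many clients append concurrently.
        with self._lock:
            offset = len(self._records)
            self._records.append(data)
        return offset

log = AppendOnlyFile()
threads = [threading.Thread(target=log.record_append, args=(f"rec-{i}",))
           for i in range(100)]
for t in threads: t.start()
for t in threads: t.join()
assert len(log._records) == 100  # every concurrent append landed exactly once
```

Contrast this with in-place updates, where overlapping writers would need locking over arbitrary byte ranges of the file.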
- High sustained bandwidth is more important than low latency
A good observation: when dealing with these large datasets, most applications are batch oriented and benefit most from high processing throughput, in contrast with the traditional database application that places a premium on fast response times.
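Some back-of-the-envelope numbers make the gap vivid (illustrative disk figures of my own, not from the paper):

```python
# Back-of-the-envelope: scanning 1 TB with typical commodity-disk numbers.
DATASET = 1_000_000_000_000    # 1 TB
STREAM_BW = 100 * 1_000_000    # 100 MB/s sustained sequential read
SEEK = 0.010                   # 10 ms per random seek
RECORD = 4096                  # 4 KB records

streaming_hours = DATASET / STREAM_BW / 3600
random_hours = (DATASET / RECORD) * SEEK / 3600
print(f"streaming scan:          {streaming_hours:.1f} h")   # ~2.8 h
print(f"seek-bound random reads: {random_hours:.0f} h")      # ~678 h
```

For a batch job, shaving milliseconds off any single request is irrelevant next to a difference of that magnitude in total throughput.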
In hindsight, these observations might seem obvious, especially now that they have been incorporated into the design principles behind other products such as Apache Hadoop. But Google’s decision to invest in a custom-made file system to fit their very specific needs, and the Google Research team’s ability to step back and start their design with fresh eyes, have truly revolutionized data processing forever. Cheers to them!
Reference:
Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung; “The Google File System”; Google Research.