Evolution and future directions of large-scale storage and computation systems at Google

22 Jan 2022

This post contains my notes from going through [1]. See the Sources section for all sources used in this post. I do not own any of the materials.

Note that this keynote was from 2010.

I start taking notes after the 38-minute mark, once the speaker finishes introducing MapReduce and BigTable. These specific tools will be studied in separate posts.

How can Raft afford to use a single master to serve all client requests? Does this not create a performance bottleneck?

What are some typical read/write ratios for common distributed workloads?

The Big Ideas

System Building Experiences

  1. Divide large systems into smaller services
    • Simpler software development: fewer dependencies (?), easy to update individual services, run lots of experiments (?), reimplement a service without affecting clients (?)
    • Small teams can work independently
    • e.g. a google.com search uses 100s of services

    How exactly do services work? Why do services have these advantages?

  2. Protocol description language (PDL) is a must
    • Allows a program written in one language to communicate with another program written in a different, possibly unknown, language by offering a language-independent interface to the underlying data. Commonly used for remote procedure calls. [2]
    • Google’s PDL is called Protocol Buffers.
    • High-performance, compact binary format; can also be used to store data persistently. (A small sketch of how this looks from Python follows this list.)
  3. Designing Systems
    • Critical skill: estimating the performance of a system design
    • Prerequisite: a solid understanding of the building blocks, i.e. the fundamentals.

      Some things to note

      • An L1 cache access takes on the order of a couple of CPU cycles on a 2 GHz CPU
      • Relative latencies: L2/L1 is ~10^1, main memory/L2 is ~10^2, disk/main memory is ~10^5
      • Local network latency is on the microsecond scale
      • For the same amount of data, the local network is ~50x slower than main memory
      • Bandwidths: DDR4 = 20 GB/s, DDR3 = 10 GB/s, HBM = 200 GB/s
      • SSD: bandwidth = 15 GB/s, latency = 20 µs; NVM: bandwidth = 5 GB/s, latency = 300 ns
    • Practice back-of-the-envelope calculations. (I wonder where I can find resources for this; a worked example appears after this list.)
    • Understand common systems at the implementation level, e.g. core language libraries, data structures, protocol buffers, GFS, BigTable, MapReduce, etc.
    • You cannot perform effective back-of-the-envelope calculations without solid knowledge of these basic components.
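
To make item 2 concrete (this is my own sketch, not something from the talk): with Protocol Buffers you describe a message once in a .proto file, compile it for each language, and every program exchanges the same compact binary bytes. Assuming a hypothetical search.proto compiled with `protoc --python_out=.` into `search_pb2.py`:

```python
# Hypothetical schema, compiled by protoc into search_pb2.py:
#
#   message SearchRequest {
#     string query            = 1;
#     int32  results_per_page = 2;
#   }
#
from search_pb2 import SearchRequest   # generated module (assumed to exist)

req = SearchRequest(query="large-scale storage", results_per_page=10)

# Compact binary encoding: usable as an RPC payload or for persistent storage.
wire_bytes = req.SerializeToString()

# A server written in C++, Java, Go, ... parses the same bytes with its own
# generated class; here we simply round-trip it in Python.
decoded = SearchRequest()
decoded.ParseFromString(wire_bytes)
assert decoded.query == "large-scale storage"
```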
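
And for item 3, here is the kind of back-of-the-envelope exercise I think is meant: how long does it take to assemble a page of 30 image thumbnails read from disk serially versus in parallel? The figures (10 ms per seek, 30 MB/s sequential read, 256 KB per thumbnail) are illustrative assumptions of mine, not numbers from the keynote.

```python
# Back-of-the-envelope estimate: serve a page of 30 image thumbnails.
# All figures below are my own illustrative assumptions.
SEEK_TIME_S = 10e-3          # ~10 ms per disk seek
DISK_BW_BPS = 30e6           # ~30 MB/s sequential read
THUMB_BYTES = 256 * 1024     # 256 KB per thumbnail
N_THUMBS    = 30

read_time = THUMB_BYTES / DISK_BW_BPS          # time to stream one thumbnail

# Design 1: read the 30 thumbnails one after another from a single disk.
serial_s = N_THUMBS * (SEEK_TIME_S + read_time)

# Design 2: issue the 30 reads in parallel across 30 disks; the page is
# ready when the slowest single read finishes.
parallel_s = SEEK_TIME_S + read_time

print(f"serial:   {serial_s * 1e3:.0f} ms")    # roughly 560 ms
print(f"parallel: {parallel_s * 1e3:.0f} ms")  # roughly 19 ms
```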

System Design Patterns

Single master, 1000s of workers

Canary requests
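
My current understanding of this pattern (related to the canary-release idea in [3]): before fanning a new or unusual request out to thousands of leaf servers, send it to one or two machines first, and only fan it out to the rest if those canaries handle it without crashing or timing out. A minimal Python sketch, where `servers` and the `send_to(server, request)` RPC are placeholders I made up:

```python
import concurrent.futures

def canaried_fanout(request, servers, send_to, num_canaries=2):
    """Try `request` on a couple of canary servers before fanning it out.

    `send_to(server, request)` is a placeholder for the real RPC; it is
    assumed to raise on crash or timeout.
    """
    for server in servers[:num_canaries]:
        # If the request tickles a bug that kills the server, we find out
        # here instead of taking down thousands of machines at once.
        send_to(server, request)

    # Canaries survived: query the remaining servers in parallel.
    with concurrent.futures.ThreadPoolExecutor(max_workers=32) as pool:
        futures = [pool.submit(send_to, s, request)
                   for s in servers[num_canaries:]]
        return [f.result() for f in futures]
```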

Tree distribution of requests

Backup requests to reduce tail latency
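
The idea, as I understand it: issue the request to one replica, and if it has not answered within a small delay, issue the same request to a second replica and take whichever response arrives first; the occasional duplicate work buys a much shorter latency tail. A rough sketch, with a made-up `query(replica, request)` RPC:

```python
import concurrent.futures

def query_with_backup(request, primary, backup, query, backup_after_s=0.01):
    """Send `request` to `primary`; if it hasn't replied within
    `backup_after_s`, also send it to `backup` and return whichever
    replica answers first. `query(replica, request)` is a placeholder
    for the real RPC."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(query, primary, request)
        try:
            return first.result(timeout=backup_after_s)
        except concurrent.futures.TimeoutError:
            # Primary is in the slow tail: fire the backup request too.
            second = pool.submit(query, backup, request)
            done, _ = concurrent.futures.wait(
                [first, second],
                return_when=concurrent.futures.FIRST_COMPLETED)
            # A real system would also cancel the losing request.
            return done.pop().result()
```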

Manage multiple pieces of work per machine

Split data by Range, not Hash
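
A toy example I wrote to convince myself why this matters: with range partitioning (as in BigTable tablets), a scan over adjacent keys touches one machine or a handful; with hash partitioning, the same scan is scattered over essentially every machine. The key format and machine count below are made up:

```python
import hashlib

keys = [f"user:{i:04d}" for i in range(1000)]   # keys already in sorted order
n_machines = 10

def by_hash(key):
    """Hash partitioning: adjacent keys land on unrelated machines."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % n_machines

ordered = sorted(keys)
def by_range(key):
    """Range partitioning: each machine owns a contiguous slice of keyspace."""
    return ordered.index(key) * n_machines // len(ordered)

scan = keys[100:160]   # a scan over 60 adjacent keys
print("hash :", len({by_hash(k) for k in scan}), "machines touched")   # ~10
print("range:", len({by_range(k) for k in scan}), "machine touched")   #  1
```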

Make systems elastic

Further Reading

Sources

[1] https://dl.acm.org/doi/10.1145/1807128.1807130

[2] https://en.wikipedia.org/wiki/Interface_description_language

[3] https://launchdarkly.com/blog/what-is-a-canary-release/

[4] By Forest and Kim Starr https://en.wikipedia.org/wiki/Domestic_canary#/media/File:YellowCanary.jpg