#21 Scaling Memcache at Facebook

16 Oct 2021

Link https://www.usenix.org/system/files/conference/nsdi13/nsdi13-final170_update.pdf
Year NSDI 2013

Summary

This paper describes insights Facebook gained from modifying the open-source version of memcached to build a distributed key-value cache at global scale. It is interesting to see that, similar to Redis, FB was willing to sacrifice some consistency for performance.

Background

Look-aside vs inline caching

UDP vs TCP

Details

Interestingly, Facebook users generate an order of magnitude more read requests than writes, so Facebook’s memcache is read-optimized. Amazon’s workload is the opposite: write-dominated.

Facebook uses memcache as a demand-filled look-aside cache, as shown in the figure below. Upon a read miss, the web server retrieves the data from the db and adds the entry to the cache. Upon a write, the server writes directly to the db and deletes the entry from the cache. The paper says they delete instead of update because deletes are idempotent, i.e. f(x) == f(f(x)). I am guessing they mean that a delete can be safely repeated or reordered without leaving the cache inconsistent.
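
A minimal sketch of that read/write path, assuming a generic cache client with get/set/delete and a db layer with query/update (the names here are mine, not from the paper):

    # Look-aside (demand-filled) caching: the application, not the cache,
    # is responsible for keeping the cache populated.

    def read(key, cache, db):
        value = cache.get(key)
        if value is None:              # cache miss
            value = db.query(key)      # fall back to the database
            cache.set(key, value)      # demand-fill the cache
        return value

    def write(key, value, cache, db):
        db.update(key, value)          # the write goes to the database
        cache.delete(key)              # invalidate instead of update; deletes
                                       # are idempotent, so a retried or
                                       # replayed invalidation is harmless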

The figure below shows FB’s memcache architecture. Each region sits in a different geographical area; within each region there are multiple web-server + memcache clusters and one shared storage cluster.

Within a Cluster

Two main design goals: reducing read latency and reducing the load that cache misses impose on the database.

Reducing read latency:

Reducing Load: load here refers to the database requests generated by cache misses. Three techniques:

  1. Leases
    • memcache instances give out leases (64-bit tokens bound to the requested key) to clients on a cache miss; the client must present the token when it later sets the data. This prevents the problem of stale sets, illustrated below [4]:
      key 'k' not in cache
      C1 get(k), misses
      C1 reads v1 from DB as the value of k
      C2 writes k = v2 in DB
      C2 delete(k)   (recall that any DB write invalidates the key in cache)
      C1 set(k, v1)
      # now memcache holds stale data, since delete(k) has already happened,
      # and it will stay stale indefinitely until k is next written.
      # (I believe that even though C2 deleted k from the cache,
      # C1's set adds k back in after not finding the key.)

      Leases are bound to specific keys. C2’s delete(k) invalidates any outstanding lease token for k, so memcache rejects C1’s later set and the stale value never re-enters the cache (a rough sketch of this appears after this list).

    • Leases also mitigate thundering herds, which happen when many clients simultaneously try to read a key that is not in the cache, causing heavy database traffic [4]. Leases prevent this by issuing a token for a given key at most once every 10 seconds (by default); the other clients are told to wait briefly and retry the get, by which point the data is usually back in the cache.
    • FB also borrows the idea of a victim buffer from computer architecture to give recently deleted items a second chance. That is, upon a cache miss, it is acceptable for some clients to receive slightly stale data and avoid hitting the db.
  2. Memcache pools
    • Partition the servers in a cluster into pools (each pool contains multiple servers) and use different pools for keys with different characteristics, e.g. provision a small pool for keys whose misses are inexpensive.
  3. Replication within pools
    • Replicate a category of keys within a pool when the app fetches many keys in parallel and one server cannot handle the request rate.
    • FB chose replication over further splitting the key space because the app fetches many keys in a single batched request: splitting the keys means every request still hits every server (each just returns fewer keys), whereas replication halves each replica’s request rate, since retrieving 100 keys in one request costs about the same as retrieving one.

    Replication within pools is used to improve both latency and efficiency.
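
To make the lease mechanism concrete, here is a rough Python sketch of both the stale-set protection and the per-key token rate limiting that tames thundering herds. The class and method names are hypothetical, not memcached’s actual lease API:

    import time

    class LeaseServer:
        """Toy memcache server that hands out lease tokens on cache misses."""

        LEASE_RATE_SECONDS = 10        # issue at most one token per key per 10 s

        def __init__(self):
            self.data = {}             # key -> cached value
            self.tokens = {}           # key -> outstanding lease token
            self.last_issue = {}       # key -> when the last token was issued
            self.next_token = 1

        def get(self, key):
            if key in self.data:
                return self.data[key], None      # hit: no token needed
            now = time.time()
            if now - self.last_issue.get(key, 0.0) < self.LEASE_RATE_SECONDS:
                return None, None                # hot miss: wait and retry
            token = self.next_token
            self.next_token += 1
            self.tokens[key] = token
            self.last_issue[key] = now
            return None, token                   # miss: fill using this token

        def set(self, key, value, token):
            # Reject stale sets: the presented token must still be outstanding.
            if token is None or self.tokens.get(key) != token:
                return False
            self.data[key] = value
            del self.tokens[key]
            return True

        def delete(self, key):
            self.data.pop(key, None)
            self.tokens.pop(key, None)           # invalidate outstanding lease

Replaying the trace above: C1’s get(k) hands it a token, C2’s delete(k) drops that token, so C1’s set(k, v1, token) is rejected and the stale value never re-enters the cache. A client that receives (None, None) has hit a hot miss and simply waits and retries, usually finding the value set by whoever holds the token.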

Single Server Optimizations

An interesting optimization is FB’s adaptive slab allocator. Memory is divided into slab classes, each of which holds objects of a specific size range. FB dynamically identifies slab classes that need more memory and gives them more space by reassigning slab pages from other classes.
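
As a toy illustration (not FB’s exact heuristic, which compares how recently each class’s least-recently-used items were accessed), rebalancing amounts to moving 1 MB slab pages from a class with spare capacity to a class under eviction pressure:

    SLAB_PAGE_BYTES = 1 << 20              # memcached uses 1 MB slab pages

    class SlabClass:
        def __init__(self, chunk_size):
            self.chunk_size = chunk_size   # max object size this class holds
            self.pages = 1                 # slab pages currently assigned
            self.evictions = 0             # eviction pressure observed

    def rebalance(classes):
        # Give one slab page to the class evicting the most, taken from the
        # least-pressured class that has a page to spare (a crude stand-in
        # for the paper's age-based criterion).
        needy = max(classes, key=lambda c: c.evictions)
        donors = [c for c in classes if c is not needy and c.pages > 1]
        if donors:
            donor = min(donors, key=lambda c: c.evictions)
            donor.pages -= 1
            needy.pages += 1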

The original memcached lazily evicts expired cache items (items that have outlived their TTL), only removing them when it receives a get request for them. This wastes memory on short-lived keys. FB places short-lived keys in a separate structure and proactively evicts them.
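
A hedged sketch of that proactive scheme, assuming a circular buffer of per-second buckets (the 60-second horizon and the API are made up; keys with longer TTLs would stay on the ordinary lazy path):

    class ShortLivedExpirer:
        """Circular buffer of per-second buckets of soon-to-expire keys."""

        def __init__(self, horizon_seconds=60):
            self.buckets = [set() for _ in range(horizon_seconds)]
            self.head = 0                            # bucket that expires next

        def add(self, key, ttl_seconds):
            # Place the key in the bucket that is flushed when its TTL is up.
            slot = (self.head + ttl_seconds) % len(self.buckets)
            self.buckets[slot].add(key)

        def tick(self, cache):
            # Run once per second: evict everything in the head bucket, then
            # advance the head.
            for key in self.buckets[self.head]:
                cache.delete(key)
            self.buckets[self.head].clear()
            self.head = (self.head + 1) % len(self.buckets)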

Questions

What does sacrificing some consistency mean for the end user? I imagine this would look something like: Alice makes a new post, but Bob cannot see it yet. After a couple of refreshes, the update appears.

Comments/Thoughts

Further Readings/Topics

Sources

[1] https://www.lifesize.com/en/blog/tcp-vs-udp/#:~:text=TCP%20is%20a%20connection%2Doriented,is%20only%20possible%20with%20TCP.

[2] https://www.reddit.com/r/ProgrammerHumor/comments/9gcwgw/tcp_vs_udp/

[3] https://conferences.sigcomm.org/co-next/2010/CoNEXT_papers/13-Wu.pdf

[4] https://timilearning.com/posts/mit-6.824/lecture-16-memcache-at-facebook/#architecture