#2 High-Performance Transactions for Persistent Memories

07 Jun 2021

Link https://dl.acm.org/doi/10.1145/2872362.2872381
Year ASPLOS 2016

Problem

Atomic and durable transactions are a widely used mechanism to ensure data consistency on NVMs. However, naive implementations of NVM transaction is low performance due to the unnessesary persist dependencies.

Summary

The paper proposes defered commit transactions (DCT) to remove unnecessary persist dependencies enforced by the naive transaction implementation, synchronous commit transactions (SCT). Specifically, DCT defers the transaction’s commit step until after the lock is released. The paper then evaluations DCT across three memory persistency models and show its superior runtime over SCT.

Details

This paper builds upon ideas from the Memory Persistency paper by analyzing how to best utilize the persistency models when designing transactions. The paper makes the observation that “most of the unnecessary dependencies arise as a consequence of performing the commit step of the transaction while locks are held,” so naturally (for the reader, perhaps not obvious for the researchers when performing the studies), the proposed solution is to release the locks before commit.

The paper assumes that transactions use lock for concurrency control, but does not provide background on this topic. Maybe the readers are this expected to know this concept. My educated guess is that the NVM address space is divided into lock regions, each controlled by a lock. To modify data within a lock region, a thread must first obtain its lock (pretty much how regular volatile locks work).

The authors breaks down an undo transaction into five steps: acquire all locks (L), allocate and write to log (P), modify the data in NVM (M), commit transcation (C), release all locks (U). Only P, M, and C are persist operations, while L and U are volatile operations. The paper proceeds to describle the minimum set of persist dependencies between conflicting transactions (transactions that modifies overlapping data). First, P<M<C. This is true even with non-conflicting transactions. Second, if thread m acquires the conflicting lock(s) before thread n, then P_m < P_n and M_m < M_n and C_m < C_n. This is because we would like the data modification and log entry ordering to match the lock acquisition ordering (why?).

In this model, for x conflicting transactions, the persist critical path is x+2 persist operations.

The paper then review the three persistency models presented in Memory Persistency: strict, epoch, and strand. Furthermore, the authors attempt to formalize Intel’s x86 persistency model (the only commercially available NVM platform): eager sync. We can order two persists to A and B with the following: STORE A; CLWB A; SFENCE; PCOMMIT; SFENCE; STORE B; HOWEVER, the PCOMMIT instruction has been deprecated by Intel as per https://software.intel.com/content/www/us/en/develop/blogs/deprecate-pcommit-instruction.html. The original intent of PCOMMIT was to ensure data that reached the NVM memory controller’s write pending queue are flushed into persistent domain. This is because there was a concern that not many hardware platforms would support ADR, where the persistence domain includes the write queue. It turned out that many platforms planned to support ADR, so PCOMMIT is not needed anymore since data that reached the write queue is guaranteed to be persisted, even during a power failure. Thus, the latest code to order two persists is: STORE A; CLWB A; SFENCE; STORE B; where the CLWB ensures A is flushed to the NVM write queue. SFENCE ensures every store before the fence are globally visible before any stores after the fence.

The paper proceeds to show that SCT enforces unnessesary persist constraints even under relaxed persistence models. Under epoch persistency, we need to insert 4 epoch barriers to ensure correctness. I am not sure why the lock aquisition and release are a part of persist ordering constraints (see questions). Through bunch of equations, the paper shows that SCT results in serialized transactions under epoch persistency for both non-conflicting and conflicting transactions. SCT under strand persistency is a bit better, as non-conflicting transactions are allowed to execute independently, which is the ideal case. However conflicting transactions are still serialized.

DCT releases the lock before performing the commit step (commit deferred). This means that for conflicting transactions, P_n does not have to wait for C_m anymore. However, this means that the commit ordering is not guaranteed anymore (Eq 5), so the paper introduces new mechanism to ensure this. With epoch persistency, conflicting transactions under DCT now only require the minimal set of persist ordering constraints. I don’t understand the reasoning behind this. I think I need to first understand why SCT under epoch persistency works the way it does.

Thoughts and Comments

Questions