Taurus Parallel Logging

Most databases adopt write-ahead logging for fault tolerance, wherein transactions log undo and redo records to persistent storage as a single stream. After a crash, transactions are recovered following the order in the log. The single log stream limits scalability on today’s massively parallel processors. Taurus is a parallel logging algorithm that supports multiple log streams. For transactions to recover in the correct order after a crash, Taurus logs the order information compressed as a per-transaction vector of timestamps. The dimension of a vector equals the number of log streams. The vector can be further compressed when limited cross-stream dependencies exist. Taurus supports both data and command logging and is compatible with many concurrency control protocols.


Sam Madden, Xiangyao Yu