Scalable Causal Consistency


Don’t Settle for Eventual: Causal Consistency for Wide-Area Storage with COPS

Geo-replicated, distributed data stores that support complex online applications, such as social networks, must provide an “always-on” experience where operations always complete with low latency. Today’s systems often sacrifice strong consistency to achieve these goals, exposing inconsistencies to their clients and necessitating complex application logic. In this paper, we identify and define a consistency model–causal consistency with convergent conflict handling, or causal+–that is the strongest achieved under these constraints.

We present the design and implementation of COPS, a key-value store that delivers this consistency model across the wide-area. A key contribution of COPS is its scalability, which can enforce causal dependencies between keys stored across an entire cluster, rather than a single server like previous systems. The central approach in COPS is tracking and explicitly checking whether causal dependencies between keys are satisfied in the local cluster before exposing writes. Further, in COPS-GT, we introduce get transactions in order to obtain a consistent view of multiple keys without locking or blocking. Our evaluation shows that COPS completes operations in less than a millisecond, provides throughput similar to previous systems when using one server per cluster, and scales well as we increase the number of servers in each cluster. It also shows that COPS-GT provides similar latency, throughput, and scaling to COPS for common workloads.


Stronger Semantics for Low-Latency Geo-Replicated Storage with Eiger

We present the first scalable, geo-replicated storage system that guarantees low latency, offers a rich data model, and provides “stronger” semantics. Namely, all client requests are satisfied in the local datacenter in which they arise; the system efficiently supports useful data model abstractions such as column families and counter columns; and clients can access data in a causally-consistent fashion with read-only and write-only transactional support, even for keys spread across many servers.

The primary contributions of this work are enabling scalable causal consistency for the complex column-family data model, as well as novel, non-blocking algorithms for both read-only and write-only transactions. Our evaluation shows that our system, Eiger, achieves low latency (single-ms), has throughput competitive with eventually-consistent and non-transactional Cassandra (less than 7% overhead for one of Facebook’s real-world workloads), and scales out to large clusters almost linearly (averaging 96% increases up to 128 server clusters).



Don’t Settle for Eventual Consistency: Stronger properties for low-latency geo-replicated storage
Wyatt Lloyd, Michael J. Freedman, Michael Kaminsky, David G. Andersen
Communications of the ACM
(CACM) Vol. 57, No. 5, May 2014
(Also in ACM Queue) Vol 12, No. 3, March 2014.
[ pdf ]

A Short Primer on Causal Consistency
Wyatt Lloyd, Michael J. Freedman, Michael Kaminsky, David G. Andersen
;login: The USENIX Magazine
;login:) Vol 38, Number 4, August 2013.
[ pdf ]

Stronger Semantics for Low-Latency Geo-Replicated Storage
Wyatt Lloyd, Michael J. Freedman, Michael Kaminsky, David G. Andersen
Proc. 10th Symposium on Networked Systems Design and Implementation
NSDI ’13) Lombard, IL, April 2013.
[ pdf ] [ ps ] [ video]

Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS
Wyatt Lloyd, Michael J. Freedman, Michael Kaminsky, David G. Andersen
Proc. 23rd ACM Symposium on Operating Systems Principles
(SOSP ’11) Cascais, Portugal, October 2011.
[ pdf ] [ ps ] [slides] [video] [poster]


Related Links

High Scalability Post on COPS
High Scalability Followup Post
Read Write Web Post on COPS
Longer answers to questions asked at the talk