Geo-replicated, distributed data stores that support complex online applications, such as social networks, must provide an “always-on” experience where operations always complete with low latency. Today’s systems often sacrifice strong consistency to achieve these goals, exposing inconsistencies to their clients and necessitating complex application logic. Theoretical results show that the strongest consistency models (linearizability and sequential consistency) are fundamentally incompatible with our other goals, so we identify a new consistency model — causal consistency with convergent conflict handling, or causal+ — that is compatible.
Our system, COPS, is a key-value store that delivers this consistency model across the wide-area. A key contribution of COPS is its scalability, which can enforce causal dependencies between keys stored across an entire cluster, rather than a single server like previous systems. Scaling the size of each cluster brings causal consistency to a new set of applications, such as social networks like Facebook and Twitter, whose data is far too large and query rates far too high to be handled by single machine clusters.
The central approach in COPS involves explicitly tracking and enforcing causal dependencies between updates. For instance, if you upload a photo and add it to an album, the album update “depends on” the photo addition, and should only be applied after it. Writes in COPS are accepted by a local datacenter that then propagates them to other, remote, datacenters. These remote datacenters check that all dependencies are satisfied by querying other nodes in the cluster before applying writes. This approach differs from traditional causal systems that exchange update logs between replicas. In particular, the COPS approach avoids any single serialization point to collect, transmit, merge, or apply logs. Avoiding single serialization points is a major factor in enabling COPS to scale to large cluster sizes.
Even though COPS provides a causal+ consistent data store, it is impossible for clients to obtain a consistent view of multiple keys by issuing single-key gets. (This problem exists even in linearizable systems.) In COPS-GT, we enable clients to issue get transactions that return a set of consistent values. Our get transaction algorithm is non-blocking, lock-free, and takes at most two rounds of intra-datacenter queries. It does, however, require COPS-GT to store and propagate more metadata than normal COPS.
Our evaluation shows that COPS completes operations in less than a millisecond, provides throughput similar to previous systems when using one server per cluster, and scales well as we increase the number of servers in each cluster. It also shows that COPS-GT provides similar latency, throughput, and scaling to COPS for common workloads.
Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS
Wyatt Lloyd, Michael J. Freedman, Michael Kaminsky, David G. Andersen
Proc. 23rd ACM Symposium on Operating Systems Principles
(SOSP ’11) Cascais, Portugal, October 2011. [ pdf | ps ] [slides] [video] [poster]