Lloyd, Freedman, et al. detail a consistency model that is stronger than causal consistency, weaker than sequential consistency, yet scalable. The consistency model, causal + convergent conflict handling consistency, adds mechanisms to allow for conflicts to be handled in a convergent (can be merged) way. The authors also describe the types of applications that this consistency model is designed for: Available, low Latency, Partition tolerant, and Scalable (ALPS). Interestingly, the authors mention that convergent conflict handling isn’t a property that they first found, but that they have defined this strengthening of causal consistency in a way that is scalable.
It seems to me that the authors achieve scalability in their key-value store by actually using linearizability, and assuming no network partitions, within a data center, but keeping enough information so that consistency between data centers can, at least, satisfy causal+ consistency. I find it interesting that they do not specify this earlier in the paper, and it’s a detail that they let speak for itself as they describe their system. My impression is that this means that in the absence of network partitions and latency, their system should actually be linearizable, available, and scalable. Then, the weakest consistency they provide, under network partitions or high latency, is causal+, where operations occuring on either side of the partition maintain enough information so that recovery is able to merge state in a relatively straightforward manner.
While the 12 years later paper by Eric Brewer suggested an explicit recovery mode, it seems to me that COPS and COPS-GT simply maintain enough state that merging state is trivial in the common case, and fast in the case of recovery from a network partition. It seems clear that network partitions can only be identified after they have happened, so the necessary state needs to be tracked the whole time.
I wonder if there is any work in sensing when network latency is increasing or becoming less stable, and if that can be used to only track necessary state (for consistency) when a network partition is more likely to occur, rather than always maintaining state. This also makes me wonder if it would be possible to de-couple a metadata system from a storage system/data structure implementation in a way that the data store can maintain high scalability and performance and low complexity, while the metadata system acts as a monitoring system of sorts, and only tracks metadata for data (stored in the data store) at times when a network partition may occur. Then, the system can provide 2 data acccess paths: (1) standard access through the data store, (2) access via the metadata system that retrieves consistent data, or data that is currently undergoing recovery.