CMPS 290S Reading Response: Linearizability: A Correctness Condition for Concurrent Objects

Summary

Herlihy and Wing formally define a consistency model, linearizability, and then explore some properties and scenarios of linearizable operations on a given shared data structure. The consistency model seems to be motivated by wanting to define consistent operations from a data-centric perspective, which improves concurrent performance in a few aspects: scope and latency (reducing latency by reducing blocking operations). This paper seems to be very useful, though, in how it formalizes linearizability. The formalizations seem very expressive and powerful. Having just learned about compactness theorems, it also seems extremely useful to have a formalism that defines compactness for locality and blocking properties of the consistency model. Further, it compares linearizability to other consistency models, and in so doing defines transactions and processes as different types of “threads of control”, and thus gives a reasonable grounding for understanding consistency models over transactions compared to processes.

Learning and Understanding

I originally thought that the relationship between linearizability and serializability was based on ordering of operations outside, or inside, of a transaction scope, respectively. But, I think I now realize that the relationship is actually based on the scope of operations (or history subsequence) that may fail or be aborted. For linearizability, operations individually succeed or fail, and are only considered with respect to pre- and post-conditions on a shared data object at the time of operation invocation. On the other hand, serializability allows for a sequence operations to succeed or fail with respect to (logically) pre- and post-conditions on a shared data object.

While I understand the property of non-blocking, I’m not clear on the larger implications of this property. It seems that linearizability is non-blocking in the sense that invocations are not dependent on a preceding response. This feels like a weak definition of non-blocking, in the sense that it is most useful for writes, where “important state” is local to the shared data object. Whereas, for non-mutating reads (a peek vs a dequeue, for example), “important state” is local to the process. So in this case, the implication of non-blocking for invocation and response seems misleading?

Research Question and What to Investigate

Linearizability, and potentially consistency in general, seems to usually take perspective on a particular data object, which is shared. I don’t get the sense that where that data object is replicated is considered, and it doesn’t seem like data flow, or where state gets propagated, is mentioned either. It makes me wonder what research has been to trace the flow of state of a shared data object (consistency of a data object) to replicas of partial data objects (consistency of data object as it is read by a remote server/machine or is replicated to remote locations). I suppose that maybe consistency is a separate property from replicas, but I wonder if formalisms on consistency are applicable to understanding replication?

Considering the relationship between consistency and availability, if one poses replication as how available a shared data object is, perhaps it’s possible to use probabilities over the formal definition of linearizability to understand both the consistency and availability aspects of linearizability and other consistency models. Perhaps understanding this extension of these formalisms would be searchable under markov logic for consistency models, or something that could be intuitively explored? I imagine formally exploring markov logic would be difficult, and not a good first approach to exploring this research question.