CMPS 290S Reading Response: Time, Clocks, and the Ordering of Events in a Distributed System

Summary

In this paper, Leslie Lamport wants to re-frame, and formalize, what it means for some event A to occur before event B. He starts from the insight that the order in which events occur is relative rather than absolute. Because our everyday notion of time is so ingrained in human thinking, it pollutes how we reason about the ordering of events in a distributed system. Lamport first defines a "happened before" relation that totally orders events within a process but only partially orders events across processes, and proposes logical clocks that respect it. Then, to handle the scenario where events external to the system carry information needed to order events within the system more correctly, he extends his proposal to physical clocks.
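The logical clock rules can be sketched in a few lines. This is a minimal illustration rather than code from the paper: the class name `LamportClock`, its method names, and the `total_order_key` helper are assumptions made for the example, and advancing to `max(local, received) + 1` on receipt is one common way to satisfy the paper's receive rule.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class LamportClock:
    """Hypothetical per-process logical clock (names are illustrative)."""

    pid: int
    time: int = 0

    def tick(self) -> int:
        # Advance the counter between successive local events.
        self.time += 1
        return self.time

    def send(self) -> int:
        # Sending is itself an event; the message carries its timestamp.
        return self.tick()

    def receive(self, msg_time: int) -> int:
        # Move strictly past the message timestamp, and never backwards.
        self.time = max(self.time, msg_time) + 1
        return self.time


def total_order_key(timestamp: int, pid: int) -> tuple[int, int]:
    # Equal timestamps are ordered by an arbitrary but fixed process order.
    return (timestamp, pid)


p, q = LamportClock(pid=1), LamportClock(pid=2)
a = p.tick()        # local event on p
t_m = p.send()      # p sends a message stamped t_m
b = q.receive(t_m)  # q receives it
```

Here `a < t_m < b`, so the timestamps respect the order in which the events causally happened. Sorting events by `total_order_key` extends that partial order to a total one.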

Learning and Understanding

The paper notes that physical clocks will always drift apart, and that when a physical clock is adjusted it can only be moved forward. In one way this seems obvious: setting a clock back would mean "replaying" a period of time. In another way it is interesting, because it means that clock drift is handled by moving slow clocks forward. This seems to require some order in which clocks are advanced; otherwise a slow clock could be moved ahead of another clock, which could violate some of the clock conditions. Updating clocks at all could be very tricky to balance with calculating the minimum delay μ_m.
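The forward-only adjustment can be sketched as follows. This assumes the paper's receive rule for physical clocks, where the receiver takes the maximum of its own reading and the message timestamp plus the known minimum delay. The function and parameter names are hypothetical, and a real clock would also keep running between messages.

```python
def adjust_on_receive(local_clock: float, msg_timestamp: float, min_delay: float) -> float:
    """Return the receiver's clock after a message arrives.

    local_clock   -- receiver's reading just before the message arrives
    msg_timestamp -- sender's clock reading when the message was sent (T_m)
    min_delay     -- known lower bound on the transmission delay (mu_m)
    """
    if min_delay < 0:
        raise ValueError("the minimum delay must be non-negative")
    # A slow clock is advanced; a fast clock is left alone. It is never set back.
    return max(local_clock, msg_timestamp + min_delay)


slow = adjust_on_receive(local_clock=10.0, msg_timestamp=10.5, min_delay=0.25)
fast = adjust_on_receive(local_clock=12.0, msg_timestamp=10.5, min_delay=0.25)
```

In this sketch the slow receiver is advanced only as far as the sender's timestamp plus the minimum delay, and the fast receiver keeps its own reading.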

I am really not sure what the formal distinction is between the "clock condition" and the "strong clock condition". It seems really subtle, though I do notice a difference in the context in which the two events, a and b, are considered: the clock condition says any two events, a and b, whereas the strong clock condition says any two events, a and b, in a given set of events. Perhaps the "strong" qualifier refers to how strong the condition is within (or for) a system, not how strong the conditions are compared to each other? My best guess is that the strong clock condition defines a context (the set of all system events), and so the condition is stronger because it has that context, whereas the clock condition applies to any two events, and Lamport says there may be anomalous behavior related to events outside of the system. So I'm not sure, but I think the distinction is subtle, and I would be curious whether the strong clock condition is more applicable or simply reflects practical knowledge.
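Written side by side, the two conditions look roughly as follows. This is a transcription of the paper's statements in which the plain arrow is "happened before" on the system events, and the bold arrow is "happened before" on a larger set that also contains relevant external events. The symbols only approximate the paper's typography.

```latex
\begin{align*}
\text{Clock Condition:} \quad
  & \forall a, b: && a \rightarrow b \implies C(a) < C(b) \\
\text{Strong Clock Condition:} \quad
  & \forall a, b \in \mathcal{S}: && a \boldsymbol{\rightarrow} b \implies C(a) < C(b)
\end{align*}
```

On this reading, both conditions are about system events. What changes is the relation in the premise, which in the strong version also accounts for orderings established through events outside the system.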

Research Question and What to Investigate

In thinking about vector clocks from Monday (2018-10-08), and how vector clocks are dense representations of logical clocks for a distributed system, it occurred to me that Lamport clocks are incredibly flexible, even if weak. For large systems, I feel that vector clocks would be hard to implement, but in conjunction with Lamport clocks it may be possible, even useful, to treat a cluster of machines as having a single Lamport clock for communication between clusters, while using vector clocks within each cluster. This further makes me wonder whether Lamport clocks are used at multiple granularities because they need so little information, and at what granularities Lamport clocks would be much preferred to vector clocks.
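The information trade-off between the two kinds of clock can be made concrete with a small sketch. It assumes the usual textbook formulation of vector clocks (one counter per process, element-wise maximum on receipt), and all function names are illustrative.

```python
from typing import List


def vector_receive(local: List[int], received: List[int], me: int) -> List[int]:
    # Element-wise maximum of both vectors, then count the receive event itself.
    merged = [max(l, r) for l, r in zip(local, received)]
    merged[me] += 1
    return merged


def happened_before(u: List[int], v: List[int]) -> bool:
    # u precedes v when no entry of u exceeds v's and the vectors differ.
    return all(x <= y for x, y in zip(u, v)) and u != v


def concurrent(u: List[int], v: List[int]) -> bool:
    # Neither vector precedes the other: the events are causally unrelated.
    return u != v and not happened_before(u, v) and not happened_before(v, u)


a = [1, 0, 0]  # first event on process 0
b = [0, 1, 0]  # independent first event on process 1
c = vector_receive(b, a, me=1)  # process 1 receives a message stamped with a

a_and_b_concurrent = concurrent(a, b)
a_before_c = happened_before(a, c)
```

With Lamport clocks, `a` and `b` would both carry the single integer 1, which cannot reveal that they are concurrent. The vector timestamps can, at the cost of one counter per process in every timestamp. That per-process cost is what grows with system size.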

I think there would be some difficult subtleties in implementing a system that combines Lamport clocks with vector clocks, so the easiest way to investigate this idea would be to look at research about large systems. Monitoring systems are likely the most relevant, since they may monitor many disparate systems that run on different clusters (at least virtually), and yet they need high-level knowledge of all of those systems and how they interact. Similarly, orchestrators may be far enough removed from any one system that similar research exists for them.