Reading Summary 2019-04-24

by Aldrin Montana

Towards a Cloud Computing Research Agenda

Relevant Readings

Overall Evaluation

I very much like the overall message of this paper, even though it implies that the “traditional” directions of the research community are clearly off-track. This reminds me of the database research community, which focused on sensor networks far too early and missed out on important industry trends.

The paper begins with a definition of cloud computing that seems adequate, but I believe it is too operational and not particularly principled. Below is a diagram of my own making that I will use as a reference when describing the paper’s definition of cloud computing.

First, I will use the terms “core” and “surface” to distinguish the inside and outside perspectives on cloud computing, respectively. The ASCII cloud art comes from an ASCII art website; the annotations are mine.

                  Surface
                  .-~~~-.
          .- ~ ~-(       )_ _
        /                     ~ -.
Surface| <------>  Core  <------>  \ Surface
        \       (tHe cLOuD)       .'
          ~- . _____________ . -~
                  Surface

Second, I provide a diagram that collapses the above idea into a 2-dimensional, layered layout:

                                  +---------+                +-------------+
##############   External         |         | Internal       |             |
# Client     #   Communication    | Surface | Communication  |  Core       |
# (outskirt) #   <------------>   | (edge)  | <------------> |  (central)  |
##############                    |         |                |             |
                                  +---------+                +-------------+

The cloud computing definition is given in 4 parts:

The first part describes what a cloud looks like from a client machine, citing Wikipedia.
The second part describes standards, or typical approaches, that developers expect of the external communication (how the client talks to some cloud servers at the edge).
The third part describes how the surface and core servers of the cloud are typically architected: how system state is managed (configuration), how client data is managed (storage systems and databases), how the servers are managed (hardware/virtual platform provisioning), and what the communication substrate is (the networking infrastructure).
The fourth part describes the architecture and logic, provided by the core servers and infrastructure, for long-term maintainability: how tasks are scheduled, how storage is managed (archival, etc.).
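
To make the layering concrete, here is a minimal Python sketch of my reading of this definition. Nothing in it comes from the paper: the class names (Core, Surface), the "put"/"get" operations, and the "archive" task are all hypothetical, chosen only to mirror the client, surface, and core boxes in the diagram above.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    """Central servers: hold client data and queue long-term upkeep tasks."""
    store: dict = field(default_factory=dict)
    scheduled: list = field(default_factory=list)

    def handle(self, op: str, key: str, value=None):
        if op == "put":
            self.store[key] = value
            # Part four of the definition: maintenance work is scheduled
            # internally (here, a pretend archival task).
            self.scheduled.append(("archive", key))
            return "ok"
        if op == "get":
            return self.store.get(key)
        raise ValueError(f"unknown operation: {op}")


@dataclass
class Surface:
    """Edge servers: the only part of the cloud a client talks to."""
    core: Core

    def request(self, op: str, key: str, value=None):
        # External communication ends here; internal communication begins.
        return self.core.handle(op, key, value)


# The client (outskirt) only ever holds a reference to the surface.
core = Core()
edge = Surface(core)
edge.request("put", "report", "draft 1")
print(edge.request("get", "report"))
```

The client never touches Core directly, which is the point of separating the outside view (parts one and two) from the inside view (parts three and four).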

At first, the above definition seemed important, but the rest of the paper proved more interesting: it primarily discusses the difference in values between the industry and research communities, and then how the research community could pivot to align more closely with industry.

Strong Points

The paper is well written and easy to read. This is not a research paper, and in this kind of writing it is easy to strike a poor balance between technical detail and a good description of the big picture. This paper goes into just enough detail to show exactly what differentiates industry values from research values, and which of the industry values hold research interest.
I think the running examples in the paper were very valuable for illustrating the authors’ points, especially the details regarding Chubby.

Weak Points

I did not like the overall message that the research community’s goals should align with industry’s and that the other direction is less important. I do understand that, as capitalist elites erode the common good and devalue research overall, the balance of funding sources tilts more towards corporations than towards the government or the public. Still, I think the paper should have said more about how research directions could influence industry interests, or about how a greater effort could be made to connect with general users and improve the interactions between application developers, on the client side, and the surface or core servers of cloud systems.
I think the claim that there is “no obvious value to ‘transparent online migration’” is fairly inaccurate. Contrary to what large companies may think, some systems have users who don’t want to be slowed down by the technical issues that may accompany reboots, etc. For these systems, and for scenarios such as planned maintenance, transparent online migration could be very useful.

Questions Raised

I cannot fathom why, after the FLP impossibility result, the research community would continue to pound away at consensus protocols. I thought it would be natural to shift toward minimizing consistency requirements once the FLP result was understood. But I suppose I’m also very biased, as I have already read a lot of work on various consistency protocols.
If distributed systems research started at Bell Labs and other companies that were very influential before Microsoft and Google had built large distributed systems, how did the research community drift so far from the industry community? Is there still a distributed systems community gathering that spans both industry and research?