CMPS 278 Paper Evaluation: F1 - A Distributed SQL Database That Scales

Review

At one point, Spanner was regarded as so consistent and so available a distributed database management system that it made people wonder whether CAP was still a 2-of-3 theorem. F1 is a relational layer on top of Spanner that handles query processing and data model specification. I think that even these days it is very uncommon for a database system to implement a single layer and rely on another database system for lower-level functionality. The only other instance I can think of is Facebook's replacement of MySQL's storage engine with the RocksDB storage engine, but even this was a modification of an existing database system, rather than one system designed and implemented to handle only the relational model and query processing and another to handle storage and low-level mechanisms (logging, etc.).
For two database systems that were designed in close collaboration, I find it interesting that data partitioning is done only via hash partitioning, and never via range partitioning, because of Spanner's implementation. Beyond that, I wonder whether the interleaving of data rows in hierarchical tables interacts poorly with the data partitioning used for distributed queries, making it sub-optimal.
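
To make the concern concrete, here is a minimal sketch of the two partitioning schemes and of how hashing on a non-root key can ignore an interleaved layout. It is only an illustration and not F1's or Spanner's actual code: the function names, the (customer_id, campaign_id) row shape, the partition count, and the split points are all assumptions of mine.

```python
import hashlib
from bisect import bisect_right


def hash_partition(key, num_partitions):
    """Map a key to a partition by hashing it (stable across runs)."""
    digest = hashlib.sha256(repr(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def range_partition(key, split_points):
    """Map a key to a partition by its position among sorted split points."""
    return bisect_right(split_points, key)


def group_rows(rows, partition_of):
    """Group rows by the partition that partition_of(row) assigns them."""
    groups = {}
    for row in rows:
        groups.setdefault(partition_of(row), []).append(row)
    return groups


# Hypothetical interleaved rows: (customer_id, campaign_id), with child
# rows stored next to their root customer row.
rows = [(1, 10), (1, 11), (2, 20), (2, 21), (3, 30)]

# Hashing on the root key keeps each customer's rows together.
by_customer = group_rows(rows, lambda r: hash_partition(r[0], 4))

# Hashing on the child key (e.g. for a join on campaign_id) ignores the
# interleaved layout, so a customer's rows may land in different partitions.
by_campaign = group_rows(rows, lambda r: hash_partition(r[1], 4))

# Range partitioning, for contrast: contiguous key ranges stay together.
by_range = group_rows(rows, lambda r: range_partition(r[0], [2, 3]))
```

In this sketch, rows that sit next to each other in the hierarchy are only guaranteed to stay together when the partitioning key is the root key, which is the interaction I am wondering about.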