This is a repository for my research, paper reading summaries/reviews, and relevant blog-like posts in markdown.

What is the problem the authors are trying to solve?

The authors were trying to improve the utilization of disk bandwidth for data writes while at the same time minimizing the time required for crash recovery.

What other approaches or solutions existed at the time that this work was done?

An alternative approach mentioned in the paper is the Unix FFS (fast file system), which the authors identify as having efficiency problems.

What was wrong with the other approaches or solutions?

The main problem discussed regarding the Unix FFS is that write operations require many seeks and round trips to the disk, which places a hard limit on how performant storage access can be. The authors mention that creating a new file requires at least five separate disk I/Os, each preceded by a seek: two accesses to the file's inode, plus one access each for the file's data, the directory's data, and the directory's inode.
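To make the cost concrete, here is a back-of-envelope sketch; the seek time and transfer rate are round numbers I am assuming for illustration, not figures from the paper.

```python
# Back-of-envelope sketch with hypothetical numbers (not figures from the
# paper) of why seek-dominated small writes waste disk bandwidth.
SEEK_MS = 10.0        # assumed average seek + rotational delay
XFER_MB_PER_S = 4.0   # assumed sequential transfer rate of a disk of the era
WRITE_KB = 4.0        # one small write, e.g. an inode or directory block

xfer_ms = WRITE_KB / 1024 / XFER_MB_PER_S * 1000
utilization = xfer_ms / (SEEK_MS + xfer_ms)
print(f"disk transfers data only {utilization:.1%} of the time")
# -> ~9% here; at least five such I/Os per file creation compounds the waste
```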

What is the authors’ approach or solution?

The authors' approach relies heavily on the file cache for data writes: small writes are buffered in memory and then flushed to disk sequentially as one large append to an on-disk log, which amortizes small file accesses and minimizes round trips to the disk overall.
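As a rough illustration of the core idea (a minimal sketch, not Sprite LFS's actual code; the segment size and record format are assumptions I made for illustration):

```python
# Minimal sketch of write buffering in a log-structured FS: accumulate many
# small writes in memory, then commit them in one large sequential append.
SEGMENT_SIZE = 512 * 1024  # assumed segment size, chosen for illustration

class LogBuffer:
    def __init__(self, log_file):
        self.log = log_file   # binary file opened in append mode
        self.pending = []     # buffered (inode_no, offset, data) records
        self.bytes = 0

    def write(self, inode_no, offset, data):
        self.pending.append((inode_no, offset, data))
        self.bytes += len(data)
        if self.bytes >= SEGMENT_SIZE:
            self.flush()

    def flush(self):
        # One sequential append replaces many scattered seek-plus-write pairs.
        segment = b"".join(
            f"{ino}:{off}:{len(d)}\n".encode() + d
            for ino, off, d in self.pending
        )
        self.log.write(segment)
        self.log.flush()
        self.pending.clear()
        self.bytes = 0
```

In the real system, the flushed segment also carries the files' inodes and summary metadata so the file system can locate the data again later.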

Why is it better than the other approaches or solutions?

This approach greatly improves write throughput and is conceptually simple. The difficulty lies in two specific problems the authors call out: keeping large extents of free space available on disk for the next buffered write (addressed by segment cleaning), and efficiently reading data back out of a log (addressed by the inode map).
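For the read problem, the paper's answer is the inode map; here is a toy sketch of the lookup path (the function names are mine, not the paper's):

```python
# Toy sketch of the inode map idea: because inodes move every time they are
# rewritten at the head of the log, LFS keeps a map from inode number to the
# newest on-disk inode address. After this one lookup, reads proceed as in a
# conventional FS (inode -> block pointers -> data).
inode_map: dict[int, int] = {}  # inode number -> disk address of newest inode

def record_inode(inode_no: int, disk_addr: int) -> None:
    # Called whenever an inode is appended to the log at a new address.
    inode_map[inode_no] = disk_addr

def inode_address(inode_no: int) -> int:
    # Entry point for a read: find where the file's current inode lives.
    return inode_map[inode_no]
```

Free space is the harder of the two problems, and the paper spends most of its effort on segment cleaning policies.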

How does it perform?

Comparing Sprite LFS (Log-structured File System) against SunOS (whose file system is based on Unix FFS), the authors' method vastly outperforms the FFS-based file system on file creation and deletion, and shows only a ~30% increase in read throughput (measured in files/sec, not bytes transferred). In the large-file locality experiments (sequential write then read; random write then read; a final sequential read), Sprite LFS performs better on writes (roughly 30% and 100% improvement for sequential and random writes, respectively) but performs about evenly on reads, except when data is written randomly and then read sequentially, where the log layout scatters the blocks and Sprite LFS falls behind.

Why is this work important?

This work has proven extremely important for the change it has effected in the database and storage communities: treating data storage as a log. This view gave rise to the log-structured merge tree, which is now common in modern databases. At the time, I believe the paper mattered for the paradigm shift, but even more so for the performance improvement it offered workloads requiring high write throughput.

This paper also argues that large file accesses are constrained more by hardware (at least at the time) than by the file system, so by focusing on small file accesses the file system can have a more noticeable impact on users than by optimizing large file accesses, which were far less common then.

3+ comments/questions

  1. With the growth and ubiquity of media files (photos, videos, rendered graphics), I wonder whether log-structured file systems or log-structured merge trees could improve the throughput and management of large files, which the authors did not consider important at the time.

  2. I wonder whether a hybrid approach was attempted shortly after this paper, where developers of the Unix FFS (or the FS used in SunOS) leaned more heavily on the file cache to amortize file accesses while keeping a similar scheme of methodically placing files on disk for fast access. That way, I/O operations might have been greatly reduced while retaining some of the traditional benefits. Although, judging from the benchmarks, the traditional approach didn't hold many advantages over the log-structured FS anyway.

  3. The ext family is not log-structured, yet ext is likely far more common than log-structured file systems in consumer installations. I wonder whether there was resistance to adopting a log-structured approach because it would have required substantial change, while consumers perhaps never needed the performance boost enough to justify the transition.