Corfu: A Platform for Scalable Consistency

by Michael Wei

Institution: University of California San Diego
Year: 2017
Keywords: Computer science; Computer engineering
Posted: 02/01/2018
Record ID: 2175744
Full text PDF: http://www.escholarship.org/uc/item/8g8078t2


Corfu is a platform for building systems which are extremely scalable, strongly consistent and robust. Unlike other systems which weaken guarantees to provide better performance, we have built Corfu with a resilient fabric tuned and engineered for scalability and strong consistency at its core: the Corfu shared log. On top of the Corfu log, we have built a layer of advanced data services which leverage the properties of the Corfu log. Today, Corfu is already replacing data platforms in commercial products, and the thriving Corfu open source code base enjoys regular contributions from a number of industrial and academic institutions.One of the key properties of Corfu is consistency, a highly desirable property which simplifies programming complex, asynchronous distributed systems by increasing the number of assumptions a programmer can make about how a system will behave. For years, system designers focused on providing the strongest possible guarantees on top of unreliable and even malicious systems. The rise of the Internet and cloud-scale computing, however, shifted the focus of system designers towards scalability. In a rush to meet the needs of cloud-scale workloads, system designers realized that if they weaken the consistency guarantees they provided, they would greatly increase the scalability of their systems. As a result, designers simplified the guarantees provided by their systems and weaker consistency models such as eventual consistency emerged, greatly increasing the burden on developers leading to error-prone applications. Programmers in the cloud era are forced to choose between consistency and scalability.Corfu, the topic of this dissertation, is a platform for scalable consistency. Corfu answers the question: ``If we were to build a distributed system from scratch, taking into consideration both the desire for consistency and the need for scalability, what would it look like?''. The answer lies in the Corfu distributed log. We begin by introducing the Corfu distributed log. Corfu achieves strong consistency by presenting the abstraction of a log clients may read from anywhere in the log but they may only append to the end of the log. The ordering of updates on the log is decided by a high throughput sequencer, which we show can handle nearly a million requests per second. The log is scalable as every update to the log is replicated independently, and every client appending to the log merely needs to acquire a token before beginning replication. This means that we can scale the log by merely adding replicas, and our only limit is the rate of requests the sequencer can handle.While building a single distributed log already provides strong consistency and scalability, multiple applications may wish to share the same log. By sharing the same log, updates across multiple applications can be ordered with respect to one another, which form the basic building block for advanced operations such as transactions. This dissertation details two designs for virtualizing the log: emph{streaming}, which