Skip to main content

ScyllaDB in Action Book Excerpt: ScyllaDB, a Distributed Database

How does ScyllaDB provide scalability and fault tolerance by distributing its data across multiple nodes? Read what Bo Ingram (Staff Engineer at Discord) has to say – in this excerpt from the book “ScyllaDB in Action.” Editor’s note: We’re honored to share the following excerpt from Bo Ingram’s informative – and fun! – new book on ScyllaDB: ScyllaDB in Action. You might have already experienced Bo’s expertise and engaging communication style in his blog How Discord Stores Trillions of Messages or ScyllaDB Summit talks How Discord Migrated Trillions of Messages from Cassandra to ScyllaDB and  So You’ve Lost Quorum: Lessons From Accidental Downtime  If not, you should 😉 You can purchase the full 370-page book from Manning.com. You can also access a 122-page early-release digital copy for free, compliments of ScyllaDB. The book excerpt includes a discount code for 45% off the complete book. Get the 122-page PDF for free The following is an excerpt from Chapter 1; it’s reprinted here with permission of the publisher. *** ScyllaDB runs multiple nodes, making it a distributed system. By spreading its data across its deployment, it uses that to achieve its desired availability and consistency, which, when combined, differentiates the database from other systems. All distributed systems have a bar to meet: they must deliver enough value to overcome the introduced complexity. ScyllaDB, designed to be a distributed system, achieves its scalability and fault tolerance through this design. When users write data to ScyllaDB, they start by contacting any node. Many systems follow a leader-follower topology, where one node is designated as a leader, giving it special responsibilities within the system. If the leader dies, a new leader is elected, and the system continues operating. ScyllaDB does not follow this model; each node is as special as any other. Without a centralized coordinator deciding who stores what, each node must know where any given piece of data should be stored. Internally, Scylla can map a given row to the node that owns it, forwarding requests to the appropriate nodes by calculating its owner using the hash ring that you’ll learn about in chapter 3. To provide fault tolerance, ScyllaDB not only distributes data but replicates it across multiple nodes. The database stores a row in multiple locations – the amount depends upon the configured replication factor. In a perfect world, each node acknowledges every request instantly every time, but what happens if it doesn’t? To help with unexpected trouble, the database provides tunable consistency. How you query data is dependent on what degree of consistency you’re looking to get. ScyllaDB is an eventually consistent database, and you perhaps will see inconsistent data as the system converges toward consistency. Developers must keep this eventual consistency in mind when working with the database. To facilitate the various needs of consistency, ScyllaDB provides a variety of consistency levels for queries, including those listed in table 1.1. With a consistency level of ALL, you can require that all replicas for a key acknowledge a query, but this setting harms availability. You can no longer tolerate the loss of a node. With a consistency level of ONE, you require a single replica for a key to acknowledge a query, but this greatly increases our chances of inconsistent results. Luckily, some options aren’t as extreme. ScyllaDB lets you tune consistency via the concept of quorums. A quorum is when a group has a majority of members. Legislative bodies, such as the US Senate, do not operate when the number of members present is below the quorum threshold. When applied to ScyllaDB, you can achieve intermediate forms of consistency. With a QUORUM consistency level, the database requires a majority of replicas for a key to acknowledge a query. If you have three replicas, two of them must accept every read and every write. If you lose one node, you can still rely on the other two to keep serving traffic. You additionally guarantee that a majority of your nodes get every update, preventing inconsistent data if you use the same consistency level when reading. Once you have picked your consistency level, you know how many replicas you need to execute a successful query. A client sends a request to a node, which serves as the coordinator for that query. Your coordinator node reaches out to the replicas for the given key, including itself if it is a replica. Those replicas return results to the coordinator, and the coordinator evaluates them according to our consistency. If it finds the result satisfies the consistency requirements, it returns the result to the caller. The CAP theorem (https://en.wikipedia.org/wiki/CAP_theorem) classifies distributed systems by saying that they cannot provide all three of these properties – consistency, availability, and network partition tolerance, as seen in figure 1.8. For the CAP theorem’s purposes, we define consistency as every request reading the most recent write; it’s a measure of correctness within the database. Availability is whether the system can serve requests, and network partition tolerance is the ability to handle a disconnected node. Figure 1.8 The CAP theorem says a database can only provide two of three properties — consistency, availability, and partition tolerance. ScyllaDB is classified as an AP system. According to the CAP theorem, a distributed system must have partition tolerance, so it ultimately chooses between consistency and availability. If a system is consistent, it must be impossible to read inconsistent data. To achieve consistency, it must ensure that all nodes receive all necessary copies of data. This requirement means it cannot tolerate the loss of a node, therefore losing availability. TIP: In practice, systems aren’t as rigidly classified as the CAP theorem suggests. For a more nuanced discussion of these properties, you can research the PACELC theorem (https://en.wikipedia.org/wiki/PACELC_theorem), which illustrates how systems make partial tradeoffs between latency and consistency. ScyllaDB is typically classified as an AP system. When encountered with a network partition, it chooses to sacrifice consistency and maintain availability. You can see this in its design – ScyllaDB repeatedly makes choices, via quorums and eventual consistency, to keep the system up and running in exchange for potentially weaker consistency. By emphasizing availability, you see one of ScyllaDB’s differentiators against its most popular competition — relational databases.