Sizing Considerations for ScyllaDB
Thank you for choosing ScyllaDB, the monstrously fast and scalable database. Here are some guidelines for you to select the appropriate infrastructure and topology for your workload:
A ScyllaDB cluster is a collection of nodes, visualized as a ring. The ring of a ScyllaDB cluster is known as the "token ring". Each node owns a portion of it in order to balance the data across the cluster. For example, in the diagram presented, the token ring is divided into 3 ranges:
- 1 to 400
- 401 to 800
- 801 to 0
Each of these ranges is called a "token range" and represents a fraction of the entire token ring.
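As a rough illustration of how a token maps to its owner, the three ranges above can be turned into a simple lookup. This is only a sketch: the node names are hypothetical, and real tokens are computed by hashing the partition key and span a far larger range than this toy ring.

```python
from bisect import bisect_left

# Inclusive upper bounds of the first two ranges (1-400 and 401-800).
# Anything past 800, plus token 0, falls in the wrap-around range "801 to 0".
RANGE_ENDS = [400, 800]
OWNERS = ["node-1", "node-2", "node-3"]  # hypothetical node names


def owner_of(token: int) -> str:
    """Return the node owning `token` in the three-range example ring."""
    if token == 0:
        # Token 0 closes the wrap-around range owned by the last node.
        return OWNERS[-1]
    return OWNERS[bisect_left(RANGE_ENDS, token)]


if __name__ == "__main__":
    for t in (1, 400, 401, 800, 801, 0):
        print(t, owner_of(t))
```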
In reality, ScyllaDB further extends the previous concept with the notion of Vnodes (or Virtual Nodes), which break up the available range of tokens into smaller ranges. In the picture, all red circles represent the Vnodes for a single node. Similarly, all yellow circles represent the Vnodes for a different node, and so on. By default, ScyllaDB assigns a total of 256 token ranges to each node in the token ring. As a result, the more nodes we have, the more Vnodes we will have in the token ring, and the data and workload will be balanced across all cluster members. This is what allows ScyllaDB to keep scaling horizontally as nodes are added.
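To get a feel for why many small ranges per node even out the load, here is a small simulation. It is a sketch under stated assumptions: tokens are drawn uniformly at random, the ring size of 2**64 is illustrative, and `ring_shares` is a hypothetical helper, not how ScyllaDB actually allocates tokens.

```python
import random

VNODES_PER_NODE = 256  # default number of token ranges per node, per the text
RING_SIZE = 2**64      # assumption: illustrative ring size


def ring_shares(num_nodes: int, seed: int = 0) -> list:
    """Return the fraction of the ring owned by each node."""
    rng = random.Random(seed)
    tokens = []
    for node in range(num_nodes):
        for _ in range(VNODES_PER_NODE):
            tokens.append((rng.randrange(RING_SIZE), node))
    tokens.sort()

    owned = [0] * num_nodes
    # Each Vnode owns the range between the previous token and its own token;
    # the first Vnode wraps around from the last token on the ring.
    prev = tokens[-1][0] - RING_SIZE
    for token, node in tokens:
        owned[node] += token - prev
        prev = token
    return [o / RING_SIZE for o in owned]


if __name__ == "__main__":
    for node, share in enumerate(ring_shares(6)):
        print(f"node {node}: {share:.1%} of the ring")
```

With six nodes, each share should land close to one sixth of the ring, and the shares stay close to even as more nodes are added.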
ScyllaDB achieves high availability via a setting known as the "Replication Factor". The Replication Factor determines how many copies of each piece of data are kept in the cluster, counting the copy on the node which owns the token for that data, as we have seen. In the diagram, we can see the client sending a write operation to a cluster composed of 5 nodes. As the replication factor is set to 3, the data will be written to 3 different nodes in the cluster. A replication factor of 3 is what we recommend for production purposes.
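The 5-node, RF 3 example can be sketched as a walk around the ring. This assumes a simplified placement where replicas are the owning node plus the next nodes clockwise, and it ignores Vnodes, racks and datacenters; `replicas_for` is a hypothetical helper used only for illustration.

```python
def replicas_for(owner_index: int, num_nodes: int, rf: int) -> list:
    """Return the node indexes holding a copy, starting at the token owner."""
    if rf > num_nodes:
        raise ValueError("replication factor cannot exceed the number of nodes")
    # Walk clockwise from the owner, wrapping around the end of the ring.
    return [(owner_index + i) % num_nodes for i in range(rf)]


if __name__ == "__main__":
    # A write whose token is owned by node 3 in a 5-node cluster with RF 3.
    print(replicas_for(owner_index=3, num_nodes=5, rf=3))  # [3, 4, 0]
```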
To understand why a replication factor of 3 is recommended, let's go through a simple exercise. If one node in the cluster fails, a majority of the replicas (2 out of 3) for every piece of data is still up and running. The application can therefore keep reading and writing with a consistency level of Quorum, which guarantees consistent reads and writes. At this point, you have probably realized that there is something going on with the number 3. In case you are wondering: Yes, a ScyllaDB cluster of 3 nodes is the minimum we recommend, as it ensures data consistency as well as high availability.
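The arithmetic behind this exercise is short enough to write down. A quorum is a strict majority of the replicas, so the number of replica failures a Quorum operation can survive follows directly from the replication factor. The function names below are illustrative only.

```python
def quorum(rf: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return rf // 2 + 1


def tolerated_failures(rf: int) -> int:
    """Replicas that can be down while Quorum operations still succeed."""
    return rf - quorum(rf)


if __name__ == "__main__":
    for rf in (1, 2, 3, 5):
        print(f"RF={rf}: quorum={quorum(rf)}, tolerates {tolerated_failures(rf)} down")
```

Note that RF 2 tolerates no failures at Quorum, which is why 3 is the smallest replication factor that keeps Quorum reads and writes available when one node is lost.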
One fundamental aspect of ScyllaDB that you should be aware of is its shard-per-core architecture. Remember the Vnodes, the coloured dots per node, that we saw earlier? Those dots are essentially how ScyllaDB automagically shards your data across nodes.
However, ScyllaDB takes sharding one step further: The data is not only sharded by nodes, as most regular databases do, but also by CPU cores! This means that every Vnode is broken down into shards, each of which maps to one of your CPU cores. Speaking of CPU cores, we estimate that a single physical core is able to deliver 12.5K operations per second considering a payload of 1 KB. Of course, this number will vary depending on a variety of factors, such as your payload size.
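The per-core estimate gives a quick way to size CPU. The sketch below treats 12.5K ops/s per physical core as a flat figure for 1 KB payloads and ignores replication and workload mix; the `growth_factor` parameter is an assumption added here to leave headroom.

```python
import math

OPS_PER_CORE = 12_500  # estimate from the text, for a 1 KB payload


def cores_needed(target_ops: float, growth_factor: float = 1.0) -> int:
    """Physical cores needed cluster-wide for a target ops/s, rounded up."""
    return math.ceil(target_ops * growth_factor / OPS_PER_CORE)


if __name__ == "__main__":
    print(cores_needed(100_000))                     # 8
    print(cores_needed(100_000, growth_factor=1.5))  # 12
```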
A general guideline when creating your cluster is to ensure a well-balanced configuration. Let's take a look at each of the main infrastructure components for your cluster: Networking, CPUs, RAM and Storage:
- A fast and low-latency Network link is required both for the application's traffic and for cluster-internal communication, such as data replication. We recommend a network of 10 Gbps or more, which should be commonly available nowadays.
- ScyllaDB's shard-per-core architecture allows for linear scalability of your nodes. Although there are no hard requirements on the number of cores for your cluster, remember to consider the expected number of operations per second, and to factor in growth.
- Memory plays a crucial role in your cluster performance. ScyllaDB implements its own cache subsystem to cache both your writes and reads. The more memory you have, the larger your cache is going to be, and the fewer round trips to disk are needed, which lowers latencies.
- Because ScyllaDB is a low-latency database, we highly recommend SSDs and local disks. Two aspects are important when selecting the ideal amount of Storage:
  - First, ensure that you maintain a reasonable ratio of memory to storage. A 1:30 ratio is a good starting point, but we definitely do NOT recommend going past a 1:100 ratio. For example, don't try to store 1 TB on a server running 1 GB of memory, as that would be a ratio of 1:1000.
  - Second, remember to leave free storage space for internal database operations, such as compaction, backups and repairs, and for regular application growth.
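The memory-to-storage guidance can be checked with a couple of lines. This sketch uses decimal units (1 TB = 1000 GB) and the 1:30 and 1:100 thresholds from the text; the function names and the wording of the verdicts are illustrative.

```python
def storage_per_memory(storage_gb: float, memory_gb: float) -> float:
    """How many GB of storage sit behind each GB of memory (the N in 1:N)."""
    return storage_gb / memory_gb


def ratio_verdict(storage_gb: float, memory_gb: float) -> str:
    """Classify a node against the 1:30 and 1:100 guidance."""
    ratio = storage_per_memory(storage_gb, memory_gb)
    if ratio <= 30:
        return "within the 1:30 starting point"
    if ratio <= 100:
        return "acceptable, but approaching the 1:100 limit"
    return "not recommended (past 1:100)"


if __name__ == "__main__":
    # The counter-example from the text: 1 TB of data on 1 GB of memory.
    print(storage_per_memory(1000, 1), ratio_verdict(1000, 1))
```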
Here's a quick formula in case you are in doubt about how much Storage you need per node: take the replication factor, divide it by the number of nodes, and multiply the result by the dataset size. The result tells us approximately how much data each node will own. For example:
If we consider our recommended RF of 3;
And a cluster of 6 nodes;
And an unreplicated dataset of 9 TB;
Then every node in our cluster is going to own approximately 4.5 TB (3 / 6 × 9 TB).
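The same calculation as a small function, which also makes it easy to see how adding nodes reduces the data owned per node. The function name is illustrative, and the result assumes data is spread evenly across the cluster.

```python
def data_per_node_tb(rf: int, nodes: int, dataset_tb: float) -> float:
    """Approximate data owned per node: (RF / nodes) * unreplicated dataset."""
    return rf / nodes * dataset_tb


if __name__ == "__main__":
    print(data_per_node_tb(rf=3, nodes=6, dataset_tb=9))   # 4.5
    print(data_per_node_tb(rf=3, nodes=12, dataset_tb=9))  # 2.25
```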
Another formula which gives the same result goes like this: We multiply the dataset size by the replication factor, and divide it by the number of nodes (9 TB × 3 / 6 = 4.5 TB). Starting from this figure, if we factor in growth and internal database operations, we could start with nodes with around 8 or 9 TB of Storage each. Taking the memory-to-storage ratio into consideration, dividing the resulting 4.5 TB by 100 (the 1:100 limit) gives a minimum of 45 GB of RAM, so 64 GB per node is more than enough to stay within that limit.
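Putting the pieces of this example together, here is a sketch of the per-node sizing arithmetic. It assumes decimal units (1 TB = 1000 GB), treats the 1:100 ratio as the memory floor, and uses a 50% disk utilization target as a rule of thumb; all function names are illustrative.

```python
def data_per_node_tb(dataset_tb: float, rf: int, nodes: int) -> float:
    """Second form of the formula: dataset * RF / nodes."""
    return dataset_tb * rf / nodes


def min_ram_gb(data_tb: float, max_ratio: int = 100) -> float:
    """Least memory that keeps memory:storage within 1:max_ratio."""
    return data_tb * 1000 / max_ratio


def disk_size_tb(data_tb: float, target_utilization: float = 0.5) -> float:
    """Disk size leaving room for growth and internal operations."""
    return data_tb / target_utilization


if __name__ == "__main__":
    owned = data_per_node_tb(dataset_tb=9, rf=3, nodes=6)
    print(owned)                # 4.5 TB of data per node
    print(min_ram_gb(owned))    # 45.0 GB of RAM at the 1:100 limit
    print(disk_size_tb(owned))  # 9.0 TB of disk at 50% utilization
```

Using the 1:30 starting point instead (`max_ratio=30`) yields a considerably larger memory figure, which is the direction to lean when latency matters more than cost.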
Here's a wrap up of everything we have seen:
- At a minimum, a ScyllaDB cluster should contain 3 nodes
- A replication factor of 3 provides a great balance between high availability and consistency
- A single physical core can deliver approximately 12.5K ops/s, considering a payload of 1 KB
- Remember to keep a well-balanced configuration:
  - Be mindful of your memory:storage ratios
  - Remember that more memory essentially means a larger cache, and fewer round trips to disk
- When selecting Storage, prefer locally attached, low-latency disks:
  - Use the provided formulas to estimate how much data is expected to be stored per node
  - Afterwards, remember to account for growth and save room for database internal operations
  - As a rule of thumb, targeting approximately 50% utilization should be a good starting point if you are unsure
We hope these instructions have been helpful to you, and in case you have any questions, please contact us! Thank you!