ScyllaDB 5-Minute Demo

This narrated demo shows ScyllaDB's linear scalability and low latency. It demonstrates real-time data ingestion and processing, maintaining single-digit millisecond P99 latencies even as capacity doubles and traffic increases to 1M OPS.

***

There are many ways we can think about database performance. How much infrastructure is needed? How much will it cost me? How much throughput can it sustain? And under what latencies?

The following ScyllaDB Monitoring screen shows a small, 3-node cluster sustaining more than 500K OPS, with single-digit millisecond P99 latencies and microsecond average latencies.

What's even more impressive is that the cluster consumes only 70% of its total processing capacity with more to spare. Every CPU handles over 5.5K OPS.

Another angle on database performance is how fast we can scale to handle additional load, which is very important for seasonal and unpredictable periods of traffic. With ScyllaDB, scaling your cluster capacity is as simple as adding more nodes in parallel, just as we've done here with a single click of a button. ScyllaDB automatically rebalances the cluster in the background until the load is evenly distributed. More importantly, this example takes less than 10 minutes to complete. We can also observe that the new nodes started to serve part of the traffic even before the scaling operation completed. In other words, if you have a surge in traffic, adding new nodes helps spread the load right away, while the cluster converges to an even balance.

Also note the system load. It went down from 70% to 50% utilization. What about latencies? Latencies were kept predictably low, but fluctuated slightly. This fluctuation is easily explained by looking at the cache section of ScyllaDB Monitoring. New replicas joined the cluster with a cold cache, so requests to them took longer. To prevent latency spikes, ScyllaDB's heat-weighted load balancing routes requests to primary replicas with a warm cache, so that only a small fraction of traffic hits the cold replicas.
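To make the idea of heat-weighted routing concrete, here is a toy model in Python. It is only a sketch under stated assumptions and is not ScyllaDB's actual algorithm: the function `routing_weights`, the `floor` parameter, the replica names, and the cache hit ratios are all hypothetical. Each replica receives a share of requests proportional to its cache hit ratio, with a small floor so that a cold replica still gets enough traffic to warm up.

```python
def routing_weights(hit_ratios: dict[str, float], floor: float = 0.05) -> dict[str, float]:
    """Toy model: split traffic in proportion to each replica's cache hit ratio.

    `floor` (an assumption of this sketch) keeps a trickle of requests flowing
    to cold replicas so that their caches can warm up over time.
    """
    raw = {replica: max(ratio, floor) for replica, ratio in hit_ratios.items()}
    total = sum(raw.values())
    return {replica: weight / total for replica, weight in raw.items()}


# Hypothetical hit ratios: two warm replicas and one that just joined.
replicas = {"warm-a": 0.95, "warm-b": 0.95, "cold-new": 0.0}
weights = routing_weights(replicas)

for name, weight in weights.items():
    print(f"{name}: {weight:.1%} of requests")
```

In this model the cold replica receives only a few percent of the requests at first, and its share would grow as its hit ratio rises.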

Switching over to the Operating System metrics, we observe that during the scaling operation network throughput reached close to 1.5 GB/s. That's almost 12 Gbit/s, which is really close to the maximum throughput our NICs can sustain.
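The conversion between the two units is simply a factor of 8 bits per byte. A minimal sketch of the arithmetic (the function name is illustrative only):

```python
def gigabytes_to_gigabits_per_second(gb_per_s: float) -> float:
    """Convert a throughput in gigabytes per second to gigabits per second."""
    return gb_per_s * 8  # 8 bits per byte


observed_gb_per_s = 1.5  # the approximate peak read from the dashboard
observed_gbit_per_s = gigabytes_to_gigabits_per_second(observed_gb_per_s)
print(f"{observed_gb_per_s} GB/s = {observed_gbit_per_s} Gbit/s")
```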

Alright, so we scaled the cluster and observed how the system reacted with additional capacity. It's now time for us to scale traffic. With only 6 nodes, let's reach 1M OPS.

Look how the system behaves during this scaling exercise. Within the cluster view, we can observe 1M OPS with latencies within 3 milliseconds per request – the same latency numbers as before the scale-out operation.
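One quick way to check the linear scaling claim is to compare throughput per node before and after the scale-out. The sketch below uses the rounded figures from the demo (about 500K OPS on 3 nodes, 1M OPS on 6 nodes); the function name is illustrative, and the real dashboard numbers are approximate.

```python
import math


def ops_per_node(total_ops: int, nodes: int) -> float:
    """Average operations per second handled by each node."""
    return total_ops / nodes


before = ops_per_node(500_000, 3)    # 3-node cluster at roughly 500K OPS
after = ops_per_node(1_000_000, 6)   # 6-node cluster at 1M OPS

print(f"before: {before:,.0f} OPS per node")
print(f"after:  {after:,.0f} OPS per node")
print("linear scaling:", math.isclose(before, after))
```

With twice the nodes serving twice the traffic, each node carries the same load as before.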

Let's now run another workload in parallel to see how our 6-node cluster behaves. Before we do, we define another service level for it. A service level is a logical abstraction that lets you assign different priorities to different workloads. In this example, our 1M OPS workload is scheduled within the real-time service level, and we start a secondary workload with one-tenth the priority of our real-time workload. After some time, we can see in the Monitoring that the yellow line for the secondary workload shows higher latencies than our real-time workload.

And now let's take a look at the advanced dashboard and see how the CPU utilization compares across those two workloads. What we can see is that ScyllaDB dedicates 10X more CPU time to our real-time workload than to our secondary workload.
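The 10-to-1 split is what you would expect from proportional-share scheduling, where each service level receives CPU time in proportion to its assigned weight. The following Python sketch models that idea only; the share values (1000 and 100), the service level names, and the function `cpu_fractions` are assumptions for illustration and do not represent ScyllaDB's scheduler or its configuration.

```python
import math


def cpu_fractions(shares: dict[str, int]) -> dict[str, float]:
    """Split CPU time between service levels in proportion to their shares."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


# Hypothetical share values chosen to give a 10:1 priority ratio.
shares = {"realtime": 1000, "secondary": 100}
fractions = cpu_fractions(shares)
ratio = fractions["realtime"] / fractions["secondary"]

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%} of contended CPU time")
print("ratio is about 10:1:", math.isclose(ratio, 10.0))
```

A split like this only matters while both workloads compete for the CPU. When the real-time workload is idle, a proportional-share scheduler can typically let the secondary workload use the spare capacity.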

So we finally get to the last performance angle for today, which is: How fast can we scale down our database to cut infrastructure spend and avoid being over-provisioned? We ramp down traffic and, right after, start to downscale our infrastructure back to where it was when we started. We remove nodes in parallel, and the reverse process begins. Nodes leaving the cluster stream their data to the remaining nodes, and traffic gets automatically redirected with minimal impact on your running operations. This entire process takes less than 20 minutes. And basically, folks, this is how fast ScyllaDB is.