ScyllaDB vs Cassandra: 5 Minute Demo
This demo explores how ScyllaDB differs from Apache Cassandra in performance, elasticity, and capabilities such as workload prioritization. It shows how ScyllaDB shards data per CPU core, avoids garbage collection pauses, scales in parallel, and derisks topology changes—allowing it to handle 1M OPS with predictable low latencies and without constant tuning and babysitting. Learn more: https://www.scylladb.com/compare/scylladb-vs-apache-cassandra/
***
So people often refer to ScyllaDB simply as “a faster Apache Cassandra.” We are indeed faster, but today I will show you that this description sells ScyllaDB short.
So in the ScyllaDB Monitoring, we have a 3-node cluster doing 500K OPS while sustaining single-digit millisecond latencies. In fact, I challenge you to achieve similar levels of performance with Apache Cassandra.
So unlike Apache Cassandra, ScyllaDB shards your data per CPU core, fully maximizing the available hardware resources. And latencies are sustained, given that our architecture doesn't suffer from “stop the world” garbage collection pauses.
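The shard-per-core idea can be sketched in a few lines. The function below is a simplified, hypothetical version of a static token-to-shard mapping (it scales the 64-bit partition token into the shard range; ScyllaDB's actual algorithm also accounts for configuration details such as ignored most-significant bits):

```python
def shard_of(token: int, nr_shards: int) -> int:
    # Simplified sketch: map the signed 64-bit Murmur3 token into
    # the unsigned range, then scale it down to a shard index by
    # taking the high 64 bits of (token * nr_shards).
    unsigned = (token + 2**63) % 2**64
    return (unsigned * nr_shards) >> 64

# Every token lands on exactly one CPU core's shard, so a request can
# be routed straight to the core that owns the data.
print(shard_of(0, 8))  # a token in the middle of the ring -> shard 4
```

Because the mapping is a pure function of the token, a shard-aware client driver can compute it too and send each request directly to the owning core, with no cross-core handoff.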
So let's now run a scaling exercise and double our cluster size. With ScyllaDB, you add nodes in parallel. With Apache Cassandra, you add nodes one by one, and you still have to run maintenance tasks such as cleanup after you are done. As a result, doubling a ScyllaDB cluster takes just a few minutes, while doubling an Apache Cassandra cluster can take hours or even days. ScyllaDB is therefore not just faster than Cassandra, but also easier to manage, and it requires less infrastructure to get the job done – directly saving your organization's budget.
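A back-of-envelope comparison makes the difference concrete. The numbers below are illustrative assumptions (2 hours to bootstrap one node, 3 hours of post-join cleanup); real durations depend entirely on data volume, hardware, and network:

```python
def sequential_add(n_new: int, bootstrap_h: float, cleanup_h: float) -> float:
    # Cassandra-style: new nodes join one at a time, and cleanup must
    # run on the pre-existing nodes after the topology change.
    return n_new * bootstrap_h + cleanup_h

def parallel_add(bootstrap_h: float) -> float:
    # ScyllaDB-style: all new nodes stream concurrently,
    # with no separate cleanup step to schedule.
    return bootstrap_h

# Doubling a 3-node cluster under these assumed numbers:
print(sequential_add(3, 2.0, 3.0))  # 9.0 hours, one node at a time
print(parallel_add(2.0))            # 2.0 hours, all nodes in parallel
```

The gap widens with cluster size: the sequential path grows linearly with the number of nodes added, while the parallel path stays roughly flat.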
So what we see next is that even with lots of data movement going on, our latencies are barely affected. And here's another difference from Apache Cassandra: ScyllaDB implements its own cache. This means you don't have to spend time tuning your key cache or your row cache, or deciding whether your memtable is stored on heap, off heap, and so on. ScyllaDB also knows how to keep your latencies under control, thanks to heat-weighted load balancing. Finally, ScyllaDB streams data as fast as possible, nearly saturating the available network throughput.
Now that we are finished with scaling, let's actually get to 1M OPS. Can you do that with just 6 nodes with Apache Cassandra? We scale our workload up, and after a few iterations, that's it: we reach 1M OPS, still sustaining single-digit millisecond latencies.
Now, let's run a secondary workload. With ScyllaDB, you can define different priorities on a per-workload basis. We call that “workload prioritization.” Here, we give a real-time workload 10X higher priority for system resources than the secondary workload. After we fire that up, we can see that the secondary workload's latencies are higher than the real-time workload's, while the real-time workload's latencies are barely affected. Recall that the ratio is 1:10. If we then go to the Advanced dashboard, we can observe that ScyllaDB indeed dedicates 10X more CPU time to our real-time workload than to the secondary one – just as we expected and just as we configured upfront.
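The 10:1 split shown on the dashboard is a proportional-share calculation. The sketch below models it in a few lines (the workload names and share values are illustrative, not taken from the demo's actual configuration):

```python
def split_cpu_time(shares: dict, total_ms: float) -> dict:
    # Proportional-share scheduling: each workload receives CPU time
    # in proportion to its configured shares. A simplified model of
    # the mechanism behind workload prioritization.
    total_shares = sum(shares.values())
    return {name: total_ms * s / total_shares for name, s in shares.items()}

# A 10:1 ratio over 1100 ms of CPU time (hypothetical numbers):
print(split_cpu_time({"realtime": 1000, "secondary": 100}, 1100.0))
# realtime gets 1000 ms, secondary gets 100 ms
```

Crucially, the split only matters under contention: when the real-time workload is idle, the secondary workload can use the spare CPU time.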
So now that we understand how workload prioritization works, let's scale the cluster back in. Maybe we are outside of peak hours and want to save on infrastructure. With a single click, we bring our application back to 500K OPS. Then we scale in the cluster. Unlike Apache Cassandra, ScyllaDB uses Raft internally to manage topology state, and it handles the entire decommission process for you out of the box.
So what does this mean? At the end of the day, topology changes in ScyllaDB are not only faster, but also easier to manage and more reliable, and they require far less babysitting than Apache Cassandra does today.
So before we wrap up, let me ask you one more time: Can you do this with Apache Cassandra?