How Expedia Alleviated Cassandra’s Tuning Burden and Reduced Latencies
“We no longer have to worry about ‘stop-the-world’ garbage collection pauses. Also, we are able to store more data per node and achieve more throughput per node, thereby saving significant dollars for the company.”
- Singaram Ragunathan, Cloud Data Architect at Expedia Group
About Expedia
Expedia is one of the world’s leading full-service online travel brands helping travelers easily plan and book their whole trip with a wide selection of vacation packages, flights, hotels, vacation rentals, rental cars, cruises, activities, attractions, and services.
Expedia’s Database Use Case
The application provides information about geographical entities and the relationships between them. It aggregates data from multiple systems, such as hotel location information and third-party data. This rich geography dataset supports different types of data searches through a simple REST API, with a target of single-digit-millisecond p99 read response time.
Expedia’s Cassandra Challenge: Burdensome Performance Tuning
The team was using a multilayered cache with Redis as the first layer and Cassandra as the second layer, but they grew increasingly frustrated with Cassandra’s technical challenges. Managing garbage collection and keeping it appropriately tuned was a significant burden on the team. In addition, burst traffic and workload peaks degraded the p99 response time, so the team had to keep buffer nodes on hand for peak capacity, which drove up infrastructure costs.
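To make the two-layer read path concrete, here is a minimal sketch of a read-through lookup, assuming a fast cache in front of a slower backing store. It is not Expedia’s code: the class name TwoTierReader, the keys, and the sample data are hypothetical, and plain Python dictionaries stand in for Redis and Cassandra so the example runs on its own.

```python
from typing import Callable, Dict, Optional


class TwoTierReader:
    """Read-through lookup: check the fast cache first, then the database."""

    def __init__(self, cache: Dict[str, str],
                 db_lookup: Callable[[str], Optional[str]]) -> None:
        self.cache = cache          # stand-in for the first layer (Redis in the article)
        self.db_lookup = db_lookup  # stand-in for the second layer (Cassandra, later ScyllaDB)
        self.cache_hits = 0
        self.db_reads = 0

    def get(self, key: str) -> Optional[str]:
        value = self.cache.get(key)
        if value is not None:
            self.cache_hits += 1
            return value
        # Cache miss: this read pays the database's latency, so the
        # database's tail latency shows up in the API's p99.
        self.db_reads += 1
        value = self.db_lookup(key)
        if value is not None:
            self.cache[key] = value  # populate the cache for later reads
        return value


# Hypothetical geography records, invented for illustration only.
database = {"hotel:123": "Seattle, WA", "hotel:456": "Lisbon, PT"}
reader = TwoTierReader(cache={}, db_lookup=database.get)

first = reader.get("hotel:123")    # cache miss, served by the database
second = reader.get("hotel:123")   # served by the cache
missing = reader.get("hotel:999")  # unknown key, both layers miss
```

Every cache miss falls through to the second layer, which is why pauses or spikes in that layer surface directly in the API’s tail latency even when most reads are served from the cache.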
Singaram Ragunathan, Cloud Data Architect at Expedia Group, explained, “Apache Cassandra, written in Java, brings in the onus of managing garbage collection and making sure it is appropriately tuned for the workload at hand. It takes a significant amount of time and effort, as well as expertise required to handle and tune the GC pause for every specific use case.”
Moreover, Expedia’s burst traffic was leading to overprovisioning with Cassandra. “With burst traffic or a sudden peak in the workload there was significant disturbance to the p99 response time. So we ended up having buffer nodes to handle this peak capacity, which resulted in more infrastructure costs.”
These challenges led them to consider an alternate solution: ScyllaDB.
ScyllaDB is the #1 Apache Cassandra alternative. ScyllaDB provides the same CQL interface and queries, the same drivers, even the same on-disk SSTable format – but with a modern architecture designed to eliminate Cassandra performance issues, limitations, and operational barriers. ScyllaDB is built from the ground up in C++. No Java overhead. No garbage collection. And performance tuning? It’s automated.
Migrating from Cassandra to ScyllaDB
The team migrated from Cassandra to ScyllaDB without modifying their data model or application drivers. With “just a few tweaks” to their automation framework that provisioned Apache Cassandra clusters, they were able to provision ScyllaDB clusters.
Ragunathan explained, “From an Apache Cassandra code base, it’s frictionless for developers to switch over to ScyllaDB.”
Expedia’s Cassandra Migration Results
With Cassandra, p99 read latency was spiky, varying from 20 to 80 ms over the course of a day. With ScyllaDB, it is consistently around 5 ms. ScyllaDB throughput is close to 3x Cassandra’s. Moreover, ScyllaDB is providing 30% infrastructure cost savings.
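For readers less familiar with the p99 metric, the sketch below shows one common way to compute it (the nearest-rank method) and why a handful of slow reads is enough to move it. The latency samples are invented for illustration and are not Expedia’s measurements; the percentile helper is a hypothetical function, not part of any database tooling.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical read latencies in milliseconds.
steady = [4.0] * 100               # every read is fast
spiky = [4.0] * 97 + [60.0] * 3    # 3 of 100 reads hit a pause or spike

p99_steady = percentile(steady, 99)
p99_spiky = percentile(spiky, 99)
p50_spiky = percentile(spiky, 50)
```

In this made-up sample the median of the spiky workload stays at the fast value while its p99 jumps to the slow one, which is why occasional pauses matter so much for a service with a single-digit-millisecond p99 target.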