Introducing ScyllaDB X Cloud: A (Mostly) Technical Overview

Tzach Livyatan

ScyllaDB X Cloud just landed! It’s a truly elastic database that supports variable/unpredictable workloads with consistent low latency, plus low costs.

The ScyllaDB team is excited to announce ScyllaDB X Cloud, the next generation of our fully managed database-as-a-service. It features architectural enhancements for greater flexibility and lower cost, so it can absorb variable or unpredictable workloads while keeping latency consistently low.

A few spoilers before we get into the details:

You can now scale out and scale in almost instantly to match actual usage, hour by hour. For example, you can scale all the way from 100K OPS to 2M OPS in just minutes, with consistent single-digit millisecond P99 latency. This means you don’t need to overprovision for the worst-case scenario or suffer latency hits while waiting for autoscaling to fully kick in.
You can now safely run at 90% storage utilization, compared to the standard 70% utilization. This means you need fewer underlying servers and have substantially less infrastructure to pay for.
Optimizations like file-based streaming and dictionary-based compression also speed up scaling and reduce network costs.

Beyond the technical changes, there’s also an important pricing update. To go along with all this database flexibility, we’re now offering a “Flex Credit” pricing model. Basically, this gives you the flexibility of on-demand pricing with the cost advantage that comes from an annual commitment.

Access ScyllaDB X Cloud Now

If you want to get started right away, just go to ScyllaDB Cloud and choose the X Cloud cluster type when you create a cluster. This is our code name for the new type of cluster that enables greater elasticity, higher storage utilization, and automatic scaling. Note that X Cloud clusters are available from the ScyllaDB Cloud application (below) and API. They’re available on AWS and GCP, running on a ScyllaDB account or your company’s account with the Bring Your Own Account (BYOA) model.

Sneak peek: In the next release, you won’t need to choose instance size or number of servers if you select the X Cloud option. Instead, you will be able to define a serverless scaling policy and let X Cloud scale the cluster as required.

If you want to learn more, keep reading. In this blog post, we’ll cover what’s behind the technical changes and also talk a little about the new pricing option. But first, let’s start with the why.

Backstory

Why did we do this? Consider this example from a marketing/AdTech platform that provides event-based targeting.

Such a pattern, with predictable/cyclical daily peaks and low baseline off-hours, is quite common across retail platforms, food delivery services, and other applications aligned with customer work hours.

In this case, the peak loads are 3x the base and require 2-3x the resources. With ScyllaDB X Cloud, they can provision for the baseline and quickly scale in/out as needed to serve the peaks. They get the steady low latency they need without having to overprovision – paying for peak capacity 24/7 when it’s really only needed for 4 hours a day.

Tablets + just-in-time autoscaling

If you follow ScyllaDB, you know that tablets aren’t new. We introduced them last year for ScyllaDB Enterprise (self-managed on the cloud or on-prem). Avi Kivity, our CTO, already provided a look at why and how we implemented tablets. And you can see tablets in action here:

With tablets, data gets distributed by splitting tables into smaller logical pieces (“tablets”), which are dynamically balanced across the cluster using the Raft consensus protocol. This enables you to scale your databases as rapidly as you can scale your infrastructure.
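To make the idea concrete, here is a toy model of tablet placement. It is only a sketch under simplifying assumptions: the node names, the tablet count, and the greedy "take from the most loaded node" rule are all hypothetical, and it does not model Raft coordination or ScyllaDB's actual load balancer. What it does show is why small units of data make scale-out cheap: when a fourth node joins, only the tablets needed to fill that node have to move.

```python
from collections import Counter


def assign_round_robin(num_tablets: int, nodes: list[str]) -> dict[int, str]:
    """Initial placement: tablet i goes to node i mod len(nodes)."""
    return {t: nodes[t % len(nodes)] for t in range(num_tablets)}


def add_node(placement: dict[int, str], new_node: str) -> list[int]:
    """Move tablets from the most loaded nodes to new_node until it holds its fair share.

    Returns the ids of the tablets that moved.
    """
    load = Counter(placement.values())
    load[new_node] = 0
    target = len(placement) // len(load)  # fair share for the new node
    moved = []
    while load[new_node] < target:
        # Pick the currently most loaded existing node as the donor.
        donor = max((n for n in load if n != new_node), key=lambda n: load[n])
        tablet = next(t for t, n in placement.items() if n == donor)
        placement[tablet] = new_node
        load[donor] -= 1
        load[new_node] += 1
        moved.append(tablet)
    return moved


placement = assign_round_robin(12, ["node-a", "node-b", "node-c"])
moved = add_node(placement, "node-d")
print(f"moved {len(moved)} of {len(placement)} tablets")
print(dict(Counter(placement.values())))
```

In this toy run, a quarter of the tablets move and every node ends up with the same share. The other three quarters of the data stay where they are.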

In a self-managed ScyllaDB deployment, tablets make it much faster and simpler to expand and reduce your database capacity. However, you still need to plan ahead for expansion and initiate the operations yourself.

ScyllaDB X Cloud lets you take full advantage of tablets’ elasticity. Scaling can be triggered automatically based on storage capacity (more on this below) or based on your knowledge of expected usage patterns. Moreover, as capacity expands and contracts, we’ll automatically optimize both node count and utilization. You don’t even have to choose node size; ScyllaDB X Cloud’s storage-utilization target does that for you. This should simplify admin and also save costs.

90% storage utilization

ScyllaDB has always handled running at 100% compute utilization well by having automated internal schedulers manage compactions, repairs, and lower-priority tasks in a way that prioritizes performance. Now, it also does two things that let you increase the maximum storage utilization to 90%:

Since tablets can move data to new nodes so much faster, ScyllaDB X Cloud can defer scaling until the very last minute
Support for mixed instance sizes allows ScyllaDB X Cloud to allocate minimal additional resources to keep the usage close to 90%

Previously, we recommended adding nodes at 70% capacity. This was because node additions were unpredictable and slow — sometimes taking hours or days — and you risked running out of space. We’d send a soft alert at 50% and automatically add nodes at 70%. However, those big nodes often sat underutilized.

With ScyllaDB X Cloud’s tablets architecture, we can safely target 90% utilization. That’s particularly helpful for teams with storage-bound workloads.
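A quick back-of-the-envelope calculation shows what the higher target means for node count. The 100 TB data set is a hypothetical figure chosen for illustration, and the sketch assumes identical nodes and ignores everything except raw disk capacity.

```python
import math


def nodes_needed(data_tb: float, node_capacity_tb: float, max_utilization: float) -> int:
    """Smallest node count that keeps disk usage at or below max_utilization."""
    usable_per_node = node_capacity_tb * max_utilization
    return math.ceil(data_tb / usable_per_node)


DATA_TB = 100.0          # hypothetical data set size, replicas included
NODE_CAPACITY_TB = 15.0  # per-node storage (the i4i.16xlarge figure quoted in this post)

at_70 = nodes_needed(DATA_TB, NODE_CAPACITY_TB, 0.70)
at_90 = nodes_needed(DATA_TB, NODE_CAPACITY_TB, 0.90)
print(f"70% target: {at_70} nodes, 90% target: {at_90} nodes")
```

Under these assumptions, the same data fits on 8 nodes instead of 10.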

Support for mixed size clusters

A little more on the “mixed instance size” support mentioned earlier. Basically, this means that ScyllaDB X Cloud can now add the exact mix of nodes you need to meet the exact capacity you need at any given time.

Previous versions of ScyllaDB used a single instance size across all nodes in the cluster. For example, if you had a cluster with 3 i4i.16xlarge instances, increasing the capacity meant adding another i4i.16xlarge. That works, but it’s wasteful: you’re paying for a big node that you might not immediately need.

Now with ScyllaDB X Cloud (thanks to tablets and support for mixed-instance sizes), we can scale in much smaller increments. You can add tiny instances first, then replace them with larger ones if needed. That means you rarely pay for unused capacity.

For example, before, if you started with an i4i.16xlarge node that had 15 TB of storage and you hit 70% utilization, you had to launch another i4i.16xlarge — adding 15 TB at once. With ScyllaDB X Cloud, you might add two xlarge nodes (2 TB each) first. Then, if you need more storage, you add more small nodes, then eventually replace them with larger nodes. And by the way, i7i instances are now available too, and they are even more powerful.

The key is granular, just-in-time scaling: you only add what you need, when you need it.
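The difference is easy to see with the numbers from the example above. This is illustrative arithmetic only: it assumes data spreads evenly in proportion to node size and looks at storage alone.

```python
def utilization(data_tb: float, node_sizes_tb: list[float]) -> float:
    """Fraction of total cluster storage in use."""
    return data_tb / sum(node_sizes_tb)


LARGE_TB = 15.0  # i4i.16xlarge storage, as quoted above
SMALL_TB = 2.0   # xlarge storage, as quoted above

cluster = [LARGE_TB]
data_tb = 0.70 * LARGE_TB  # the point at which a node used to be added

coarse = utilization(data_tb, cluster + [LARGE_TB])
granular = utilization(data_tb, cluster + [SMALL_TB, SMALL_TB])
print(f"add one large node:  {coarse:.0%} utilized, {LARGE_TB:.0f} TB added")
print(f"add two small nodes: {granular:.0%} utilized, {2 * SMALL_TB:.0f} TB added")
```

Adding the large node drops utilization to about 35%, so most of the new disk sits idle. Adding two small nodes lands at about 55% and leaves room to add more only when the data actually grows.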

This applies in reverse, too. Before, you had to decommission a large node all at once. Now, ScyllaDB X Cloud can remove smaller nodes gradually based on the policies you set, saving compute and storage costs.

Network-focused engineering optimizations

Every gigabyte leaving a node, crossing an Availability Zone (AZ) boundary, or replicating to another region shows up on your AWS, GCP, or Azure bill. That’s why we’ve done some engineering work at different layers of ScyllaDB to shrink those bytes—and the dollars tied to them.

File-based streaming

We anticipated that mutation-based streaming would hold us back once we moved to tablets. So we shifted to a new approach: stream entire SSTable files without deserializing them into mutation fragments and re-serializing them back into SSTables on the receiving nodes. As a result, less data is streamed over the network and less CPU is consumed, especially for data models that contain small cells. Think of it as Cassandra’s zero-copy streaming, except that we keep ownership metadata with each replica.

This table shows the result:

You can read more about this in the blog Why We Changed ScyllaDB’s Data Streaming Approach.

Dictionary-based compression

We also introduced dictionary-trained Zstandard (Zstd), which is pipeline-aware. This involved building a custom RPC compressor with external dictionary support, and a mechanism that trains new dictionaries on RPC traffic, distributes them over the cluster, and performs a live switch of connections to the new dictionaries. This is done in 4 key steps:

Sample: Continuously sample RPC traffic for some time
Train: Train a 100 KiB dictionary on a 16 MiB sample
Distribute: Distribute a new dictionary via a system distributed table
Switch: Negotiate the switch separately within each connection
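The sample/train/distribute/switch steps above can be sketched with nothing but the Python standard library. Note the substitutions: this uses zlib's preset-dictionary support as a stand-in for dictionary-trained Zstd, the "training" step is just a concatenation of sampled messages (zlib has no trainer), and the message format is hypothetical. It is not ScyllaDB's RPC compressor. It only illustrates why a shared dictionary helps so much on small, repetitive messages.

```python
import json
import zlib
from typing import Optional


def make_message(i: int) -> bytes:
    """Hypothetical RPC payload: small, with field names that repeat across messages."""
    return json.dumps(
        {"keyspace": "shop", "table": "orders", "op": "write", "order_id": i, "status": "pending"}
    ).encode()


# Sample: collect some traffic.
samples = [make_message(i) for i in range(50)]

# "Train": concatenated samples stand in for a trained dictionary.
dictionary = b"".join(samples)[-32768:]  # zlib's window is at most 32 KiB


def compress(payload: bytes, zdict: Optional[bytes] = None) -> bytes:
    comp = zlib.compressobj(zdict=zdict) if zdict else zlib.compressobj()
    return comp.compress(payload) + comp.flush()


def decompress(blob: bytes, zdict: Optional[bytes] = None) -> bytes:
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()


# Distribute + switch: both ends of the connection must hold the same dictionary.
message = make_message(12345)
plain = compress(message)
with_dict = compress(message, dictionary)

assert decompress(with_dict, dictionary) == message
print(f"raw: {len(message)} B, no dictionary: {len(plain)} B, with dictionary: {len(with_dict)} B")
```

Without a dictionary, a single short message gives the compressor almost nothing to work with. With one, the repeated field names are already known to both sides, so only the parts that differ need to travel.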

On the graph below, you can see LZ4 (Cassandra’s default) leaves you at 72% of the original size. Generic Zstd cuts that to 50%. Our per-cluster Zstd dictionary takes it down to 30%, which makes the compressed output roughly 2.4x smaller than with the default Cassandra compression (72% vs. 30% of the original size).

Flex Credit

To close, let’s shift from the technical changes to a major pricing change: Flex Credit. Flex Credit is a new way to consume a ScyllaDB Cloud subscription. It can be applied to ScyllaDB Cloud as well as ScyllaDB Enterprise. Flex Credit provides the flexibility of on-demand pricing at a lower cost via an annual commitment.

In combination with X Cloud, Flex Credit can be a great tool to reduce cost. You can use Reserved pricing for a load that’s known in advance and use Flex for less predictable bursts. This saves you from paying the higher on-demand pricing for anything above the reserved part.

How might this play out in your day-to-day work? Imagine your baseline workload handles 100K OPS, but sometimes it spikes to 400K OPS. Previously, you’d have to provision (and pay for) enough capacity to sustain 400K OPS at all times. That’s inefficient and costly.

With ScyllaDB X Cloud, you reserve 100K OPS upfront. When a spike hits, we automatically call the API to spin up “flex capacity” – instantly scaling you to 400K OPS – and then tear it down when traffic subsides. You only pay for the extra capacity during the peak.

Not sure what to choose? We can help advise based on your workload specifics (contact your representative or ping us here), but here’s some quick guidance in the meantime.

Reserved Capacity: The most cost-effective option across all plans. Commit to a set number of cluster nodes or machines for a year. You lock in lower rates and guarantee capacity availability. This is ideal if your cluster size is relatively stable.
Hybrid Model: Reserved + On-Demand: Commit to a baseline reserved capacity to lock in lower rates, but if you exceed that baseline (e.g., because you have a traffic spike), you can scale with on-demand capacity at an hourly rate. This is good if your usage is mostly stable but occasionally spikes.
Hybrid Model: Reserved + Flex Credit: Commit to baseline reserved capacity for the lowest rates. For peak usage, use pre-purchased flex credit (which is discounted) instead of paying on-demand prices. Flex credit also applies to network and backup usage at standard provider rates. This is ideal if you have predictable peak periods (e.g., seasonal spikes, event-driven workload surges, etc.). You get the best of both worlds: low baseline costs and cost-efficient peak capacity.
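To compare the options side by side, here is a small cost model. Every number in it is an assumption made up for illustration: the rates are not ScyllaDB prices, a "capacity unit" is defined here as enough capacity for 100K OPS, and the four peak hours per day are borrowed from the AdTech example earlier in this post. The only thing taken from the guidance above is the ordering: reserved is cheapest per unit, flex credit is discounted, and on-demand costs the most.

```python
# Hypothetical rates, in cents per capacity unit per hour. NOT real ScyllaDB prices.
RESERVED_RATE = 100
FLEX_RATE = 130
ON_DEMAND_RATE = 160

BASELINE_UNITS = 1      # 100K OPS baseline
PEAK_UNITS = 4          # 400K OPS peak
PEAK_HOURS_PER_DAY = 4  # assumed peak duration
HOURS_PER_DAY = 24


def daily_cost(burst_rate: int) -> int:
    """Daily cost in cents: reserved baseline all day, burst capacity only during the peak."""
    baseline = BASELINE_UNITS * HOURS_PER_DAY * RESERVED_RATE
    burst = (PEAK_UNITS - BASELINE_UNITS) * PEAK_HOURS_PER_DAY * burst_rate
    return baseline + burst


# Reserving for the peak around the clock, with no elasticity at all.
reserved_for_peak = PEAK_UNITS * HOURS_PER_DAY * RESERVED_RATE
reserved_plus_on_demand = daily_cost(ON_DEMAND_RATE)
reserved_plus_flex = daily_cost(FLEX_RATE)

for label, cents in [
    ("reserved for peak, 24/7", reserved_for_peak),
    ("reserved + on-demand", reserved_plus_on_demand),
    ("reserved + flex credit", reserved_plus_flex),
]:
    print(f"{label:26s} ${cents / 100:.2f} per day")
```

With these made-up rates, both hybrid models cost less than half of provisioning for the peak, and flex credit comes out cheapest. The real break-even depends on your actual rates and on how long your peaks last.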

Recap

In summary, ScyllaDB X Cloud uses tablets to enable faster, more granular scaling with mixed-instance sizes. This lets you avoid overprovisioning and safely run at 90% storage utilization. All of this will help you respond to volatile/unpredictable demand with low latencies and low costs. Moreover, flexible pricing (on-demand, flex credit, reserved) will help you pay only for what you need, especially when you have tablets scaling your capacity up and down in response to traffic spikes. There are also some network cost optimizations through file-based streaming and improved compression.

Want to learn more? Our Co-Founder/CTO Avi Kivity will be discussing the design decisions behind ScyllaDB X Cloud’s elasticity and efficiency. Join us for the engineering deep dive on July 10.

ScyllaDB X Cloud: An Inside Look with Avi Kivity