How Yieldmo Cut Database Costs and Cloud Dependencies
“The entire process of delivering an ad occurs within 200 to 300 milliseconds. Our database lookups must complete in single-digit milliseconds. With billions of transactions daily, the database has to be fast, scalable, and reliable. If it goes down, our ad-serving infrastructure ceases to function.” – Todd Coleman, technical co-founder and chief architect at Yieldmo
Yieldmo’s online advertising business depends on processing hundreds of billions of ad requests per day with subsecond response times. The company’s services initially depended on DynamoDB, which the team valued for simplicity and stability. However, DynamoDB costs were becoming unsustainable at scale, and the team needed multicloud flexibility as Yieldmo expanded to new regions. An infrastructure choice was threatening to become a business constraint.
In a recent talk at Monster SCALE Summit, Todd Coleman, Yieldmo’s technical co-founder and chief architect, shared the technical challenges the company faced and why the team ultimately moved forward with ScyllaDB’s DynamoDB-compatible API.
Lag = Lost Business
Yieldmo is an online advertising platform that connects publishers and advertisers in real time as a page loads. Nearly every ad request triggers a database query that retrieves machine learning insights and device-identity information. These queries enable its ad servers to:
- Run effective auctions
- Help partners decide whether to bid
- Track which ads they’ve already shown to a device so advertisers can manage frequency caps and optimize ad delivery
The entire ad pipeline completes in a mere 200 to 300 milliseconds, with most of that time consumed by partners evaluating and placing bids. More specifically:
- When a user visits a website, an ad request is sent to Yieldmo.
- Yieldmo’s platform analyzes the request.
- It solicits potential ads from its partners.
- It conducts an auction to determine the winning bid.
The database lookup must happen before any calls to partners. And these lookups must complete with single-digit millisecond latencies. Coleman explained, “With billions of transactions daily, the database has to be fast, scalable and reliable. If it goes down, our ad-serving infrastructure ceases to function.”
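The ordering described above (lookup first, then partner bids under a deadline, then the auction) can be sketched as follows. This is a minimal illustration, not Yieldmo's implementation: the function names, the two budget constants, and the "degrade on timeout" policy are all assumptions made for the example, and the lookup and partner calls are stand-ins that only sleep.

```python
import asyncio

# Hypothetical budgets, chosen only to mirror the figures quoted in the text.
LOOKUP_BUDGET_S = 0.009   # "single-digit milliseconds" for the database read
PARTNER_BUDGET_S = 0.2    # most of the 200-300 ms is spent waiting on partners


async def lookup_device(device_id):
    """Stand-in for the database read (ML insights, device identity)."""
    await asyncio.sleep(0)  # a real database client call would go here
    return {"device_id": device_id, "segments": ["sports"]}


async def ask_partner(name, profile, delay_s, bid):
    """Stand-in for a bid request to one partner."""
    await asyncio.sleep(delay_s)
    return name, bid


async def handle_ad_request(device_id, partners):
    # 1. The lookup runs first and has its own small budget.
    try:
        profile = await asyncio.wait_for(lookup_device(device_id), LOOKUP_BUDGET_S)
    except asyncio.TimeoutError:
        profile = None  # one possible policy (an assumption): continue without data

    if not partners:
        return None

    # 2. Only then are partners asked for bids, in parallel, under a deadline.
    tasks = [asyncio.ensure_future(ask_partner(name, profile, delay, bid))
             for name, delay, bid in partners]
    done, pending = await asyncio.wait(tasks, timeout=PARTNER_BUDGET_S)
    for task in pending:
        task.cancel()  # bids that miss the deadline are dropped

    # 3. Auction: the highest bid among partners that answered in time wins.
    bids = [task.result() for task in done]
    return max(bids, key=lambda b: b[1]) if bids else None


partners = [("fast_partner", 0.0, 1.25), ("slow_partner", 5.0, 9.99)]
winner = asyncio.run(handle_ad_request("device-123", partners))
print(winner)
```

In this run the slow partner offers the higher bid but misses the deadline, so the fast partner wins. The point of the sketch is that any time the lookup spends comes directly out of the time left for partners to bid.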
DynamoDB Growing Pains
Yieldmo’s production infrastructure runs on AWS, so DynamoDB was a logical choice as the team built their app. DynamoDB proved simple and reliable, but two significant challenges emerged.
First, DynamoDB was becoming increasingly expensive as the business scaled. Second, the company wanted the option to run ad servers on cloud providers beyond AWS.
Coleman shared, “In some regions, for example, the US East Coast, AWS and GCP [Google Cloud Platform] data centers are close enough that latency is minimal. There, it’s no problem to hit our DynamoDB database from an ad server running in GCP. However, when we attempted to launch a GCP-based ad-serving cluster in Amsterdam while accessing DynamoDB in Dublin, the latency was far too high. We quickly realized that if we wanted true multicloud flexibility, we needed a database that could be deployed anywhere.”
DynamoDB Alternatives
Yieldmo’s team started exploring DynamoDB alternatives that would suit their extremely read-heavy database workloads. Their write operations fall into two categories:
- A continuous stream of real-time data from their partners, essential for matching Yieldmo’s data with theirs
- Batch updates driven by machine learning insights derived from their historical data
Given this balance of high-frequency reads and structured writes, they were looking for a database that could handle large-scale, low-latency access while efficiently managing concurrent updates without degradation in performance.
The team first considered staying with DynamoDB and adding a caching layer. However, they found that caching couldn’t fix the geographic latency issue, and a cache miss would be even slower than a direct read because it pays for the cache check on top of the DynamoDB round trip.
They also explored Aerospike, which offered speed and cross-cloud support. However, they learned that Aerospike’s in-memory indexing would have required a prohibitively large and expensive cluster to handle Yieldmo’s large number of small data objects. Additionally, migrating to Aerospike would have required extensive and time-consuming code changes.
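A rough back-of-the-envelope calculation shows why an in-memory index is costly when objects are small. The object count and data size below are the figures this article gives for one migrated DynamoDB table. The 64-byte index entry and the replication factor of 2 are assumptions for illustration, not numbers from the talk, so treat the result as an order-of-magnitude sketch rather than Yieldmo's actual sizing.

```python
# Figures quoted in this article for a single migrated table.
OBJECTS = 28e9        # ~28 billion objects
DATA_BYTES = 3.3e12   # ~3.3 TB

# Assumptions (not from the talk):
INDEX_BYTES_PER_RECORD = 64  # commonly cited size of one in-memory index entry
REPLICATION_FACTOR = 2       # hypothetical; every replica needs its own entry

avg_object_bytes = DATA_BYTES / OBJECTS
index_ram_bytes = OBJECTS * INDEX_BYTES_PER_RECORD * REPLICATION_FACTOR
index_to_object_ratio = INDEX_BYTES_PER_RECORD / avg_object_bytes

print(f"average object size: {avg_object_bytes:.0f} bytes")
print(f"index RAM across the cluster: {index_ram_bytes / 1e12:.2f} TB")
print(f"index entry relative to object size: {index_to_object_ratio:.0%}")
```

Under these assumptions the average object is about 118 bytes, so each index entry is more than half the size of the object it points to, and the index for this one table would need roughly 3.58 TB of RAM across the cluster.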
Then they discovered ScyllaDB, which also provided speed and cross-cloud support, but with a DynamoDB-compatible API (Alternator) and lower costs.
Coleman shared, “ScyllaDB supported cross-cloud deployments, required a manageable number of servers and offered competitive costs. Best of all, its API was DynamoDB-compatible, meaning we could migrate with minimal code changes. In fact, a single engineer implemented the necessary modifications in just a few days.”
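To illustrate what "minimal code changes" can look like with a DynamoDB-compatible API, here is a hedged sketch using the AWS SDK for Python (boto3), whose client accepts an `endpoint_url` parameter. The talk does not say which SDK or language Yieldmo uses. The helper names, the table name, the key name, and the example URL are all hypothetical.

```python
def dynamodb_client_kwargs(region, alternator_url=None):
    """Build keyword arguments for boto3.client("dynamodb", ...)."""
    kwargs = {"region_name": region}
    if alternator_url:
        # Same SDK and same calls; only the endpoint the client talks to changes.
        kwargs["endpoint_url"] = alternator_url
    return kwargs


def make_client(region, alternator_url=None):
    import boto3  # imported lazily so the helper above has no dependencies
    return boto3.client("dynamodb", **dynamodb_client_kwargs(region, alternator_url))


def get_device(client, table, device_id):
    """Identical read path for either backend (table and key names are hypothetical)."""
    response = client.get_item(TableName=table, Key={"device_id": {"S": device_id}})
    return response.get("Item")


# Hypothetical endpoint; the host and port depend on how the cluster is configured.
scylla_kwargs = dynamodb_client_kwargs("eu-west-1", "http://scylla.example.internal:8000")
aws_kwargs = dynamodb_client_kwargs("eu-west-1")
print(scylla_kwargs)
print(aws_kwargs)
```

Because `get_device` only depends on the client object, the application code that reads and writes items does not need to know which database sits behind it. A real migration would also need to handle credentials and any API features that differ between the two systems.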
ScyllaDB Evaluation, Migration and Results
To start evaluating how ScyllaDB worked in their environment, the team migrated a subset of ad servers in a single region. This meant moving multiple terabytes of data while real-time updates kept arriving. To do so, they used ScyllaDB’s Spark-based migration tool to copy historical data, paused ML batch jobs and used their Kafka architecture to replay recent writes into ScyllaDB. Moving a single DynamoDB table with ~28 billion objects (~3.3 TB) took about 10 hours.
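The general pattern behind this kind of migration (bulk-copy a snapshot, then replay the writes that arrived after it) can be sketched with plain Python dictionaries. This is a toy simulation under stated assumptions, not the Spark-based tool or Yieldmo's Kafka consumers: the function names, timestamps, and records are invented for illustration, and the talk does not detail how the cutover point was chosen.

```python
def bulk_copy(source_snapshot, target):
    """Copy every item from a point-in-time snapshot into the target."""
    for key, value in source_snapshot.items():
        target[key] = value


def replay_writes(write_log, target, since_ts):
    """Re-apply logged writes at or after since_ts, oldest first.

    Each write overwrites the whole item, so replaying a write that the
    snapshot already contains is harmless (the operation is idempotent).
    """
    applied = 0
    for ts, key, value in sorted(write_log, key=lambda write: write[0]):
        if ts >= since_ts:
            target[key] = value
            applied += 1
    return applied


# Hypothetical data: the snapshot reflects the source table at timestamp 100.
snapshot_ts = 100
source_snapshot = {"dev-1": "v1", "dev-2": "v1"}
write_log = [
    (90, "dev-1", "v1"),   # already included in the snapshot, skipped
    (105, "dev-2", "v2"),  # update that arrived during the copy
    (110, "dev-3", "v1"),  # new item that arrived during the copy
]

target = {}
bulk_copy(source_snapshot, target)
applied = replay_writes(write_log, target, since_ts=snapshot_ts)
print(applied, target)
```

After the replay, the target holds the update to `dev-2` and the new `dev-3` item even though both arrived after the snapshot was taken. Pausing batch jobs during the copy, as the team did, reduces the volume of writes that has to be replayed.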
The next step was to migrate all data across five AWS regions. This phase took about two weeks. After evaluating the performance, Yieldmo promoted ScyllaDB to primary status and eventually stopped writing to DynamoDB in most regions.
Reflecting on the migration almost a year later, Coleman summed up, “The biggest benefit is multicloud flexibility, but even without that, the migration was worthwhile. Database costs were cut roughly in half compared with DynamoDB, even with reserved-capacity pricing, and we saw modest latency improvements. ScyllaDB has proven reliable: Their team monitors our clusters, alerts us to issues and advises on scaling. Ongoing maintenance overhead is comparable to DynamoDB, but with greater independence and substantial cost savings.”
How ScyllaDB Compares to DynamoDB