Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance

In this talk I walk you through the performance tuning steps that I took to serve 1.2M JSON requests per second from a 4 vCPU c5 instance, using a simple API server written in C. At the start of the journey, the server was capable of a very respectable 224k req/s with the default configuration. Along the way I made extensive use of tools like FlameGraph and bpftrace to measure, analyze, and optimize the entire stack, from the application framework, to the network driver, all the way down to the kernel. I began this wild adventure without any prior low-level performance optimization experience, but once I started going down the performance tuning rabbit hole, there was no turning back. Fueled by curiosity, willingness to learn, and relentless persistence, I was able to boost throughput by over 400% and reduce p99 latency by almost 80%.
