Dan Podhola, Principal Engineer at Zillow Group, discusses how Zillow Group uses ScyllaDB to run real-time workloads alongside updates of 6,500 records per second.
My name is Dan Podhola. I'm a Principal Engineer at Zillow. I've been at Zillow for a bit over eight years, and I've mostly specialized in database performance and tuning.
My team is responsible for processing records about properties and listings, like "for sale" listings, and for translating different message types into a common interchange format so our teams can talk.
Our performance has been great. For our normal real-time workloads, it's been ample. We run ScyllaDB on 3 i3.4xlarge nodes. Our service actually runs on a single c5.xlarge that has 25 threads. It autoscales in EBS, so if we happen to have a small spike in the real-time workload, we'll scale up to 3 c5.xlarges, churn through that data, and we'll be done.
The real beauty of this is that even on just three nodes, we can scale up to 35 of the c5.xlarge instances and process over 6,500 records a second, plus the real-time workload. No one will even notice that we're processing the entirety of Zillow's property and listings data, maybe to correct some data issue or change a business rule. That means we can process the entirety of the data at Zillow in less than a business day. And again, no performance hit to real-time data.
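To put those figures in perspective, here is a rough back-of-envelope sketch in Python. It uses only the numbers quoted above, plus two assumptions that are not from the talk: that a "business day" means 8 hours, and that every one of the 35 instances runs the same 25 threads mentioned for the single instance. The function names are made up for illustration, and the talk does not state the actual size of the dataset, so the last figure is only an upper bound implied by the quoted rate.

```python
# Back-of-envelope arithmetic on the figures quoted in the talk.
RECORDS_PER_SECOND = 6500    # from the talk: bulk reprocessing rate
INSTANCES = 35               # from the talk: c5.xlarge instances at full scale-out
THREADS_PER_INSTANCE = 25    # from the talk for one instance; assumed here for all 35
BUSINESS_DAY_HOURS = 8       # assumption: the talk does not define "business day"


def per_instance_rate(total_rate, instances):
    """Average records per second handled by each service instance."""
    return total_rate / instances


def per_thread_rate(total_rate, instances, threads_per_instance):
    """Average records per second handled by each worker thread."""
    return total_rate / (instances * threads_per_instance)


def records_in_window(total_rate, hours):
    """Records processed at a constant rate over a window of whole hours."""
    return total_rate * hours * 3600


if __name__ == "__main__":
    print(round(per_instance_rate(RECORDS_PER_SECOND, INSTANCES), 1))
    print(round(per_thread_rate(RECORDS_PER_SECOND, INSTANCES, THREADS_PER_INSTANCE), 2))
    print(records_in_window(RECORDS_PER_SECOND, BUSINESS_DAY_HOURS))
```

Under those assumptions, each instance averages roughly 186 records a second, each thread about 7, and an 8-hour run at the quoted rate tops out at around 187 million records.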