Numberly: Learning Rust the Hard Way for Kafka + ScyllaDB in Production
Alexys Jacob is CTO of Numberly, a French digital data marketing powerhouse whose experts and systems help brands connect with their customers using all digital channels available. Alexys is well known in the developer community as “Ultrabug” for his work on a variety of open source projects, such as ScyllaDB, Python and Gentoo Linux.
Alexys has a penchant for accepting deep technical challenges to gain performance with ScyllaDB. For example, he delved into what it takes to make a shard-aware Python driver (which you can read more about here and here), gaining between 15% and 25% better throughput. He was also able to get Spark job processing with ScyllaDB down from 12 minutes to just over 1 minute.
At a recent webinar Alexys described a new performance challenge he set for himself and for Numberly: to move a key element of their code from Python to Rust, in order to accelerate a data pipeline powered by ScyllaDB and Apache Kafka for event streaming.
He began by describing the reasoning behind such a decision. It comes from Numberly’s use of event streaming and specialized data pipelining applications, which they call data processor applications. (You can read more about how Numberly uses Kafka with ScyllaDB.)
Each such data processor prepares and enriches the incoming data so that it is useful to the downstream business partner or client applications in a timely manner. Alexys emphasized, “availability and latency are business critical to us. Latency and resilience are the pillars upon which we have to build our platforms to make our business reliable in the face of our clients and partners.” Simply put, for Numberly to succeed, Kafka and ScyllaDB can’t fail.
Over the past five years, Numberly relied heavily on three of the most demanding data processors, all of which had been written in Python. Alexys noted the risks and the natural reluctance to replace them: “They were battle tested and trustworthy. We knew them by heart.”
During that same time, Alexys kept tabs on the maturation of Rust as a development language. His natural curiosity and desire to improve his skills and Numberly’s capabilities drove him to consider switching out these data processors to Rust. It wasn’t a decision to be taken lightly.
Regarding Rust, Alexys commented, “it felt less intimidating to me than C or C++ — sorry Avi.” So when this opportunity came he went to his colleagues and suggested rewriting them in Rust. The internal response was, at first, less than enthusiastic. There was no Rust expertise at Numberly. This would be Alexys’ first project with the language. There were major risks with this particular bit of code. “Okay, I must admit that I lost my CTO badge for a few seconds when I saw their faces.”
Alexys needed to justify the decision with a clear rationale, and delineated the promises Rust makes. “It’s supposed to be secure, easy to deploy, makes few or no [performance] compromises and it also plays well with Python. But furthermore their marketing motto speaks to the marketer inside me: ‘a language empowering everyone to build reliable and efficient software.’”
This resonated strongly with Alexys. “That’s me. Reliable and efficient software.” Alexys noted that ‘efficient’ software is not precisely synonymous with ‘fastest’ software. “Brett Cannon, a Python core developer, advocates that selecting a programming language for being faster on paper is a form of premature optimization.”
Alexys enumerated just a few possible meanings of “fast:”
- Fast to develop?
- Fast to maintain?
- Fast to prototype?
- Fast to process data?
- Fast to cover all failure cases?
“I agree with him in the sense that the word ‘fast’ has different meanings depending on your objectives. To me Rust can be said to be faster as a consequence of being efficient, which does not cover all the items on the list here.”
Applying these meanings to the Numberly context: no, Rust would not be faster to develop than Python, since there was a learning curve involved, whereas Alexys had over 15 years of Python experience.
Nor would it be faster to maintain, since they had not yet made Rust an operational language in their production environment.
Would it be faster to prototype? Again, no. Unlike interpreted Python, which runs immediately, Rust code has to be compiled first, and compile times add up.
Would it be faster to process data? On paper, yes. That was the key reason to adopt Rust, and it was an educated guess that it would perform faster than Python. They still needed to prove it and measure the gains. Because right now, Python had proven to be “fast enough.”
Alexys asked the not-so-rhetorical questions he was facing. “So why would I want to lose time? The short answer is innovation. Innovation cannot exist if you don’t accept to lose time. The question is to know when and on what project.” Alexys had an inner surety this project was the right one at the right time. While Rust would make him slow at first, its unique design would provide more reliable software. Stronger software. Type safe. More predictable, readable and maintainable.
Alexys quipped, “It’s still more helpful to have a compiler error that is explained very well than a random Python exception.”
Plus, not to be overlooked, Rust would provide for better dependency management. “It looks sane compared to what I’m used to in Python. Exhaustive pattern matching that brings confidence that you’re not forgetting something while you code. And when you compile error management primitives — failure handling right in the language syntax.”
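To make those two language features concrete, here is a minimal sketch using only the Rust standard library. It is not taken from Numberly's code: the `Event` and `ParseError` types, the `kind:value` input format, and the function names are all hypothetical, chosen only to illustrate exhaustive `match` and `Result`-based error handling with the `?` operator.

```rust
use std::fmt;

/// Hypothetical event kinds a data processor might receive (illustrative only).
#[derive(Debug, PartialEq)]
enum Event {
    PageView { url: String },
    Click { element_id: u32 },
    Purchase { amount_cents: u64 },
}

/// Hypothetical error type: every failure case is an explicit variant.
#[derive(Debug, PartialEq)]
enum ParseError {
    MissingField(&'static str),
    UnknownKind(String),
    BadNumber(String),
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseError::MissingField(name) => write!(f, "missing field: {}", name),
            ParseError::UnknownKind(kind) => write!(f, "unknown event kind: {}", kind),
            ParseError::BadNumber(raw) => write!(f, "not a valid number: {}", raw),
        }
    }
}

impl std::error::Error for ParseError {}

/// Parse an assumed "kind:value" text format into an `Event`.
/// Failures are returned as values, not raised as exceptions.
fn parse_event(raw: &str) -> Result<Event, ParseError> {
    // `?` returns early with the error if the separator is absent.
    let (kind, value) = raw
        .split_once(':')
        .ok_or(ParseError::MissingField("value"))?;

    match kind {
        "view" => Ok(Event::PageView { url: value.to_string() }),
        "click" => {
            let element_id = value
                .parse::<u32>()
                .map_err(|_| ParseError::BadNumber(value.to_string()))?;
            Ok(Event::Click { element_id })
        }
        "purchase" => {
            let amount_cents = value
                .parse::<u64>()
                .map_err(|_| ParseError::BadNumber(value.to_string()))?;
            Ok(Event::Purchase { amount_cents })
        }
        // Strings are open-ended, so the compiler requires a catch-all arm here.
        other => Err(ParseError::UnknownKind(other.to_string())),
    }
}

/// Exhaustive match with no wildcard arm: adding a new `Event` variant
/// makes this function fail to compile until the new case is handled.
fn describe(event: &Event) -> String {
    match event {
        Event::PageView { url } => format!("viewed {}", url),
        Event::Click { element_id } => format!("clicked element {}", element_id),
        Event::Purchase { amount_cents } => format!("spent {} cents", amount_cents),
    }
}

/// Chain the two steps; `?` propagates any `ParseError` to the caller.
fn process(raw: &str) -> Result<String, ParseError> {
    let event = parse_event(raw)?;
    Ok(describe(&event))
}
```

Calling `process("click:42")` yields `Ok("clicked element 42")`, while `process("purchase:abc")` yields `Err(ParseError::BadNumber("abc"))`. In both cases the caller has to deal with the `Result` explicitly, which is roughly the kind of compile-time confidence the quote describes.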
Rust’s bottom line for Alexys was clear: “I chose Rust because it provided me with the programming language at the right level abstractions and the right paradigms. This is what I needed to finally understand and better explain the reliability and performance of an application.”
Production is not a “Hello World”
Learning Rust the hard way meant more than just tackling semicolons and brackets. This wasn’t going to be a simple “hello world” science project. It meant dealing with high stakes and going straight into production. The data processing app written in Rust needed to be integrated into their Kubernetes orchestration mechanism and be observable via Prometheus, Grafana, and Sentry. It also needed error handling, latency optimization, integration with their Avro schema registry, a reliable bridge between Confluent Kafka and their multi-datacenter ScyllaDB deployment, and more.
Watch the Session in Full
This is just the beginning of the challenge. To see how Alexys and the team at Numberly solved it and successfully moved from Python to Rust, you can watch the full webinar on-demand below. You can also view the slides here.
Lastly, you can read the blog Alexys wrote regarding the Rust implementation on Numberly’s own blog here.
Get Started with ScyllaDB
If you’d like to learn more about using ScyllaDB in your own event streaming platform, feel free to contact us directly, or join our vibrant Slack community.
If you want to jump right on in you can take free courses in using Kafka and ScyllaDB on ScyllaDB University, and get started by downloading ScyllaDB Open Source, or creating an account on ScyllaDB Cloud.