Wallaroo makes it easy to get stateful, analytics, stream processing, and event-driven AI applications to production fast, regardless of scale. It provides APIs in several languages for developers to implement their custom business logic.
We've made Wallaroo with Python APIs available as open source under Apache 2.0. It's the only Python framework purpose-built for applications that require state management, complex workflows, bursty capacity demands, or the need to run anywhere – on-prem, at the edge, or on any cloud.
Wallaroo takes care of the infrastructure required to run distributed, Python-based applications.
It handles scaling, resilience, state management and message delivery, making it easy to scale applications with minimal code changes. This allows you to focus on data science and business logic, not plumbing and data engineering.
Our lightweight API currently supports Python. It's easy to define the sources, sinks and computations necessary for your event-driven application:
Why We Built Wallaroo
We started out - and continue to this day - supporting organizations that are looking to horizontally scale business-critical data applications.
We designed and ran high volume, low-latency, fault tolerant systems and quickly realized that the technologies available to us were insufficient. We had the right ideas, but it took too long to get them into production; it was too hard and costly to scale and operate them.
When we set out to build Wallaroo, we had a few goals in mind. We wanted to improve the state of the art for distributed data processing by providing:
better per-worker throughput
dramatically lower latencies
easier state management, and
an easier operational experience.
It became clear that the best way to advance the industry was to make our code available to the open source community through the Apache, version 2 license. Admittedly, we didn't get this right at first (we had some loose restrictions around a few aspects of our software) but we sorted it out.
What Wallaroo Does
Wallaroo is an ultrafast and scalable data processing engine that rapidly takes you from prototype to production by eliminating infrastructure complexity.
It's designed for writing distributed data processing applications, especially demanding high-throughput, low-latency tasks where the accuracy of results is important.
Straightforward API - an approachable API for building stateful and stateless data pipelines.
Elastic, ad-hoc clustering - join processes together as needed to handle your demand. No cluster manager needed.
Fault-tolerance - Wallaroo manages your state and allows you to get accurate results in the face of failure.
High-performance runtime - We've designed Wallaroo to add as little overhead as possible, giving you more room to build your application.
Built-in metrics - Wallaroo collects metrics on all your pipelines and applications. See your throughputs and latencies for easy troubleshooting and debugging.
Get started and jump right in!
We recommend you use our Docker image, which allows you to get Wallaroo up and running in only a few minutes:
Run docker pull wallaroo-labs-docker-wallaroolabs.bintray.io/release/wallaroo:latest
Refer to instructions for running example applications in Docker
How to install Wallaroo on other platforms.