The Wallaroo Platform
A highly efficient stream processing engine
+ data connectors, observability, and managed operations.
At the core of our platform is a processing engine built to meet the speed and accuracy needs of financial trading and the scale requirements of IoT. It runs as native code with performance that existing open-source tools can’t match: sub-millisecond latencies and trillions of events and petabytes of data a day, while using up to 80% fewer servers.
Combined with our data connectors, observability tools, and managed operations, we get your distributed data processing applications live easily, efficiently, and at scale.
Deploy on-premise, in any cloud, at the edge.
Predictable sub-millisecond compute latencies.
Resilient in the face of failure.
Supports effectively-once processing.
Allows for all manner of stateful operations on the data.
Built-in monitoring and auditing.
Common analytics functions.
Processes data on an event-by-event basis.
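To make event-by-event stateful processing concrete, here is a minimal sketch in plain Python. The names and structure are illustrative only — this is not the Wallaroo API — but it shows the core idea: each incoming event updates per-key state and may emit a result immediately, rather than waiting for a batch.

```python
# Illustrative sketch of event-by-event stateful processing
# (hypothetical names; not the Wallaroo API).

def running_average(state, event):
    """Update a per-key running average with a single event."""
    count, total = state
    count += 1
    total += event["value"]
    return (count, total), total / count  # new state, emitted result

def process_stream(events):
    states = {}   # per-key state, kept in memory
    outputs = []
    for event in events:  # one event at a time -- no batching
        key = event["key"]
        state = states.get(key, (0, 0.0))
        states[key], result = running_average(state, event)
        outputs.append((key, result))
    return outputs

events = [
    {"key": "sensor-1", "value": 10.0},
    {"key": "sensor-1", "value": 20.0},
    {"key": "sensor-2", "value": 5.0},
]
print(process_stream(events))
# [('sensor-1', 10.0), ('sensor-1', 15.0), ('sensor-2', 5.0)]
```

Because results are produced as each event arrives, latency is bounded by the cost of one state update rather than by a batch interval.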
Performance means you can:
Solve more problems with the same budget
Run more algorithms or more advanced algorithms
Use more data to drive innovation
Writing high-performance distributed computing applications is hard. We architected our engine from the get-go to have few bottlenecks, and we’ve obsessed over the performance implications of every feature we add, carefully testing, measuring and optimizing the impact of each one.
That means you don’t have to. Instead, you invest your effort in accelerating your business.
Ad hoc clustering
We support ad hoc clustering and autoscaling. Growing a Wallaroo cluster is as easy as adding new Wallaroo processes, and shrinking one is as easy as sending a single message. This allows straightforward integration with a variety of orchestration tools.
We don’t want to get in the way of your current workflows.
Scaling applications is difficult. Scaling stateful applications when you care about performance is a herculean task. Wallaroo can scale applications while maintaining performance by keeping horizontal scaling concerns separate from your data algorithms.
Applications can be scaled from a single process up to your natural limit of parallelization without needing to change any code. Workers can be added or removed from a cluster without requiring downtime.
Calling out to external services is expensive, and sacrifices control over the kinds of guarantees you can make around performance, correctness, and resilience. Wallaroo’s default is to manage all state in-memory, distributing partitions across the cluster via random slicing techniques. This means faster computations and stricter control over checkpointing and recovering state in the face of failure.
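Random slicing, in the general form described in the literature, cuts the unit interval into slices and assigns each slice to a worker; a key hashes to a point in [0, 1) and is routed to the worker owning the slice that contains it. The sketch below illustrates that lookup in plain Python — the slice table and worker names are hypothetical, and this is not Wallaroo’s actual implementation:

```python
import bisect
import hashlib

# Illustrative random-slicing lookup: the unit interval [0, 1) is cut into
# slices, each owned by a worker. A key hashes to a point in [0, 1) and is
# routed to the owner of the slice containing that point.
# (Hypothetical slice table; not Wallaroo's actual partitioning code.)

SLICES = [  # (exclusive upper bound of slice, owning worker)
    (0.25, "worker-a"),
    (0.50, "worker-b"),
    (0.75, "worker-c"),
    (1.00, "worker-d"),
]

def hash_to_unit(key: str) -> float:
    """Hash a key to a deterministic point in [0, 1)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def owner(key: str) -> str:
    """Find the worker whose slice contains the key's point."""
    point = hash_to_unit(key)
    bounds = [upper for upper, _ in SLICES]
    return SLICES[bisect.bisect_right(bounds, point)][1]

print(owner("sensor-1"))  # deterministic: same key, same worker every time
```

Resizing the cluster then amounts to re-cutting slices of the interval, which moves only the state living in the affected slices rather than reshuffling everything.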
It also gives you greater flexibility in the types of algorithms you can deploy - for example, using recent data events to detect an anomaly, or fusing different sources of data together before performing further analytics.
We also provide built-in aggregation, windowing, and statistical functions, allowing you to deploy stateful applications with a minimum of coding.
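As a point of reference for what a windowed aggregation computes, here is a hand-rolled tumbling-window count in plain Python. It is illustrative only — the built-in windowing and aggregation functions mentioned above would replace code like this:

```python
from collections import defaultdict

# Illustrative tumbling-window aggregation: events are bucketed into fixed,
# non-overlapping time windows, and counted per key per window.
# (Hand-rolled sketch; not the platform's built-in windowing API.)

def tumbling_counts(events, window_ms):
    """Count events per (key, window) for fixed-size tumbling windows."""
    counts = defaultdict(int)
    for ts, key in events:
        window_start = (ts // window_ms) * window_ms  # align to window boundary
        counts[(key, window_start)] += 1
    return dict(counts)

events = [(1000, "a"), (1500, "a"), (2100, "a"), (2200, "b")]
print(tumbling_counts(events, 1000))
# {('a', 1000): 2, ('a', 2000): 1, ('b', 2000): 1}
```

Sliding windows, statistical summaries, and custom aggregations follow the same shape: an incremental update per event, keyed by window boundary.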
Assuring that every input is accounted for and that you get the results you expect is difficult; doing it in the face of all the failures that can befall modern data processing applications is incredibly hard.
We built Wallaroo so our customers could stop worrying about the toughest failure recovery problems. We’ve built failure recovery into the core of Wallaroo and regularly test that it works as expected using advanced techniques like fault injection and chaos engineering.
Monitoring and Observability
Our Monitoring Hub provides visibility into the performance of the system, not only end-to-end but at a finer granularity, revealing throughput and latency distributions over time, and helping to determine where the bottlenecks are.
The Monitoring Hub UI provides throughput and latency statistics end-to-end, per-process, per-computation, per-source, and also at the level of serialization/deserialization. You can also integrate these performance metrics into your own observability tools.
We also provide customizable observability data tailored to your vertical and use case. For example, we provide support for audit streams, useful for compliance or ML explainability.
A stream processing engine isn’t much good if you can’t get data in and out of it. That’s why we built our Connectors framework - to reduce the cost of developing new data Connectors that support effectively-once processing. This means managing acking and replayability on the source side, and two-phase commit on the sink side.
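The sink-side half of that guarantee follows the standard two-phase-commit handshake: the engine asks the sink to stage a batch, and only makes it visible once staging has succeeded; on failure, the batch is discarded and the source replays. The skeleton below illustrates that protocol in plain Python — the class and method names are hypothetical, not the Connectors framework API:

```python
# Illustrative two-phase-commit sink skeleton (hypothetical names;
# not the actual Connectors framework API).

class TwoPhaseCommitSink:
    def __init__(self):
        self.prepared = {}   # txn_id -> batch staged but not yet visible
        self.committed = []  # output made durably visible, exactly once

    def prepare(self, txn_id, batch):
        # Phase 1: stage the batch; vote yes only if staging succeeded.
        self.prepared[txn_id] = list(batch)
        return True

    def commit(self, txn_id):
        # Phase 2: make the staged batch visible exactly once.
        self.committed.extend(self.prepared.pop(txn_id))

    def rollback(self, txn_id):
        # Abort: discard the staged batch; the source replays its events.
        self.prepared.pop(txn_id, None)

sink = TwoPhaseCommitSink()
if sink.prepare("txn-1", ["event-1", "event-2"]):
    sink.commit("txn-1")
print(sink.committed)
# ['event-1', 'event-2']
```

Because output only becomes visible in the commit phase, a crash between prepare and commit leaves no partial results: the transaction is rolled back and the source’s replay produces the batch again.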
By relying on our Connectors framework, we can quickly build integrations for new clients and expand the range of out-of-the-box Connectors. This is as true of our clients’ home-grown data sources and sinks as it is of popular technologies like Kafka or the high-performance JSON-parsing HTTP server we built for adtech and IoT.
We’ve added support for ML model deployment, allowing you to maintain your existing data science workflows and benefit from Wallaroo’s advantages without having to implement models from the ground up or tie yourself to a particular cloud provider.
We're also building support for vertical-specific features by partnering closely with customers. This includes purpose-built Connectors, and vertical-specific analytics libraries and feature extraction libraries for data initiatives.
Finally, we work with your data science, analytics, and engineering teams to help them get data applications live as easily and efficiently as possible.
Anyone can build data processing applications on top of the Wallaroo engine, which is open-sourced. But for those who want a higher-level API to a managed service, we’ve put the Wallaroo engine at the center of our Wallaroo platform.
We take care of the hard distributed systems and data engineering problems so you can focus on your business domain.