Wallaroo | Technology

© 2019 by Wallaroo Labs, Inc.

Launching data applications 
is a long, complex process. 
Simplify it.

Our team has unique expertise in large-scale data applications and the practical application of deep academic work in distributed computing. We used these skills to build Wallaroo, our next-generation stream processing engine that can extract & act on insights rapidly, at immense scale, and with very little infrastructure.


For many use cases, Wallaroo delivers performance and cost efficiencies that tools like Spark Streaming just can't match.

Out-of-the-Box

Distributed Computing Expertise

Wallaroo combines a highly performant, stateful compute engine with on-demand scaling. Together with our data connectors, it can easily plug into your data ecosystem. 

We built Wallaroo with a focus on performance, resiliency, and scalability. It provides APIs that allow developers to quickly implement business logic, while it takes care of:

  • Distributed state management

  • Metrics and monitoring

  • Failure recovery

  • Cluster formation

  • Application scaling

That means your prototype code is production-ready: you can scale on demand and operate with confidence. It also makes your team much more productive.
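To illustrate the division of labor described above, here is a minimal, self-contained Python sketch. The names (`RunningTotal`, `ToyEngine`, `add_transaction`) are hypothetical, not Wallaroo's actual API: the developer supplies only a state class and a computation, while an engine-like harness stands in for what Wallaroo manages.

```python
# Hypothetical sketch -- not Wallaroo's real API. The application code is
# just the state class and the computation; the "engine" below stands in
# for what Wallaroo takes care of (state routing, recovery, scaling).

class RunningTotal:
    """Per-key state object, owned and routed by the engine."""
    def __init__(self):
        self.total = 0

def add_transaction(amount, state):
    """Business logic: update engine-managed state, return a result."""
    state.total += amount
    return state.total

class ToyEngine:
    """Stands in for the engine: looks up each key's state and applies
    the developer's computation to it."""
    def __init__(self, state_factory):
        self._states = {}
        self._state_factory = state_factory

    def process(self, key, amount, computation):
        state = self._states.setdefault(key, self._state_factory())
        return computation(amount, state)

engine = ToyEngine(RunningTotal)
for key, amount in [("acct-1", 10), ("acct-2", 5), ("acct-1", 7)]:
    engine.process(key, amount, add_transaction)

# acct-1 has accumulated 10 + 7; acct-2 has accumulated 5
```

The point of the sketch is the boundary: `add_transaction` contains no code for partitioning, recovery, or cluster membership, which is what lets prototype logic run unchanged in production.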

Performance

At Wallaroo Labs, we believe that doing more with less is powerful and liberating. Performance means you can: 


  • Solve more problems with the same budget

  • Run more applications

  • Use more data to drive innovation & profitability


Writing high-performance software is not a skill that everyone has developed. The money and time we save you via improved efficiency can be invested in other valuable projects to grow your business.


We’ve developed Wallaroo to make it easier to write high-performance, distributed data processing applications in less time and for less money.


We’ve obsessed over the performance implications of every feature we add, carefully measuring and optimizing the impact of each one. That means you don’t have to.

Learn more about how Wallaroo was built for performance in our introductory whitepaper.

Industrial-Grade by Default

Performance matters, but correctness matters more. There’s not a lot of value in getting incorrect results quickly. Ensuring that every input is accounted for and that you get the results you expect is difficult; doing it in the face of all the failures that can befall modern data processing applications is incredibly hard.
 
We’ve built a lot of data processing applications throughout our careers. We almost always messed up completeness or correctness when handling failure. We’ve built Wallaroo to help developers stop worrying about the toughest failure recovery problems. We’ve built failure recovery into the core of Wallaroo and regularly test that it works as expected using advanced techniques like fault injection and chaos engineering.

On-Demand Horizontal Scaling

Scaling applications is difficult. Scaling stateful applications when you care about performance is a herculean task. At Wallaroo Labs, we believe that developers should be able to focus on their domain of expertise, their business logic, instead of having to think about distributed systems problems. We’ve designed Wallaroo with a scale-independent API that keeps horizontal scaling concerns from bleeding into application code.

With Wallaroo’s scale-independent API, you describe:

  • The flow of data through your application

  • The computations and transformations to be run

  • The state you accumulate along the way


Wallaroo applications can be scaled from a single process up to your natural limit of parallelization without changing any code. Workers can be added to or removed from a Wallaroo cluster without downtime.
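To make "scale-independent" concrete, here is a small, self-contained Python sketch (again with hypothetical names, not Wallaroo's API): the pipeline is described once as events, a computation, and keyed state, and the worker count is purely a runtime parameter. Routing by key means running the same description on one worker or four yields identical results.

```python
# Hypothetical sketch -- not Wallaroo's real API. The pipeline description
# (events -> computation -> keyed state) never mentions worker count.

from collections import defaultdict

def run_pipeline(events, compute, num_workers):
    """Partition events across workers by key, then merge the per-worker
    state. Each key is routed to exactly one worker."""
    workers = [defaultdict(int) for _ in range(num_workers)]
    for key, value in events:
        state = workers[hash(key) % num_workers]  # route by key
        state[key] = compute(state[key], value)
    merged = {}
    for worker_state in workers:
        merged.update(worker_state)  # keys never overlap across workers
    return merged

events = [("a", 1), ("b", 2), ("a", 3), ("c", 4)]
compute = lambda acc, v: acc + v  # the application's business logic

# Same description, different parallelism, same answer.
assert run_pipeline(events, compute, 1) == run_pipeline(events, compute, 4)
```

Because the business logic only ever sees one key's state at a time, changing `num_workers` (or, in a real system, adding and removing workers at runtime) requires no application changes.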

Check out a demo of our scale-independent API in action.

Flexible Options

Wallaroo is available via a managed service or a fully supported license. A version of Wallaroo’s core compute engine for Python is available as an open source project under the Apache 2.0 license: great for data science algorithms, though without the raw throughput of the other language bindings.

You decide which works best for you depending on your use case, support, and performance needs.