Developers | Wallaroo

© 2019 by Wallaroo Labs, Inc.
Made in NYC.

Ready for some Python 3 with your Wallaroo?

Wallaroo 0.6.0 was released with enhanced Python 3 support. 

Check out the release notes.

Overview

Wallaroo makes it easy to get stateful analytics, stream processing, and event-driven AI applications to production fast, regardless of scale. It provides APIs in several languages so developers can implement their custom business logic.

We've made Wallaroo with Python APIs available as open source under Apache 2.0. It's the only Python framework purpose-built for applications that require state management, complex workflows, bursty capacity, or the ability to run anywhere: on-prem, at the edge, or on any cloud.

Wallaroo takes care of the infrastructure required to run distributed, Python-based applications.

It handles scaling, resilience, state management and message delivery, making it easy to scale applications with minimal code changes. This allows you to focus on data science and business logic, not plumbing and data engineering.

Our lightweight API currently supports Python. It's easy to define the sources, sinks, and computations necessary for your event-driven application.
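To make the three building blocks concrete, here is a minimal sketch in plain Python. It does not use the Wallaroo API; the function names (source, to_celsius, sink, run_pipeline) and the temperature data are assumptions chosen purely for illustration.

```python
# Conceptual sketch only: plain Python, not the Wallaroo API.
# All names and data below are illustrative assumptions.
from typing import Callable, Iterable, Iterator, List


def source(lines: Iterable[str]) -> Iterator[float]:
    """Source: decode raw input records into typed events."""
    for line in lines:
        yield float(line)


def to_celsius(fahrenheit: float) -> float:
    """Stateless computation: turn one event into another."""
    return (fahrenheit - 32.0) * 5.0 / 9.0


def sink(events: Iterable[float], out: List[str]) -> None:
    """Sink: encode results and hand them to an external system."""
    for event in events:
        out.append(f"{event:.1f}")


def run_pipeline(
    lines: Iterable[str],
    computation: Callable[[float], float],
    out: List[str],
) -> None:
    """Wire source -> computation -> sink into a single pipeline."""
    sink(map(computation, source(lines)), out)


output: List[str] = []
run_pipeline(["32", "212", "98.6"], to_celsius, output)
print(output)
```

In a real deployment the framework, not a local function call, would move events between these stages and across workers; the sketch only shows how the responsibilities are divided.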

 
 

Why We Built Wallaroo

We started out supporting organizations that want to horizontally scale business-critical data applications, and we continue to do so today.

We designed and ran high-volume, low-latency, fault-tolerant systems and quickly realized that the technologies available to us were insufficient. We had the right ideas, but it took too long to get them into production, and it was too hard and costly to scale and operate them.

When we set out to build Wallaroo, we had a few goals in mind. We wanted to improve the state of the art for distributed data processing by providing:

  • better per-worker throughput

  • dramatically lower latencies

  • easier state management, and

  • an easier operational experience. 

It became clear that the best way to advance the industry was to make our code available to the open source community under the Apache License, Version 2.0. Admittedly, we didn't get this right at first (we had some loose restrictions around a few aspects of our software), but we sorted it out.

What Wallaroo Does

Wallaroo is an ultrafast and scalable data processing engine that rapidly takes you from prototype to production by eliminating infrastructure complexity.

 

It's designed for writing distributed data processing applications, especially demanding high-throughput, low-latency tasks where the accuracy of results is important. 

  • Straightforward API - an approachable API for building stateful and stateless data pipelines.

  • Elastic, ad-hoc clustering - join processes together as needed to handle your demand. No cluster manager needed.

  • Fault-tolerance - Wallaroo manages your state and allows you to get accurate results in the face of failure.

  • High-performance runtime - We've designed Wallaroo to add as little overhead as possible, giving you more room to build your application.

  • Built-in metrics - Wallaroo collects metrics on all your pipelines and applications. See your throughputs and latencies for easy troubleshooting and debugging.

 
 

Quick Start

Get started and jump right in!


We recommend you use our Docker image, which allows you to get Wallaroo up and running in only a few minutes:

 

  1. Install Docker

  2. Run docker pull wallaroo-labs-docker-wallaroolabs.bintray.io/release/wallaroo:latest

  3. Refer to the instructions for running example applications in Docker

How to install Wallaroo on other platforms.

Examples

We frequently walk through specific use cases, with code examples, on our engineering blog. Below are some of our recent favorites. Here is a complete list of examples from our blog.

In this example, we build an adtech use case in which Wallaroo joins several event streams to determine whether the user taking an action is a "loyal customer" who should receive a promotional discount. See the walkthrough and working example code.

Event-Triggered Customer Segmentation
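As a rough illustration of the customer segmentation idea, the sketch below joins two event streams on a user id using per-user state. It is plain Python rather than the Wallaroo API, and the stream names ("purchase", "click"), the event shape, and the loyalty threshold are all assumptions made for this example, not details from the blog post.

```python
# Conceptual sketch only: plain Python, not the Wallaroo API.
# Stream names, event shape, and threshold are illustrative assumptions.
from typing import Dict, Iterable, List, Tuple

LOYALTY_THRESHOLD = 3  # hypothetical: purchases needed to count as loyal

Event = Tuple[str, str]  # (stream name, user id)


def segment(events: Iterable[Event]) -> List[str]:
    """Return the users who earn a discount, in the order they qualify."""
    purchases: Dict[str, int] = {}  # per-user state shared by both streams
    discounts: List[str] = []
    for stream, user in events:
        if stream == "purchase":
            # The purchase stream updates the user's state.
            purchases[user] = purchases.get(user, 0) + 1
        elif stream == "click":
            # The click stream reads that state to decide on a discount.
            if purchases.get(user, 0) >= LOYALTY_THRESHOLD:
                discounts.append(user)
    return discounts


events = [
    ("purchase", "ann"),
    ("purchase", "ann"),
    ("click", "ann"),   # only 2 purchases so far: no discount
    ("purchase", "ann"),
    ("click", "bob"),   # no purchases: no discount
    ("click", "ann"),   # 3 purchases: discount
]
discounted = segment(events)
print(discounted)
```

The key point is that the decision is triggered by an incoming event and depends on state accumulated from a different stream.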

In this example, we create a simple spam detection application that monitors the number of distinct chat messages each user generates. The write-up includes a walkthrough, working example code, and the next steps needed to move this solution into production.

Detecting Spam as it Happens
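The core bookkeeping behind the spam detection example, tracking distinct messages per user, can be sketched in a few lines of plain Python. This is not the Wallaroo API or the blog's code; the function name and the sample chat data are assumptions for illustration, and the rule for deciding what counts as spam is deliberately left out.

```python
# Conceptual sketch only: plain Python, not the Wallaroo API.
# Function name and sample data are illustrative assumptions.
from typing import Dict, Iterable, Iterator, Set, Tuple


def distinct_message_counts(
    messages: Iterable[Tuple[str, str]],
) -> Iterator[Tuple[str, int]]:
    """For each (user, text) event, emit (user, distinct messages so far)."""
    seen: Dict[str, Set[str]] = {}  # per-user state, keyed by user id
    for user, text in messages:
        user_messages = seen.setdefault(user, set())
        user_messages.add(text)
        yield user, len(user_messages)


chat = [
    ("ann", "hi"),
    ("bob", "buy now"),
    ("bob", "buy now"),
    ("ann", "how are you?"),
    ("bob", "buy now"),
]
counts = list(distinct_message_counts(chat))
print(counts)
```

A downstream step could compare each user's distinct count with their total message count to flag repetitive senders; that threshold is an application decision.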

In this example, we look at chaining state computations together so that the output of one can be fed into another to take further action, using trending Twitter hashtags as the use case. See the walkthrough and working example code.

Stream Processing, Trending Hashtags
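The chaining idea in the trending hashtags example can be sketched as two small stateful steps, where the output of the first is the input of the second. This is plain Python, not the Wallaroo API; the class names HashtagCounter and TopTags are hypothetical, and the sketch ignores time windows or decay, which a real trending calculation would likely need.

```python
# Conceptual sketch only: plain Python, not the Wallaroo API.
# Class names are hypothetical; no time windowing or decay is modeled.
from collections import Counter
from typing import Dict, List, Tuple


class HashtagCounter:
    """First state computation: running count per hashtag."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def update(self, tag: str) -> Tuple[str, int]:
        self.counts[tag] += 1
        return tag, self.counts[tag]


class TopTags:
    """Second state computation: keeps the top `size` tags seen so far."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.latest: Dict[str, int] = {}

    def update(self, tag_count: Tuple[str, int]) -> List[str]:
        tag, count = tag_count
        self.latest[tag] = count
        # Highest count first; ties broken alphabetically for stable output.
        ranked = sorted(self.latest, key=lambda t: (-self.latest[t], t))
        return ranked[: self.size]


counter = HashtagCounter()
top = TopTags(size=2)
trending: List[str] = []
for tag in ["#python", "#pony", "#python", "#streams", "#pony", "#python"]:
    # The output of the first computation feeds the second.
    trending = top.update(counter.update(tag))
print(trending)
```

Each step owns its own state, so the two can be reasoned about, tested, and scaled separately.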

Questions? Want a sounding board for an idea or approach?

We love to help.