Dependencies

To build Arroyo, you will need to install some dependencies.

Follow these command to get the necessary dependencies to build Arroyo on Ubuntu:

$ sudo apt-get install pkg-config build-essential \
  libssl-dev openssl cmake curl postgresql postgresql-client
$ sudo systemctl start postgresql
$ wget https://github.com/protocolbuffers/protobuf/releases/download/v21.8/protoc-21.8-linux-x86_64.zip
$ unzip protoc-21.8-linux-x86_64.zip
$ sudo mv bin/protoc /usr/local/bin
$ curl https://sh.rustup.rs -sSf | sh -s -- -y
$ cargo install refinery_cli
$ curl -fsSL https://get.pnpm.io/install.sh | sh -

Get the source

The Arroyo source can be checked out from GitHub with:

$ git clone https://github.com/ArroyoSystems/arroyo.git

Postgres

Developing Arroyo requires running a properly configured postgres instance. By default, it expects a database called arroyo, a user arroyo with password arroyo, although that can be changed by setting the following environment variables:

  • DATABASE_NAME
  • DATABASE_HOST
  • DATABASE_PORT
  • DATABASE_USER
  • DATABASE_PASSWORD

On Ubuntu, you can set up a compatible database like this:

$ sudo -u postgres psql -c "CREATE USER arroyo WITH PASSWORD 'arroyo' SUPERUSER;"
$ sudo -u postgres createdb arroyo

On MacOS:

$ psql postgres -c "CREATE USER arroyo WITH PASSWORD 'arroyo' SUPERUSER;"
$ createdb arroyo

Migrations are managed using refinery. Once Postgres is set up, you can initialize it with

$ refinery setup # follow the prompts
$ mv refinery.toml ~/
$ refinery migrate -c ~/refinery.toml -p crates/arroyo-api/migrations

We use cornucopia for typed-checked SQL queries. Our build is set up to automatically re-generate the rust code for those queries on build, so you will need a DB set up for compilation as well.

Building the frontend

We use pnpm and vite for frontend development. Once pnpm is installed (following the OS-specific) instructions above, you should be able to build the console like this:

$ cd webui
$ pnpm install
$ pnpm build

This must be done before running the services, as the API service will expect to find the compiled webui source so it can serve it.

Building the services

Arroyo is built via Cargo, the Rust build tool. All services are built into a single binary, arroyo, which can be built with

$ cargo build --package arroyo

By default, Cargo builds in debug mode which gives you shorter build times in exchange for slower execution speeds. This is great for development, but you should make sure to always build in release mode (cargo build --release) before deploying or benchmarking.

This will build the binary in target/debug/arroyo. Running that you should see the available subcommands:

$ target/debug/arroyo
Usage: arroyo <COMMAND>

Commands:
  api         Starts an Arroyo API server
  controller  Starts an Arroyo Controller
  cluster     Starts a complete Arroyo cluster
  worker      Starts an Arroyo worker
  compiler    Starts an Arroyo compiler
  node        Starts an Arroyo node server
  migrate     Runs database migrations on the configure Postgres database
  help        Print this message or the help of the given subcommand(s)

Options:
  -h, --help     Print help
  -V, --version  Print version

Running a cluster

All of the services can be run together in a single cluster, or individually. For development, it’s most typical to run them together using the cluster subcommand:

$ target/debug/arroyo cluster

This will start a new Arroyo cluster with the default “process” scheduler, which runs each pipeline as a separate process on the same node.

You can then open http://localhost:5115 to use your new cluster.

Cargo also allows combining the build-run process into a single step, with the cargo run command, for example:

$ cargo run --bin arroyo -- cluster

On MacOS, you may see this error in the controller logs when trying to run a job:

Child ("/tmp/arroyo-process/m2lgldjw") exited with status Ok(ExitStatus(unix_wait_status(9)))

This means that MacOS gatekeeper is killing the pipeline binary. To allow the pipeline to run, open System Settings and navigate to Privacy & Security -> Developer Tools. In the list, add and enable your terminal (for example Terminal.app or iTerm).

Front-end development

When developing the frontend, you can take advantage of vite’s dev server to shorten the dev cycle:

$ cd webui
$ pnpm run dev

Then visit http://localhost:5173/. This requires the cluster is running on http://localhost:5115.

Updating frontend definitions

After altering protobuf definitions in arroyo-rpc you need to regenerate the Typescript code used by the frontend:

$ pnpm run generate

Similarly, after altering REST API types in arroyo-api you need to run:

$ pnpm run openapi

Testing

Arroyo includes a large suite of tests covering most parts of the system. We use the standard Rust test framework, which can be invoked via

$ cargo test

We also recommend cargo-nextest, which provides a better UX on top of the standard test runner.

$ cargo nextest run

See the nextest docs for all options.

Connector tests may require a running local instance of that system to pass; for example the full test suite currently requires kafka and an mqtt broker.

Building docker

The Arroyo repo includes a Dockerfile to package Arroyo for development and deployment via Kubernetes or Docker Compose.

For efficiency, we use targets to build images for different purposes. The available targets are

  • arroyo — includes only the Arroyo binary; most useful for deploying on Kubernetes or in Docker Compose
  • arroyo-full — also includes a Rust compile environment; this is needed for building UDFs within the Web UI; the compiler service will dynamically fetch the necessary dependencies, but you may choose to use arroyo-full if you’re running in an environment that does not allow external internet access

To build a particular target, run

$ docker build . -f docker/Dockerfile -t {target} --target {target}

for example,

$ docker build . -f docker/Dockerfile -t arroyo-full --target arroyo-full

Note that this requires a fairly recent version of Docker that supports buildx.