Dependencies

To build Arroyo, you will need to install some dependencies.

Ubuntu

$ sudo apt-get install pkg-config build-essential libssl-dev openssl cmake curl postgresql postgresql-client
$ sudo systemctl start postgresql
$ wget https://github.com/protocolbuffers/protobuf/releases/download/v21.8/protoc-21.8-linux-x86_64.zip
$ unzip protoc-21.8-linux-x86_64.zip
$ sudo mv bin/protoc /usr/local/bin
$ curl https://sh.rustup.rs -sSf | sh -s -- -y
$ . "$HOME/.cargo/env"
$ cargo install wasm-pack
$ cargo install refinery_cli
$ curl -fsSL https://get.pnpm.io/install.sh | sh -
$ pnpm install --global @openapitools/openapi-generator-cli

macOS

First, install Homebrew. Then, run the following commands:

$ brew install postgresql
$ brew install protobuf
$ brew install pnpm
$ brew install cmake

Then install Rust:

$ curl https://sh.rustup.rs -sSf | sh -s -- -y
$ . "$HOME/.cargo/env"

Finally, install the following packages:

$ cargo install wasm-pack
$ cargo install refinery_cli
$ cargo install cargo-nextest
$ pnpm install --global @openapitools/openapi-generator-cli
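Before moving on, it can help to confirm that the tools installed above are actually on your PATH. The following is a minimal sketch, not part of Arroyo: missing_tools is a hypothetical helper, and the binary names assume that refinery_cli installs a binary called refinery (as used in the migration commands in this guide) and that the PostgreSQL client provides psql.

```shell
# Print each named tool that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tools this guide relies on; anything printed here still needs installing.
missing_tools protoc cmake psql cargo wasm-pack refinery pnpm
```

If the last line prints nothing, every listed tool was found. A freshly installed tool may only show up after you open a new terminal.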

Postgres

Developing Arroyo requires a running, properly configured Postgres instance. By default, Arroyo expects a database called arroyo and a user arroyo with password arroyo. These defaults can be changed by setting the following environment variables:

  • DATABASE_NAME
  • DATABASE_HOST
  • DATABASE_PORT
  • DATABASE_USER
  • DATABASE_PASSWORD
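As a rough illustration of how these variables combine, the sketch below builds a connection URL from them, falling back to the defaults named above. arroyo_db_url is a hypothetical helper, and the localhost and 5432 fallbacks are assumptions (the usual Postgres defaults), not values confirmed by Arroyo.

```shell
# Print a Postgres connection URL from the DATABASE_* variables.
# Falls back to the documented name/user; host and port defaults are assumed.
arroyo_db_url() {
    name="${DATABASE_NAME:-arroyo}"
    host="${DATABASE_HOST:-localhost}"   # assumption: local instance
    port="${DATABASE_PORT:-5432}"        # assumption: standard Postgres port
    user="${DATABASE_USER:-arroyo}"
    echo "postgres://${user}@${host}:${port}/${name}"
}

arroyo_db_url
```

The printed URL can be passed to psql to check that the settings reach a live database; psql will prompt for the password, which is deliberately left out of the URL.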

On Ubuntu, you can set up a compatible database like this:

$ sudo -u postgres psql -c "CREATE USER arroyo WITH PASSWORD 'arroyo' SUPERUSER;"
$ sudo -u postgres createdb arroyo

On macOS:

$ psql postgres -c "CREATE USER arroyo WITH PASSWORD 'arroyo' SUPERUSER;"
$ createdb arroyo

Migrations are managed using refinery. Once Postgres is set up, you can initialize it with

$ refinery setup # follow the prompts
$ mv refinery.toml ~/
$ refinery migrate -c ~/refinery.toml -p arroyo-api/migrations

We use cornucopia for type-checked SQL queries. Our build automatically regenerates the Rust code for those queries, so you will need a database set up for compilation as well.

Building the services

Arroyo is built via Cargo, the Rust build tool. To build all of the services in release mode, you can run

$ cargo build --release

The control plane services (api, controller, and compiler) are not resource-intensive, and can be run without issue in debug mode for development:

$ cargo run --bin {package}

where {package} is one of arroyo-api, arroyo-controller, or arroyo-compiler-service.

Running Rust services locally

Arroyo comes with a default ProcessScheduler, which runs pipelines from the controller as local processes. The recommended development environment involves running three services: arroyo-api, arroyo-compiler-service, and arroyo-controller.

arroyo-api

Start this service with

$ cargo run --bin arroyo-api

The API serves the frontend on http://localhost:8000 and provides a gRPC server for interacting through the UI.

arroyo-compiler-service

Start this service with

$ OUTPUT_DIR=target cargo run --bin arroyo-compiler-service start

The compiler service compiles the custom crate for each pipeline. Because Rust can take a while to compile, it reuses the same directory across compilations. By default, pipelines are compiled with --release, but if you’re just doing feature development you can set DEBUG=true. This can cut compile times by roughly a factor of 5, but the resulting pipeline will be notably slower.

arroyo-controller

Start this service with

$ REMOTE_COMPILER_ENDPOINT=http://localhost:9000 cargo run --bin arroyo-controller

This will manage the execution of pipelines.
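If you would rather start all three services from one terminal, the commands above can be wrapped in a small script. This is only a sketch under a few assumptions: run_bg, start_dev_services, and the DRY_RUN switch are hypothetical names invented for this example, the script is run from the repository root, and it does not handle shutting the services down.

```shell
#!/bin/sh
# Sketch: launch the three development services from the repository root.
# DRY_RUN is a hypothetical switch for this example, not an Arroyo setting.

# Run a command in the background, or just print it when DRY_RUN=1.
run_bg() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@" &
    fi
}

start_dev_services() {
    run_bg cargo run --bin arroyo-api
    run_bg env OUTPUT_DIR=target cargo run --bin arroyo-compiler-service start
    run_bg env REMOTE_COMPILER_ENDPOINT=http://localhost:9000 cargo run --bin arroyo-controller
    # Block until the background services exit (skipped in dry-run mode).
    if [ "${DRY_RUN:-0}" != "1" ]; then
        wait
    fi
}
```

Calling DRY_RUN=1 start_dev_services prints the three commands without running them; calling start_dev_services launches them, with the logs of all three services interleaved in the same terminal. Separate terminals remain the simpler choice when you need to read one service's logs in isolation.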

On macOS, you may see this error in the controller logs when trying to run a job:

Child ("/tmp/arroyo-process/m2lgldjw") exited with status Ok(ExitStatus(unix_wait_status(9)))

This means that macOS Gatekeeper is killing the pipeline binary. To allow the pipeline to run, type this command in your terminal:

$ sudo spctl --master-disable

Note that this does disable a security feature, so you may want to revert it when you’re done:

$ sudo spctl --master-enable
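For reference, the unix_wait_status(9) in that log line is the raw Unix wait status of a process terminated by signal 9 (SIGKILL). The sketch below is purely illustrative and unrelated to Arroyo's code: it kills a throwaway child shell with SIGKILL and shows how the same event looks from a shell, where bash and dash report it as 128 plus the signal number.

```shell
# Simulate a child process that is killed with SIGKILL (signal 9).
status=0
sh -c 'kill -9 $$' || status=$?

# bash and dash report death-by-signal as 128 + the signal number.
echo "exit status: $status"
echo "signal number: $((status - 128))"
```

Seeing signal 9 for a pipeline binary that you did not kill yourself is the hint that something external, such as Gatekeeper, terminated it.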

Building the frontend

We use pnpm and vite for frontend development.

With pnpm installed (see Dependencies above), you should be able to run the console in dev mode like this:

$ cd arroyo-console
$ pnpm install
$ pnpm dev

This will launch a dev server at http://localhost:5173. Note that the API must also be running locally for the console to be functional.

To build the console, which is necessary before deploying for the hosted version to work, you can run

$ pnpm build

Running the services

Arroyo requires running at least two services: arroyo-api and arroyo-controller. See here for an overview of the services and what they do.

The services can be run with cargo:

$ cargo run --bin arroyo-controller
# in a separate terminal
$ cargo run --bin arroyo-api

To get faster compilation of pipelines, you can also run the compiler service:

$ cargo run --bin arroyo-compiler-service

then set the REMOTE_COMPILER_ENDPOINT environment variable when running the controller like this:

$ REMOTE_COMPILER_ENDPOINT=http://localhost:9000 cargo run --bin arroyo-controller

Compiling objects for the frontend

After altering protobuf definitions in arroyo-rpc, you need to regenerate the TypeScript code used by the frontend:

$ pnpm run generate

Similarly, after altering REST API types in arroyo-api, you need to run:

$ pnpm run openapi