Developer setup
Set up your environment to develop Arroyo
Dependencies
To build Arroyo, you will need to install some dependencies.
Run these commands to install the dependencies needed to build Arroyo on Ubuntu:
$ sudo apt-get install pkg-config build-essential \
libssl-dev openssl cmake curl postgresql postgresql-client
$ sudo systemctl start postgresql
$ wget https://github.com/protocolbuffers/protobuf/releases/download/v21.8/protoc-21.8-linux-x86_64.zip
$ unzip protoc-21.8-linux-x86_64.zip
$ sudo mv bin/protoc /usr/local/bin
$ curl https://sh.rustup.rs -sSf | sh -s -- -y
$ cargo install refinery_cli
$ curl -fsSL https://get.pnpm.io/install.sh | sh -
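Before moving on, it can help to confirm that the tools installed above are actually on your PATH. The following is a minimal sketch, not part of the Arroyo repo; the function name missing_tools is hypothetical, and it only checks that each command can be found, not which version is installed.

```shell
# Print each named command that cannot be found on PATH.
# Returns non-zero if at least one command is missing.
missing_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `missing_tools protoc cargo refinery pnpm psql cmake` prints nothing when everything is installed. If cargo is reported missing right after running the rustup installer, opening a new shell or running `. "$HOME/.cargo/env"` usually fixes it.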
Get the source
The Arroyo source can be checked out from GitHub with:
$ git clone https://github.com/ArroyoSystems/arroyo.git
Postgres
Developing Arroyo requires running a properly configured Postgres instance. By default, it expects a database called arroyo and a user arroyo with password arroyo, although that can be changed by setting the following environment variables:
DATABASE_NAME
DATABASE_HOST
DATABASE_PORT
DATABASE_USER
DATABASE_PASSWORD
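To check which database these settings point at, the sketch below assembles a Postgres connection URL from the variables above. The function name arroyo_db_url is hypothetical. The defaults for the database name, user, and password come from this guide, while localhost and port 5432 (the standard Postgres port) are assumptions about what Arroyo uses when the host and port variables are unset.

```shell
# Build a Postgres connection URL from the DATABASE_* variables.
# Name, user, and password default to "arroyo" as described above;
# the localhost and 5432 fallbacks are assumptions, not confirmed defaults.
arroyo_db_url() {
  printf 'postgresql://%s:%s@%s:%s/%s\n' \
    "${DATABASE_USER:-arroyo}" "${DATABASE_PASSWORD:-arroyo}" \
    "${DATABASE_HOST:-localhost}" "${DATABASE_PORT:-5432}" \
    "${DATABASE_NAME:-arroyo}"
}
```

Once the database exists, `psql "$(arroyo_db_url)" -c 'SELECT 1'` is a quick way to verify that the credentials work. Note that this sketch does not URL-encode the password, so it only suits passwords without special characters.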
On Ubuntu, you can set up a compatible database like this:
$ sudo -u postgres psql -c "CREATE USER arroyo WITH PASSWORD 'arroyo' SUPERUSER;"
$ sudo -u postgres createdb arroyo
On macOS:
$ psql postgres -c "CREATE USER arroyo WITH PASSWORD 'arroyo' SUPERUSER;"
$ createdb arroyo
Migrations are managed using refinery. Once Postgres is set up, you can initialize the database with:
$ refinery setup # follow the prompts
$ mv refinery.toml ~/
$ refinery migrate -c ~/refinery.toml -p crates/arroyo-api/migrations
We use cornucopia for type-checked SQL queries. Our build automatically regenerates the Rust code for those queries on every build, so you will need a database set up for compilation as well.
Building the frontend
We use pnpm and vite for frontend development. Once pnpm is installed (following the OS-specific instructions above), you should be able to build the console like this:
$ cd webui
$ pnpm install
$ pnpm build
This must be done before running the services, as the API service will expect to find the compiled webui source so it can serve it.
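Since the API service depends on the compiled web UI, a quick check before starting the services can save a confusing startup failure. The sketch below is an illustration only: the function name webui_is_built is hypothetical, and the webui/dist path is an assumption based on Vite's default output directory rather than something confirmed for this project.

```shell
# Succeed if the given directory looks like a finished web UI build.
# "webui/dist" is an assumed output path (Vite's default outDir is "dist").
webui_is_built() {
  dist="${1:-webui/dist}"
  [ -f "$dist/index.html" ]
}
```

From the repository root, this could be used as `webui_is_built || (cd webui && pnpm install && pnpm build)`.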
Building the services
Arroyo is built via Cargo, the Rust build tool. All services are built into a single binary, arroyo, which can be built with
$ cargo build --package arroyo
By default, Cargo builds in debug mode, which gives you shorter build times in exchange for slower execution speeds. This is great for development, but you should make sure to always build in release mode (cargo build --release) before deploying or benchmarking.
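The two build modes put the binary in different places. As a small sketch, the hypothetical helper below (the name arroyo_bin is made up for illustration) prints the path for a given profile. The target/debug and target/release layout and the CARGO_TARGET_DIR override are standard Cargo behavior.

```shell
# Print where Cargo places the arroyo binary for a build profile.
# Accepts "debug" (the default) or "release"; honors CARGO_TARGET_DIR if set.
arroyo_bin() {
  profile="${1:-debug}"
  case "$profile" in
    debug|release) ;;
    *) echo "unknown profile: $profile" >&2; return 1 ;;
  esac
  echo "${CARGO_TARGET_DIR:-target}/$profile/arroyo"
}
```

For a release build, this might be used as `cargo build --release --package arroyo && "$(arroyo_bin release)" --help`.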
A debug build places the binary at target/debug/arroyo. Running it without arguments lists the available subcommands:
$ target/debug/arroyo
Usage: arroyo <COMMAND>
Commands:
api Starts an Arroyo API server
controller Starts an Arroyo Controller
cluster Starts a complete Arroyo cluster
worker Starts an Arroyo worker
compiler Starts an Arroyo compiler
node Starts an Arroyo node server
migrate Runs database migrations on the configure Postgres database
help Print this message or the help of the given subcommand(s)
Options:
-h, --help Print help
-V, --version Print version
Running a cluster
All of the services can be run together in a single cluster, or individually. For development, it’s most typical to run them together using the cluster subcommand:
$ target/debug/arroyo cluster
This will start a new Arroyo cluster with the default “process” scheduler, which runs each pipeline as a separate process on the same node.
You can then open http://localhost:5115 to use your new cluster.
Cargo also allows combining the build and run steps into one with the cargo run command, for example:
$ cargo run --bin arroyo -- cluster
On macOS, you may see this error in the controller logs when trying to run a job:
Child ("/tmp/arroyo-process/m2lgldjw") exited with status Ok(ExitStatus(unix_wait_status(9)))
This means that macOS Gatekeeper is killing the pipeline binary. To allow the pipeline to run, open System Settings and navigate to Privacy & Security -> Developer Tools. In the list, add and enable your terminal (for example Terminal.app or iTerm).
Front-end development
When developing the frontend, you can take advantage of vite’s dev server to shorten the dev cycle:
$ cd webui
$ pnpm run dev
Then visit http://localhost:5173/. This requires that the cluster be running at http://localhost:5115.
Updating frontend definitions
After altering protobuf definitions in arroyo-rpc, you need to regenerate the TypeScript code used by the frontend:
$ pnpm run generate
Similarly, after altering REST API types in arroyo-api, you need to run:
$ pnpm run openapi
Testing
Arroyo includes a large suite of tests covering most parts of the system. We use the standard Rust test framework, which can be invoked via
$ cargo test
We also recommend cargo-nextest, which provides a better UX on top of the standard test runner.
$ cargo nextest run
See the nextest docs for all options.
Connector tests may require a running local instance of the system they connect to; for example, the full test suite currently requires Kafka and an MQTT broker.
Building Docker images
The Arroyo repo includes a Dockerfile to package Arroyo for development and deployment via Kubernetes or Docker Compose.
For efficiency, we use targets to build images for different purposes. The available targets are
arroyo: includes only the Arroyo binary; most useful for deploying on Kubernetes or in Docker Compose
arroyo-full: also includes a Rust compile environment, which is needed for building UDFs within the Web UI. The compiler service will dynamically fetch the necessary dependencies, but you may choose to use arroyo-full if you’re running in an environment that does not allow external internet access
To build a particular target, run
$ docker build . -f docker/Dockerfile -t {target} --target {target}
for example,
$ docker build . -f docker/Dockerfile -t arroyo-full --target arroyo-full
Note that this requires a fairly recent version of Docker that supports buildx.
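Because the image tag and the --target value are the same string in the commands above, a small wrapper can prevent typos in the target name. This is a sketch only; the function name arroyo_docker_build_cmd is hypothetical. It prints the command rather than running it, so you can inspect it first.

```shell
# Print the docker build command for one of the two documented targets.
# Rejects anything other than "arroyo" or "arroyo-full".
arroyo_docker_build_cmd() {
  case "$1" in
    arroyo|arroyo-full)
      echo "docker build . -f docker/Dockerfile -t $1 --target $1" ;;
    *)
      echo "unknown target: ${1:-<none>} (expected arroyo or arroyo-full)" >&2
      return 1 ;;
  esac
}
```

To run the printed command from the repository root, pass it to the shell, for example `arroyo_docker_build_cmd arroyo | sh`.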