Developer setup
Dependencies
To build Arroyo, you will need to install some dependencies.
Ubuntu
$ sudo apt-get install pkg-config build-essential libssl-dev openssl cmake curl postgresql postgresql-client
$ sudo systemctl start postgresql
$ wget https://github.com/protocolbuffers/protobuf/releases/download/v21.8/protoc-21.8-linux-x86_64.zip
$ unzip protoc-21.8-linux-x86_64.zip
$ sudo mv bin/protoc /usr/local/bin
$ curl https://sh.rustup.rs -sSf | sh -s -- -y
$ cargo install wasm-pack
$ cargo install refinery_cli
$ curl -fsSL https://get.pnpm.io/install.sh | sh -
$ pnpm install --global @openapitools/openapi-generator-cli
MacOS
First, install Homebrew. Then, run the following commands:
$ brew install postgresql
$ brew install protobuf
$ brew install pnpm
$ brew install cmake
Then install Rust
$ curl https://sh.rustup.rs -sSf | sh -s -- -y
Finally, install the following packages:
$ cargo install wasm-pack
$ cargo install refinery_cli
$ cargo install cargo-nextest
$ pnpm install --global @openapitools/openapi-generator-cli
Postgres
Developing Arroyo requires running a properly configured postgres instance. By default,
it expects a database called arroyo
, a user arroyo
with password arroyo
, although that
can be changed by setting the following environment variables:
DATABASE_NAME
DATABASE_HOST
DATABSE_PORT
DATABASE_USER
DATABASE_PASSWORD
On Ubuntu, you can setup a compatible database like this:
$ sudo -u postgres psql -c "CREATE USER arroyo WITH PASSWORD 'arroyo' SUPERUSER;"
$ sudo -u postgres createdb arroyo
On MacOS:
$ psql postgres -c "CREATE USER arroyo WITH PASSWORD 'arroyo' SUPERUSER;"
$ createdb arroyo
Migrations are managed using refinery. Once Postgres is set up, you can initalize it with
$ refinery setup # follow the prompts
$ mv refinery.toml ~/
$ refinery migrate -c ~/refinery.toml -p arroyo-api/migrations
We use cornucopia for typed-checked SQL queries. Our build is set up to automatically re-generate the rust code for those queries on build, so you will need a DB set up for compilation as well.
Building the services
Arroyo is built via Cargo, the Rust build tool. To build all of the services in release mode, you can run
$ cargo build --release
The control plane services (api, controller, and compiler) are not resource-intensive, and can be run without issue in debug mode for development:
$ cargo run --bin {package}
where {package}
is one of arroyo-api
, arroyo-controller
, or arroyo-compiler-service
.
Running rust services locally
Arroyo comes with a default ProcessScheduler which runs pipelines from the controller as a local process. The recommended development environment involves running three services: arroyo-api, arroyo-compiler-service, arroyo-controller.
arroyo-api
Start this service with
cargo run --bin arroyo-api
The api serves the frontend on http://localhost:8000 and provides a GRPC server for interacting through the UI.
arroyo-compiler-service
Start this service with
OUTPUT_DIR=target cargo run --bin arroyo-compiler-service start
The compiler service compiles the custom crate for each pipeline.
Because Rust can take a while to compile, it reuses the same directory across compilations.
By default pipelines are compiled with --release
,
but if you’re just doing feature development you can set DEBUG=true
.
This can reduce compile times by a factor of 5x,
but the resulting pipeline will be notably slower.
arroyo-controller
Start this service with
REMOTE_COMPILER_ENDPOINT=http://localhost:9000 cargo run --bin arroyo-controller
This will manage the execution of pipelines.
On MacOS, you may see this error in the controller logs when trying to run a job:
Child ("/tmp/arroyo-process/m2lgldjw") exited with status Ok(ExitStatus(unix_wait_status(9)))
This means that MacOS gatekeeper is killing the pipeline binary. To allow the pipeline to run, type this command in your terminal:
$ sudo spctl --master-disable
Note that this does disable a security feature, so you may want to revert it when you’re done:
$ sudo spctl --master-enable
Building the frontend
We use pnpm and vite for frontend development.
Then you should be able to run the console in dev mode like this:
$ cd arroyo-console
$ pnpm install
$ pnpm dev
This will launch a dev server at http://localhost:5173 (note the api must also be running locally for this to be functional).
To build the console (necessary before deploying for the hosted version to work) you can run
$ pnpm build
Running the services
Arroyo requires running at least two services: arroyo-api and arroyo-controller. See here for an overview of the services and what they do.
The services can be run with cargo:
$ cargo run --bin arroyo-controller
# in a separate terminal
$ cargo run --bin arroyo-api
To get faster compilation of pipelines, you can also run the compile service:
$ cargo run --bin arroyo-compiler-service
then set the REMOTE_COMPILER_ENDPOINT
environment variable when running the controller like this:
$ REMOTE_COMPILER_ENDPOINT=http://localhost:9000 cargo run --bin arroyo-controller
Compiling objects for the frontend
After altering protobuf definitions in arroyo-rpc
you need to regenerate the Typescript code used by the frontend:
$ pnpm run generate
Similarly, after altering REST API types in arroyo-api
you need to run:
$ pnpm run openapi