Setting up an Arroyo cluster on VMs or bare-metal servers
If you are building Arroyo from source, make sure to pass the `--release` flag when calling `cargo build`. You may also want to build on a machine with the same CPU as you plan to deploy on, and set the env var `RUSTFLAGS='-C target-cpu=native'` to get the best performance at the cost of portability to CPUs with different micro-architectures.

Alternatively, you can use the published Docker image `ghcr.io/arroyosystems/arroyo:latest`.
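For example, an optimized build from a checkout of the Arroyo repository might look like:

```bash
# Build an optimized release binary tuned to this machine's CPU;
# drop RUSTFLAGS for a binary that is portable across micro-architectures
RUSTFLAGS='-C target-cpu=native' cargo build --release
```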
Arroyo supports Postgres and SQLite for its configuration database; the backend is selected via the `database.type` config option or via the `ARROYO__DATABASE__TYPE` env var.
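For example, selecting Postgres through the environment (the same value can be set as `database.type` in the config file):

```bash
# Use Postgres for the configuration database on every Arroyo service
export ARROYO__DATABASE__TYPE=postgres
```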
If you are using Postgres, you will need to run the database migrations on your database before starting the cluster. By default, Arroyo expects a database called `arroyo` and a user account `arroyo` with password `arroyo` at `localhost:5432`. These can be configured via the database config options.
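With those defaults, creating the role and database on the Postgres host might look like this (run as a Postgres superuser; adjust the names if you override the database config options):

```bash
# Create the user and database Arroyo expects by default
psql -c "CREATE USER arroyo WITH PASSWORD 'arroyo';"
psql -c "CREATE DATABASE arroyo OWNER arroyo;"
```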
Arroyo supports two schedulers for this kind of deployment: `process` and `node`. The scheduler is selected via the `controller.scheduler` config option or the `ARROYO__CONTROLLER__SCHEDULER` environment variable.
The process scheduler is the default, used when running with `controller.scheduler=process` or with no scheduler configuration; it runs pipeline workers as processes on the same machine as the controller.
The node scheduler consists of a set of node processes, which are able to schedule work, and a control plane running with `controller.scheduler=node`.
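For a distributed deployment, the control plane is therefore started with the node scheduler selected, for example via the environment:

```bash
# Tell the controller to schedule pipeline workers onto registered nodes
export ARROYO__CONTROLLER__SCHEDULER=node
```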
A node can be run via the arroyo binary or Docker image:
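```bash
# Run a node directly from the arroyo binary
arroyo node

# Or run it from the Docker image (this assumes the image's entrypoint is the
# arroyo binary, so subcommands can be passed directly)
docker run --rm ghcr.io/arroyosystems/arroyo:latest node
```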
Each node must be configured with the `ARROYO__CONTROLLER_ENDPOINT` configuration, set to the host and port that the controller is running on.
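For example (the address below is a placeholder; use the scheme, host, and port your controller actually listens on):

```bash
# Tell this node where to find the controller
export ARROYO__CONTROLLER_ENDPOINT=http://arroyo-controller.internal:9190
```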
Note that the node should always be run within a process manager that restarts it, like systemd; nodes are designed to exit when they lose their connection to the controller and rely on the process manager to restart them.
Nodes can be configured with a given number of slots via the `node.task-slots` config option. This controls how many parallel subtasks can run on that node; typically you would want to set this to the number of CPUs, but the best value can be somewhat hardware- and workload-dependent.
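For example, to give each node one slot per CPU core (the env var name here is an assumption, derived from the config-to-environment pattern used elsewhere in this guide):

```bash
# One task slot per CPU on this machine
export ARROYO__NODE__TASK_SLOTS=$(nproc)
```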