Build your first Arroyo pipeline
nexmark, and click “Test Connection,” then “Create” to finish creating the source.
hop in SQL) to perform a time-oriented aggregation:
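To make the idea of a hopping (sliding) window concrete, the following is a minimal Python sketch of the semantics only. It is not Arroyo's implementation or its SQL syntax; the function names `hop_windows` and `count_per_window`, the integer-second timestamps, and the choice of windows aligned to time zero are all assumptions made for illustration.

```python
from collections import Counter


def hop_windows(ts, slide, width):
    """Return every [start, end) window of length `width`, starting every
    `slide` seconds, that contains the timestamp `ts`.

    Assumes integer timestamps and window starts aligned to multiples of `slide`.
    """
    # Latest window start at or before ts.
    start = ts - (ts % slide)
    windows = []
    # Walk backwards while the window [start, start + width) still covers ts.
    while start > ts - width:
        windows.append((start, start + width))
        start -= slide
    return sorted(windows)


def count_per_window(events, slide, width):
    """Count events per (key, window). `events` is an iterable of (ts, key)."""
    counts = Counter()
    for ts, key in events:
        # Each event belongs to width / slide overlapping windows.
        for window in hop_windows(ts, slide, width):
            counts[(key, window)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical events: (timestamp in seconds, key).
    events = [(1, "a"), (3, "a"), (11, "a")]
    for (key, window), n in sorted(count_per_window(events, 2, 10).items()):
        print(key, window, n)
```

With a slide of 2 seconds and a width of 10 seconds, every event lands in five overlapping windows, which is why a hopping-window query emits an updated result every slide interval rather than once per width.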
INSERT INTO statement), Arroyo will automatically add a Web sink so that we can view the results in the Web UI.
Click “Start” to create the pipeline.
This will start the pipeline. Once it’s running, we can click a node in the pipeline dataflow graph to
see metrics for that operator. In the Outputs tab, we can tail the pipeline’s results. The Checkpoints
tab shows metadata for the pipeline’s checkpoints. Arroyo regularly takes consistent checkpoints of the
pipeline’s state (using a variation of the asynchronous barrier snapshotting algorithm described in
this paper) so that it can recover from failures.
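As a rough illustration of barrier-based checkpointing, here is a small Python sketch of a single operator with multiple inputs that aligns checkpoint barriers before snapshotting its state. It shows the general aligned-barrier idea only; it is not Arroyo's implementation, and the class `CountingOperator` and its methods are hypothetical names introduced for this example.

```python
class CountingOperator:
    """Toy operator that counts records per key across several input channels."""

    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        self.counts = {}       # operator state: key -> count
        self.blocked = set()   # inputs whose barrier has already arrived
        self.buffered = []     # records from blocked inputs, held until alignment
        self.snapshots = {}    # checkpoint id -> copy of the state

    def _apply(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1

    def on_record(self, input_id, key):
        if input_id in self.blocked:
            # This record comes after the barrier on its input, so it must
            # not be included in the in-progress checkpoint.
            self.buffered.append(key)
        else:
            self._apply(key)

    def on_barrier(self, input_id, checkpoint_id):
        self.blocked.add(input_id)
        if len(self.blocked) == self.num_inputs:
            # Barriers from all inputs have arrived: the state now reflects
            # exactly the records that preceded the barrier on every input.
            self.snapshots[checkpoint_id] = dict(self.counts)
            # A real system would also forward the barrier downstream here.
            self.blocked.clear()
            pending, self.buffered = self.buffered, []
            for key in pending:
                self._apply(key)

    def restore(self, checkpoint_id):
        """Reset the state to a previous snapshot, as after a failure."""
        self.counts = dict(self.snapshots[checkpoint_id])
        self.blocked.clear()
        self.buffered = []


if __name__ == "__main__":
    op = CountingOperator(num_inputs=2)
    op.on_record(0, "a")
    op.on_barrier(0, checkpoint_id=1)   # input 0 is now blocked
    op.on_record(0, "b")                # buffered: arrived after the barrier
    op.on_record(1, "a")                # counted: input 1 has no barrier yet
    op.on_barrier(1, checkpoint_id=1)   # aligned: snapshot, then drain buffer
    print(op.snapshots[1], op.counts)
```

Because the snapshot is taken only once every input has delivered its barrier, the saved state corresponds to a single consistent cut of the stream, and restoring it followed by replaying input from the same point reproduces the state without double counting.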
We can also control execution of the pipeline by stopping and starting it. Before stopping the pipeline, Arroyo takes
a final checkpoint so that we can restart it without any data loss.