Arroyo provides sources and sinks for Fluvio for consistently reading and writing data from Fluvio topics. Fluvio is a distributed streaming platform built on top of Rust and Kubernetes. Fluvio has support for simple, stateless processing, but with Arroyo it can be extended to perform complex, stateful processing and analytics.

Configuring the Connection

Fluvio connections can be created via the Web UI or in SQL.

Fluvio creation flow

A Fluvio connection has several required and optional fields:

FieldDescriptionRequiredExample
endpointThe endpoint of your Fluvio cluster; if not specified the default profile is usedNomy-cluster:9003
topicThe name of the Fluvio topic to read from or write toYes
typeThe type of table (either ‘source’ or ‘sink’)Yessource
source.offsetThe offset to start reading from (either ‘earliest’ or ‘latest’)If sourceearliest

Fluvio Sources

Fluvio sources can be created via the Web UI, the API, or directly in SQL. A FLuvio source is defined by a topic name and a schema.

Schemas can be defined via json-schema or via SQL DDL statements.

Fluvio sources implement exactly-once semantics by storing the last-read offset in Arroyo’s state.

Fluvio Sinks

Fluvio sinks can be created via the Web UI, the API, or directly in SQL. A Fluvio sink is defined by a topic name and schema. Currently, Fluvio sinks only support writing JSON data, with the structure determined by the schema of the data being written.

Fluvio DDL

Fluvio sources and sinks can be created directly in SQL. For example:

CREATE TABLE source (
  value TEXT
) WITH (
  connector = 'fluvio',
  format = 'raw_string',
  topic = 'greetings',
  type = 'source'
);


CREATE TABLE sink (
  msg TEXT
) WITH (
  connector = 'fluvio',
  format = 'json',
  topic = 'fluvio-sink',
  type = 'sink'
);