Delta Lake Sink
Write Parquet files to a Delta Lake Table
Data from Arroyo can be written to a Delta Lake table using the delta sink.
This sink shares most of its code with the File System Sink,
and as such, it supports all of the same configuration options.
However, it has the following constraints:
- The format option must be set to parquet, as Delta Lake only supports Parquet files.
- Delta can only be used as a sink, not a source.
- Partitioning is not integrated with the delta log.
Example query
Commit Behavior
Arroyo commits to Delta Lake tables using the same two-phase commit protocol as the File System Sink. Files are first staged, then committed after a reliable checkpoint has been taken. Idempotence is ensured by validating the table version before appending; if the version differs from the expected one, Arroyo checks all intervening versions to ensure that the data was not already added.
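
The version check described above can be illustrated with a small in-memory simulation. This is a minimal sketch in Python, not Arroyo's implementation: the names DeltaLogSim and commit_files are hypothetical, the "log" is just a list of file sets indexed by table version, and "already added" is assumed to mean that a later version contains all of the files being committed.

```python
from dataclasses import dataclass, field


@dataclass
class DeltaLogSim:
    """Toy in-memory stand-in for a Delta transaction log (hypothetical)."""

    # commits[i] holds the set of file paths added by table version i.
    commits: list = field(default_factory=list)

    @property
    def version(self) -> int:
        # Version of the latest commit; -1 means nothing has been committed.
        return len(self.commits) - 1


def commit_files(log: DeltaLogSim, expected_version: int, files: set) -> int:
    """Append `files` unless an intervening version already added them.

    Returns the table version that contains the files.
    """
    if log.version != expected_version:
        # The table has moved on since the checkpoint: scan every version
        # written after the expected one, looking for our files.
        for v in range(expected_version + 1, log.version + 1):
            if files <= log.commits[v]:
                return v  # already committed, so do not append again
    log.commits.append(frozenset(files))
    return log.version


log = DeltaLogSim()
# First attempt: the table is empty, so the expected version is -1.
first = commit_files(log, expected_version=-1, files={"part-0.parquet"})
# Replay after a simulated failure, using the same expected version.
replay = commit_files(log, expected_version=-1, files={"part-0.parquet"})
print(first, replay, len(log.commits))  # 0 0 1
```

Replaying the commit returns the existing version instead of appending a duplicate, which is the property the protocol relies on when a commit is retried after recovery from a checkpoint.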