Write Parquet files to a Delta Lake Table
Data from Arroyo can be written to a Delta Lake table using the delta sink.
This sink shares most of its code with the File System Sink,
and as such, it supports all of the same configuration options.
However, it has the following constraint:
The format option must be set to parquet, as Delta Lake only supports Parquet files.

Arroyo commits to Delta Lake tables using the same two-phase commit protocol as the File System Sink. The files to be written are staged and then committed after taking a reliable checkpoint. Idempotence is ensured by validating the table version before appending and, if it differs from the expected version, checking all intervening versions to ensure that the data was not already added.
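
The version check described above can be illustrated with a small toy model. This is only a sketch of the idea, not Arroyo's implementation: ToyDeltaLog and commit_once are hypothetical names, the commit log is kept in memory, and "already added" is assumed to mean that some intervening commit contains every file in the batch.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDeltaLog:
    """Hypothetical in-memory stand-in for a table's commit log.

    Each entry is the set of file names added by one table version.
    """
    commits: list = field(default_factory=list)

    @property
    def version(self) -> int:
        # Version of the latest commit; -1 means no commits exist yet.
        return len(self.commits) - 1

    def append(self, files: set) -> int:
        # Record a new version containing `files` and return its number.
        self.commits.append(frozenset(files))
        return self.version


def commit_once(log: ToyDeltaLog, expected_version: int, files: set) -> int:
    """Append `files` unless a later version already contains them.

    `expected_version` is the table version the writer recorded at its
    last checkpoint (an assumption of this sketch).
    """
    if log.version != expected_version:
        # The table moved on: inspect only the intervening versions.
        for v in range(expected_version + 1, log.version + 1):
            if files <= log.commits[v]:
                # Data was already added (e.g. a retry after a crash).
                return v
    return log.append(files)


log = ToyDeltaLog()
first = commit_once(log, -1, {"part-0.parquet"})
# Simulated retry of the same commit after a failure: no new version is created.
retry = commit_once(log, -1, {"part-0.parquet"})
# Another writer advances the table, then a genuinely new batch is committed.
log.append({"other-writer.parquet"})
second = commit_once(log, first, {"part-1.parquet"})
```

In this model the retried commit returns the existing version instead of appending again, while a new batch is still appended even though another writer advanced the table in the meantime.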