Rust UDAFs
Arroyo SQL supports User-Defined Aggregate Functions, allowing you to extend the system’s capabilities with custom aggregation logic written in Rust. UDAFs are deployed and managed in the same way as scalar UDFs; see the UDF docs for details.
Currently UDAFs are only supported in Rust.
Defining a UDAF
Section titled “Defining a UDAF”UDAFs are defined as functions that take some number of Vec<T> as an
argument, where T is any valid SQL data type. Note
that all arguments must be Vec—you can’t mix-and-match vector and scalar
parameters.
Example
Section titled “Example”Here’s a simple UDAF for calculating the median of a set of integers:
use arroyo_sql_plugin::udf;
#[udf]fn my_median(mut args: Vec<i64>) -> Option<f64> { if args.is_empty() { return None; }
args.sort();
let mid = args.len() / 2; if args.len() % 2 == 0 { Some((args[mid] + args[mid - 1]) as f64 / 2.0) } else { Some(args[mid] as f64) }}Once defined, a UDAF can be used within a SQL query like the built-in aggregate functions:
SELECT my_median(bid.price), bid.auction, SESSION(interval '10 second') as sessionFROM nexmarkGROUP BY 2,3Nullability
Section titled “Nullability”If a UDAF argument type is an Option (e.g., Vec<Option<i64>>), the UDAF will
be invoked with all inputs, including NULLs. Otherwise, only non-null
arguments will be passed down to the UDAF. Unlike UDFs, UDAFs will always be
invoked, although possibly with empty input vectors.