Is your feature request related to a problem or challenge?
Implement Spark-compatible versions of collect_list and array_agg.
Note that in Spark, array_agg is an alias of collect_list, link here.
Also note that DataFusion already supports array_agg; however, its behaviour and syntax seem to differ from Spark's.
For example, DataFusion supports ORDER BY within array_agg, link, and can therefore provide deterministic ordering. Spark, on the other hand, doesn't support ORDER BY within array_agg and does not guarantee deterministic ordering. The Spark docs state this explicitly for both functions: "The function is non-deterministic because the order of collected results depends on the order of the rows which may be non-deterministic after a shuffle.", link.
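To make the ordering difference concrete, here is a minimal plain-Python sketch. It does not use Spark or DataFusion; the helper names `collect_list` and `array_agg_ordered` and the sample rows are hypothetical, chosen only to illustrate the two behaviours described above.

```python
from collections import defaultdict


def collect_list(rows):
    """Group values by key in arrival order (no ordering guarantee of its own)."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return dict(groups)


def array_agg_ordered(rows):
    """Group values by key, then sort each group, mimicking ORDER BY inside the aggregate."""
    return {key: sorted(values) for key, values in collect_list(rows).items()}


# The same rows in two different arrival orders, as might happen after a shuffle.
rows = [("a", 3), ("b", 1), ("a", 1), ("a", 2)]
rows_after_shuffle = [("a", 2), ("a", 3), ("b", 1), ("a", 1)]

unordered_1 = collect_list(rows)
unordered_2 = collect_list(rows_after_shuffle)
ordered_1 = array_agg_ordered(rows)
ordered_2 = array_agg_ordered(rows_after_shuffle)
```

In this sketch the unordered results differ between the two arrival orders, while the ordered results are identical, which is the gap a Spark-compatible implementation would need to reproduce rather than hide.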
Describe the solution you'd like
No response
Describe alternatives you've considered
No response
Additional context
No response