Labels
bug (Something isn't working)
Description
Describe the bug
concat_ws returns Utf8, regardless of the input types it is called with. So if it is called with LargeUtf8, we might overflow. In general, functions like these should operate on all three string representations unless there is a compelling reason not to.
simplify_concat_ws coerces literals to Utf8. Again, we should generally preserve the original string type.
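To make the two concerns concrete, here is a minimal sketch of a return-type rule that preserves the input string representation, plus the offset limit behind the overflow worry. It is an illustration only, not the actual DataFusion code: `StringType`, `concat_ws_return_type`, and `fits_in_utf8` are hypothetical names, and the precedence (Utf8View, then LargeUtf8, then Utf8) is an assumed policy.

```rust
/// Stand-in for the three Arrow string representations named in this issue.
/// Hypothetical enum for illustration; real code would use Arrow's `DataType`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StringType {
    Utf8,      // 32-bit (i32) offsets
    LargeUtf8, // 64-bit (i64) offsets
    Utf8View,  // view-based layout
}

/// Hypothetical return-type rule: derive the result type from the argument
/// types (separator included) instead of always returning Utf8.
/// Assumed precedence: Utf8View, then LargeUtf8, then Utf8.
pub fn concat_ws_return_type(arg_types: &[StringType]) -> StringType {
    if arg_types.contains(&StringType::Utf8View) {
        StringType::Utf8View
    } else if arg_types.contains(&StringType::LargeUtf8) {
        StringType::LargeUtf8
    } else {
        StringType::Utf8
    }
}

/// A Utf8 array stores its offsets as i32, so the total number of string
/// bytes in one array cannot exceed i32::MAX. LargeUtf8 inputs are allowed
/// to exceed this, which is why forcing the output to Utf8 can overflow.
pub fn fits_in_utf8(total_bytes: u64) -> bool {
    total_bytes <= i32::MAX as u64
}
```

Under this rule, a Utf8 separator combined with two Utf8View columns (as in the reproduction below) would yield Utf8View, so no CAST would be needed around the call.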
To Reproduce
Note the CAST to Utf8View that appears in the concat_ws plans below but not in the concat plans.
DataFusion CLI v52.1.0
> CREATE TABLE test_views AS
SELECT arrow_cast('hello', 'Utf8View') AS a, arrow_cast('world', 'Utf8View') AS b;
0 row(s) fetched.
Elapsed 0.027 seconds.
> EXPLAIN SELECT * FROM test_views WHERE concat(a, b) = a;
+---------------+-------------------------------+
| plan_type | plan |
+---------------+-------------------------------+
| physical_plan | ┌───────────────────────────┐ |
| | │ FilterExec │ |
| | │ -------------------- │ |
| | │ predicate: │ |
| | │ concat(a, b) = a │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ DataSourceExec │ |
| | │ -------------------- │ |
| | │ bytes: 272 │ |
| | │ format: memory │ |
| | │ rows: 1 │ |
| | └───────────────────────────┘ |
| | |
+---------------+-------------------------------+
1 row(s) fetched.
Elapsed 0.010 seconds.
> EXPLAIN SELECT * FROM test_views WHERE concat_ws(',', a, b) = a;
+---------------+-------------------------------+
| plan_type | plan |
+---------------+-------------------------------+
| physical_plan | ┌───────────────────────────┐ |
| | │ FilterExec │ |
| | │ -------------------- │ |
| | │ predicate: │ |
| | │ CAST(concat_ws(,, a, b) AS│ |
| | │ Utf8View) = a │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ DataSourceExec │ |
| | │ -------------------- │ |
| | │ bytes: 272 │ |
| | │ format: memory │ |
| | │ rows: 1 │ |
| | └───────────────────────────┘ |
| | |
+---------------+-------------------------------+
1 row(s) fetched.
Elapsed 0.007 seconds.
> explain SELECT concat(a, concat(a, b)) FROM test_views;
+---------------+-------------------------------+
| plan_type | plan |
+---------------+-------------------------------+
| physical_plan | ┌───────────────────────────┐ |
| | │ ProjectionExec │ |
| | │ -------------------- │ |
| | │ concat(test_views.a,concat│ |
| | │ (test_views.a,test_views │ |
| | │ .b)): │ |
| | │ concat(a, concat(a, b)) │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ DataSourceExec │ |
| | │ -------------------- │ |
| | │ bytes: 272 │ |
| | │ format: memory │ |
| | │ rows: 1 │ |
| | └───────────────────────────┘ |
| | |
+---------------+-------------------------------+
1 row(s) fetched.
Elapsed 0.005 seconds.
> explain SELECT concat(a, concat_ws(',', a, b)) FROM test_views;
+---------------+-------------------------------+
| plan_type | plan |
+---------------+-------------------------------+
| physical_plan | ┌───────────────────────────┐ |
| | │ ProjectionExec │ |
| | │ -------------------- │ |
| | │ concat(test_views.a │ |
| | │ ,concat_ws(Utf8(", │ |
| | │ "),test_views.a │ |
| | │ ,test_views.b)): │ |
| | │ concat(a, CAST(concat_ws(,│ |
| | │ , a, b) AS Utf8View)) │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ DataSourceExec │ |
| | │ -------------------- │ |
| | │ bytes: 272 │ |
| | │ format: memory │ |
| | │ rows: 1 │ |
| | └───────────────────────────┘ |
| | |
+---------------+-------------------------------+
1 row(s) fetched.
Elapsed 0.006 seconds.
Expected behavior
No response
Additional context
No response