@askalt
Contributor

@askalt askalt commented Jan 26, 2026

Which issue does this PR close?

Rationale for this change

The rule's logic is not idempotent: applying the rule several times produces the artifacts described in the issue above.

What changes are included in this PR?

  • Pass previously pushed filters to the supports_filters_pushdown(...) call.
  • Extend the scan projection if some previously pushed filters become unsupported.

Are these changes tested?

There are unit tests.

@github-actions github-actions bot added optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt) labels Jan 26, 2026
Part of apache#19929

Let "t" be a table provider that can apply any single filter exactly,
but not a conjunction of filters. Consider the following optimizer pipeline:

1. Try to push `a = 1, b = 1`.
   `supports_filters_pushdown` returns [Exact, Inexact]
   OK: the optimizer records that a = 1 is pushed and creates a filter node for b = 1.

...
Another optimization iteration.

2. Try to push `b = 1`.
   `supports_filters_pushdown` returns [Exact]. The table provider cannot
   remember the filters pushed in earlier calls, so it has no choice but to answer `Exact`.
   Now the optimizer believes the conjunction a = 1 AND b = 1 is supported exactly, but
   it is not.

To prevent this problem, this patch passes filters that were already pushed into the scan
earlier to `supports_filters_pushdown`.
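
The scenario above can be sketched with a toy model. This is not DataFusion code: the `PushDown` enum stands in for `TableProviderFilterPushDown`, filters are plain strings, and the convention that already-pushed filters come first in the argument list is an assumption made only for this illustration.

```rust
/// Toy stand-in for DataFusion's `TableProviderFilterPushDown` (assumption:
/// only the two variants used in the scenario are modelled).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PushDown {
    Exact,
    Inexact,
}

/// Hypothetical provider "t": it can apply one filter exactly, but never a
/// conjunction, so only the first filter it sees is reported as `Exact`.
fn supports_filters_pushdown(filters: &[&str]) -> Vec<PushDown> {
    filters
        .iter()
        .enumerate()
        .map(|(i, _)| if i == 0 { PushDown::Exact } else { PushDown::Inexact })
        .collect()
}

/// Behaviour described as problematic: the provider sees only the new
/// candidates and knows nothing about filters pushed in earlier iterations.
fn push_without_history(candidates: &[&str]) -> Vec<PushDown> {
    supports_filters_pushdown(candidates)
}

/// Behaviour described by the patch: previously pushed filters are passed
/// along with the candidates. Placing them first and slicing the answers is
/// a choice made for this sketch, not necessarily what the patch does.
fn push_with_history<'a>(pushed: &[&'a str], candidates: &[&'a str]) -> Vec<PushDown> {
    let mut all: Vec<&str> = pushed.to_vec();
    all.extend_from_slice(candidates);
    // Keep only the answers that correspond to the new candidates.
    supports_filters_pushdown(&all)[pushed.len()..].to_vec()
}
```

In this model, calling `push_without_history(&["b = 1"])` on the second iteration yields `Exact`, which reproduces the wrong conclusion, while `push_with_history(&["a = 1"], &["b = 1"])` yields `Inexact`, so the filter node for `b = 1` is kept.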
@askalt askalt force-pushed the askalt/filter-push-down-improve branch from 236ef40 to 6697447 Compare January 26, 2026 08:07
@askalt
Contributor Author

askalt commented Jan 26, 2026

Will fix CI soon.

Consider the following optimizer-run scenario:

1. `supports_filters_pushdown` returns `Exact` for some filter, e.g. "a = 1",
   where the column "a" is not required by the query projection.

2. "a" is removed from the table provider projection by the "optimize projection"
   rule.

3. `supports_filters_pushdown` changes its decision and returns `Inexact` for
   this filter the next time, e.g. because the input filters have changed and
   it prefers to use a new one.

4. "a" is not added back to the table provider projection, which leads to a
   filter that references a column that is not part of its input schema.

This patch fixes the issue by introducing the following logic within the filter push-down rule:

1. Collect columns that are not used in the current table provider scan projection,
   but required for filter expressions. Call it `additional_projection`.

2. If `additional_projection` is empty, keep the logic as it was before the patch.

3. Otherwise, extend the table provider projection and wrap the plan in
   an additional projection node to preserve the schema the plan had before the rule ran.
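
The three steps above can be sketched roughly as follows. This is a simplified model, not the patch itself: columns are represented by their indices in the table schema, and `Rewritten`, `extend_projection`, and the field names are hypothetical names introduced only for this illustration. Only `additional_projection` is a name taken from the description.

```rust
use std::collections::BTreeSet;

/// Step 1: columns required by the filter expressions but absent from the
/// current scan projection (returned in ascending index order).
fn additional_projection(projection: &[usize], filter_columns: &[usize]) -> Vec<usize> {
    let projected: BTreeSet<usize> = projection.iter().copied().collect();
    let missing: BTreeSet<usize> = filter_columns
        .iter()
        .copied()
        .filter(|c| !projected.contains(c))
        .collect();
    missing.into_iter().collect()
}

/// Hypothetical result type: the projection handed to the scan, plus the
/// columns an outer projection node must keep (if one is needed).
#[derive(Debug, PartialEq)]
struct Rewritten {
    scan_projection: Vec<usize>,
    outer_projection: Option<Vec<usize>>,
}

/// Steps 2 and 3: leave the plan alone when nothing is missing; otherwise
/// widen the scan and remember the original columns so that a wrapping
/// projection can restore the schema seen by the rest of the plan.
fn extend_projection(projection: &[usize], filter_columns: &[usize]) -> Rewritten {
    let extra = additional_projection(projection, filter_columns);
    if extra.is_empty() {
        return Rewritten {
            scan_projection: projection.to_vec(),
            outer_projection: None,
        };
    }
    let mut scan_projection = projection.to_vec();
    scan_projection.extend(extra);
    Rewritten {
        scan_projection,
        outer_projection: Some(projection.to_vec()),
    }
}
```

For example, with a scan projection of columns `[1, 2]` and a filter that references columns `[0, 2]`, the scan is widened to `[1, 2, 0]` and the outer projection keeps `[1, 2]`, so nodes above the scan still see the schema they saw before the rule ran.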
@askalt askalt force-pushed the askalt/filter-push-down-improve branch from 6697447 to b3c756b Compare January 26, 2026 17:59
Development

Successfully merging this pull request may close these issues.

Improve filter push-down