fix subquery where exists distinct #3732
    .map_err(|e| context!("single expression projection required", e))?;
    let subqry_filter = Filter::try_from_plan(subqry_input)
        .map_err(|e| context!("cannot optimize non-correlated subquery", e))?;
    let subqry_filter = match subqry_input {
I don't understand why this fixes the error -- what was in the projection in the subquery that caused the problem?
I have been looking at this as well. The existing code works for a very simple projection but does not work if the projection is wrapped in any other operator, such as Distinct, Filter, Limit, Sort, and so on.
Although in this case it is now looking for a Projection wrapping a Filter and isn't looking for Distinct, so I am also confused.
I understand what is happening now and have some suggestions for improving this rule.
This line of code looks at inputs of the subquery and does not care what type of operator the subquery is. Previously this was assumed to be a Projection but now it could be a Projection or a Distinct, or something else ... I think we should add some pattern matching here.
    let subqry_inputs = query_info.query.subquery.inputs();
We are then matching on this input. Previously we expected a Filter, but now it could be a Projection containing a Filter, because the root Distinct shifts everything down by one level.
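To make the "shifted down by one" point concrete, here is a minimal sketch using a toy plan type. The `Plan` enum, `kind`, `first_input_kind`, and the two builder functions are hypothetical stand-ins invented for illustration; they are not DataFusion's `LogicalPlan` API.

```rust
// Toy stand-in for a logical plan tree; NOT DataFusion's real `LogicalPlan`.
#[derive(Debug, PartialEq)]
enum Plan {
    Scan,
    Filter(Box<Plan>),
    Projection(Box<Plan>),
    Distinct(Box<Plan>),
}

impl Plan {
    /// Direct children of this node, regardless of the node's own type.
    fn inputs(&self) -> Vec<&Plan> {
        match self {
            Plan::Scan => vec![],
            Plan::Filter(input) | Plan::Projection(input) | Plan::Distinct(input) => {
                vec![input.as_ref()]
            }
        }
    }

    /// Name of the operator at this node.
    fn kind(&self) -> &'static str {
        match self {
            Plan::Scan => "Scan",
            Plan::Filter(_) => "Filter",
            Plan::Projection(_) => "Projection",
            Plan::Distinct(_) => "Distinct",
        }
    }
}

/// Mirrors the pattern under discussion: look only at the root's first
/// input, without checking what kind of operator the root itself is.
fn first_input_kind(root: &Plan) -> Option<&'static str> {
    root.inputs().first().map(|p| p.kind())
}

/// Shape of a plain `EXISTS (SELECT ... WHERE ...)` subquery.
fn plain_subquery() -> Plan {
    Plan::Projection(Box::new(Plan::Filter(Box::new(Plan::Scan))))
}

/// Shape of an `EXISTS (SELECT DISTINCT ... WHERE ...)` subquery.
fn distinct_subquery() -> Plan {
    Plan::Distinct(Box::new(plain_subquery()))
}
```

With these shapes, `first_input_kind(&plain_subquery())` yields the Filter, while `first_input_kind(&distinct_subquery())` yields the Projection, so code that assumes the first input is a Filter sees the wrong node once a Distinct sits at the root.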
I would like to see something with explicit pattern matching to make sure we are only supporting intended cases. Here is my attempt:
    fn optimize_exists(
        query_info: &SubqueryInfo,
        outer_input: &LogicalPlan,
        outer_other_exprs: &[Expr],
    ) -> datafusion_common::Result<LogicalPlan> {
        let subqry_filter = match query_info.query.subquery.as_ref() {
            LogicalPlan::Distinct(subqry_distinct) => match subqry_distinct.input.as_ref() {
                LogicalPlan::Projection(subqry_proj) => Filter::try_from_plan(&*subqry_proj.input),
                _ => Err(DataFusionError::NotImplemented("todo: error message".to_string())),
            },
            LogicalPlan::Projection(subqry_proj) => Filter::try_from_plan(&*subqry_proj.input),
            _ => Err(DataFusionError::NotImplemented("todo: error message".to_string())),
        }
        .map_err(|e| context!("cannot optimize non-correlated subquery", e))?;
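One way to avoid repeating the Projection arm is to peel off an optional root Distinct first and then match the remaining shape. The sketch below shows that idea on a toy plan type, accepting the same two shapes as the suggestion above. The `Plan` enum, `subquery_filter_predicate`, `correlated_filter`, and the predicate string are hypothetical and are not part of DataFusion.

```rust
// Toy stand-in for a logical plan tree; NOT DataFusion's real `LogicalPlan`.
#[allow(dead_code)] // the Filter's `input` field is only there to shape the tree
#[derive(Debug, PartialEq)]
enum Plan {
    Scan,
    Filter { predicate: &'static str, input: Box<Plan> },
    Projection(Box<Plan>),
    Distinct(Box<Plan>),
    Limit(Box<Plan>),
}

/// Returns the predicate of the Filter directly below the subquery's
/// Projection. Accepts exactly two shapes and rejects everything else:
///   Projection -> Filter
///   Distinct -> Projection -> Filter
fn subquery_filter_predicate(subquery: &Plan) -> Result<&'static str, String> {
    // Step 1: peel off at most one root Distinct.
    let below_distinct = match subquery {
        Plan::Distinct(input) => input.as_ref(),
        other => other,
    };

    // Step 2: the next node must be a Projection.
    let proj_input = match below_distinct {
        Plan::Projection(input) => input.as_ref(),
        other => return Err(format!("expected a Projection, found {:?}", other)),
    };

    // Step 3: the Projection's input must be a Filter.
    match proj_input {
        Plan::Filter { predicate, .. } => Ok(*predicate),
        other => Err(format!(
            "expected a Filter below the Projection, found {:?}",
            other
        )),
    }
}

/// Helper that builds a Filter over a Scan with a made-up join predicate.
fn correlated_filter() -> Plan {
    Plan::Filter {
        predicate: "t1.id = t2.id",
        input: Box::new(Plan::Scan),
    }
}
```

Any other wrapper at the root, such as the toy `Limit`, falls through to an explicit error instead of being silently treated as a Projection, which is the point of matching on the operator type.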
done, thanks for your advice. @andygrove
labeler CI failure is unrelated #3743
Benchmark runs are scheduled for baseline = de9c7c5 and contender = 1e1de82. 1e1de82 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Which issue does this PR close?
Closes #3724
Rationale for this change
What changes are included in this PR?
If the plan is Distinct, get the Filter from Projection
Are there any user-facing changes?