Fix several LINQ gaps: DateOnly/TimeOnly projection, SelectMany Distinct().Count(), and GroupJoin Distinct/scalar/aggregates#4677
Closed
haefele-octoja wants to merge 5 commits into
Conversation
…tion Query<T>().Select(x => x.SomeDateOnly) -- and the TimeOnly and nullable variants -- threw System.Text.Json "'-' is an invalid end of a number": SelectorVisitor.ToScalar omitted DateOnly/TimeOnly from the scalar whitelist, so the projection fell to DataSelectClause and JSON-deserialized the quote-stripped `data ->> 'x'` text. Route them through NewScalarSelectClause like DateTime/DateTimeOffset so the native date/time TypedLocator (mt_immutable_date()/mt_immutable_time()) is read directly. MemberType is already the non-nullable underlying type, so DateOnly?/TimeOnly? are covered. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Query<T>().SelectMany(x => x.Children).Select(c => c.Value).Distinct().Count() threw NotSupportedException "The database operator 'DISTINCT' cannot be used with non-simple types" even though the equivalent Distinct().ToList() worked. BuildSelectManyStatement re-applied DISTINCT to the count clause that ProcessSingleValueModeIfAny had already produced, and the Inner-merged IsDistinct flag never reached the count path -- so when the throw was avoided a plain non-distinct count was produced instead. Skip the tail DISTINCT operator for the Count/LongCount modes (already handled as count(*) over a DISTINCT CTE) and re-sync IsDistinct onto the statement for those modes so the distinct projection is the thing being counted. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
GroupJoin(...).SelectMany(...) had two gaps:
- Distinct() was silently dropped over the join. JoinSelectClause was neither
IScalarSelectClause nor ICountClause, so neither distinct branch in
ProcessSingleValueModeIfAny matched: Distinct().Count() returned the full joined
row count, and Distinct().ToList() returned non-distinct rows.
- A bare scalar result selector ((x, t) => x.l.Id) built an empty NewObject and threw
"Sequence contains no elements" at materialization -- JoinSelectParser only handled
new {...} / MemberInit object projections.
Fix:
- JoinSelectClause implements IScalarSelectClause and renders DISTINCT(<projection>)
when the DISTINCT operator is applied, reusing the standard distinct / count-over-CTE
machinery for both materialized Distinct() and Distinct().Count()/LongCount().
- CompileGroupJoin transfers IsDistinct from the SelectMany usage chain onto the join
statement before ProcessSingleValueModeIfAny.
- JoinSelectParser renders a bare scalar result selector as to_jsonb(<scalar>) so the
existing SerializationSelector<T> deserializes it and DISTINCT applies unchanged.
- The join selector returns default for a null "data" column (a bare scalar whose value
is null, e.g. a left join projecting an inner member of an unmatched row) rather than
throwing, matching Marten's scalar-select convention.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Coverage review additions (tests only): - GroupJoin: dedup correctness (a multi-field object Distinct() must keep rows that differ in one field -- not over-merge); bare scalar projections across string, decimal, enum and DateOnly (to_jsonb round-trip); and a clear-error test for a non-translatable computed scalar body (BadLinqExpressionException). - DateOnly/TimeOnly projection: a Newtonsoft serializer variant (the fix is serializer-agnostic) and Distinct() over the projected value. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A bare scalar projection over a join -- (x, c) => c.Amount -- could already be materialized, distinct'd and counted, but Sum/Min/Max/Average threw NotSupportedException: the scalar is rendered as to_jsonb(...) so it can be deserialized/deduped, and Postgres has no sum/min/max/avg aggregate for jsonb. Render the raw (un-wrapped) scalar into the join CTE and put a standard scalar select clause (NewScalarSelectClause / NewScalarStringSelectClause) over it, so the existing aggregate machinery handles everything -- result types (SUM(int)->bigint, AVG->double via CloneToDouble), enums, nullable scalars (the clause is built over the underlying type and also implements ISelector<T?>), and null inner rows from a left join (SUM ignores them). Object-projection aggregates remain unsupported (there is no single column to aggregate) and throw a clear error pointing at the bare-scalar shape. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
jeremydmiller
added a commit
that referenced
this pull request
Jun 8, 2026
…ct/Where Addresses the two "Known limitations" that PR #4677 explicitly left open: 1. **Aggregates over an *object* projection.** `.SelectMany(.., (x, c) => new {..}).Sum(z => z.X)` (which QueryableExtensions lowers to `.Select(z => z.X).SumAsync()`) used to hit `JoinSelectClause.ApplyOperator("SUM")` on the object projection and throw. 2. **`.Select(...)` / `.Where(...)` *after* the join's `.SelectMany(...)`.** `CompileGroupJoin` ignored both: a post-SelectMany Select returned the original anon-typed rows and a post-SelectMany Where was silently dropped. The fix is a single seam in `CompileGroupJoin`: a new `AnonProjectionExpander` visitor walks the FlattenedResultSelector's `(x, c) => new { Member = source }` bindings and rewrites a `z.Member` access (on a parameter of the projection's anon type) back to its source expression in (x, c) terms. With the expander in hand `CompileGroupJoin` reduces a post-SelectMany Select into a synthesized effective result selector that `JoinSelectParser` then renders normally — including lighting up the bare-scalar aggregate path that already supports Sum/Min/Max/Average. Post-SelectMany Where filters get expanded the same way and routed onto the appropriate CTE's existing Where pipeline (inner-side onto `InnerCollectionUsage.WhereExpressions` before its `ParseWhereClause` runs, outer-side onto `outerStatement.ParseWhereClause` after the fact with `storage=null` so we don't double-apply the tenant / soft-delete defaults). Cross-side Where filters (touching both x.* and c.*) are pinned as a clear `BadLinqExpressionException` rather than silently mis-translated — that's a join-level WHERE clause feature for another PR if anyone hits it. Computed expressions on the projected members (`z.Amount * 2`) also stay pinned: that's the existing "computed scalar projection" limitation the upstream PR already locked in. 11 new tests in `Bug_4677_groupjoin_post_selectmany.cs` cover: * object-projection Sum / Min / Max / Average over inner and outer members, decimal scalars, and left-join with NULL inner rows (SQL SUM ignores NULL); * post-SelectMany Select reducing to a scalar member, and reshuffling to a new anonymous type (without compute); * post-SelectMany Where on the inner side, the outer side, and chained with Sum; * the both-sides Where clear-error pin. Existing 53 tests across `Bug_groupjoin_distinct_and_scalar_projection`, `Bug_distinct_count_over_selectmany`, `Bug_dateonly_timeonly_scalar_projection`, and the `group_join_operator` family stay green. Full `LinqTests` suite passes on both net9.0 and net10.0 (1312 / 1313 — 1 unrelated pre-existing skip). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Member
|
This was superseded by #4677, but it's in. Thank you @haefele-octoja! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This fixes a few related LINQ-translation gaps I ran into while using Marten, each with a focused regression test. They're independent, so I'm very happy to split this into separate PRs, rework it in whatever direction you'd prefer, or drop it entirely if any of it is the wrong approach — no attachment to the implementation, I just wanted to surface the bugs with repros.
The changes are ~220 lines across 4 source files in
src/Marten/Linq, plus 32 tests in threeBug_*files.What's fixed
1.
DateOnly/TimeOnlyscalar projectionthrew
System.Text.Json.JsonException: '-' is an invalid end of a numberon non-empty results.SelectorVisitor.ToScalardidn't listDateOnly/TimeOnlyamong the scalar types, so the projection fell through toDataSelectClause, which selects the quote-strippeddata ->> 'x'text and JSON-deserializes it. Fix: add them to the scalar branch (mirroringDateTime) so the value is read from the member's nativemt_immutable_date()/mt_immutable_time()locator. Nullable variants covered.2.
Count()/LongCount()over aDistinct()of aSelectMany()projectionthrew
The database operator 'DISTINCT' cannot be used with non-simple typeseven though.Distinct().ToList()returned the correct distinct set.BuildSelectManyStatementre-appliedDISTINCTto the count clauseProcessSingleValueModeIfAnyhad already produced, and theInner-mergedIsDistinctnever reached the count path. Fix: skip the tailDISTINCTforCount/LongCountand re-syncIsDistinctfor those modes.3.
GroupJoin(...).SelectMany(...)—Distinct, scalar projections, and aggregatesAll of the following used to throw or return wrong results, and now work:
The three underlying issues:
Distinct()was silently dropped.JoinSelectClausewas neitherIScalarSelectClausenorICountClause, so neither distinct branch inProcessSingleValueModeIfAnymatched:Distinct().Count()returned the full joined row count, andDistinct().ToList()returned non-distinct rows. Fix:JoinSelectClauseimplementsIScalarSelectClause(rendersDISTINCT(<projection>)), andCompileGroupJointransfersIsDistinctfrom the SelectMany usage. Reuses the existing distinct / count-over-CTE machinery.(x, o) => o.Amountbuilt an emptyNewObject→Sequence contains no elementsat materialization (JoinSelectParseronly handlednew {...}projections). Fix: render a bare scalar asto_jsonb(<scalar>)so the existingSerializationSelector<T>deserializes it; the join selector also returnsdefaultfor a nulldatacolumn (a left-join unmatched inner row) instead of throwing.Sum/Min/Max/Averageover a scalar projection threw. The scalar is rendered asto_jsonb(...)and Postgres has nosum/min/max/avgforjsonb. Fix: for an aggregate, render the raw scalar into the join CTE and put a standardNewScalarSelectClauseover it, so the existing aggregate machinery handles result types (SUM(int)->bigint,AVG->double), enums, nullable scalars, and left-join nulls.Known limitations (left in place, fail with a clear error — not silently)
.SelectMany(... => new {...}).Sum(z => z.Amount)) — there's no single column to aggregate. Workaround: project the scalar directly in the result selector..Select()/.Where()after the join'sSelectMany—CompileGroupJoinignores a post-SelectManySelectExpression. Supporting it would need a synthesized member collection mapping the anonymous result type's members onto the join's jsonbdatacolumn; I judged that a separate, larger feature and left it as a documented limitation (workaround: put the projection in theSelectManyresult selector). Happy to take a run at it if you think it's worthwhile.Testing
Bug_dateonly_timeonly_scalar_projection,Bug_distinct_count_over_selectmany, andBug_groupjoin_distinct_and_scalar_projection. Each headline bug was verified red→green (fails without the fix, passes with it).LinqTestssuite passes on both net9.0 and net10.0 (1301 passing), including all existinggroup_join_operatortests — no regressions.Distinctfor silent-wrong-result shapes (multi-field dedup correctness, type round-trips under both System.Text.Json and Newtonsoft, left-join nulls) — none found.Notes for reviewers
These are independent fixes; I bundled them only because I found them together. Please tell me if you'd rather I split them into separate PRs (the DateOnly/TimeOnly and SelectMany-
Distinct().Count()ones are small and standalone; the GroupJoin change is the larger one), rework any of them, or drop them. I'll happily adapt.🤖 Generated with Claude Code