Cosmos: single-property projection no longer implicitly filters undefined #38405
Copilot wants to merge 6 commits
…ned; update OwnedNavigations and OwnedQuery baselines Co-authored-by: AndriySvyryd <6539701+AndriySvyryd@users.noreply.github.com>
…le-property fix Co-authored-by: AndriySvyryd <6539701+AndriySvyryd@users.noreply.github.com>
// navigation or an EF.Functions.IsDefined check), the optimal VALUE projection is preserved.
if (clientProjection
    && _projection is [{ IsValueProjection: true, Expression: ScalarAccessExpression { Object: ObjectAccessExpression } scalarAccess } valueProjection]
    && !IsGuaranteedDefinedByPredicate(scalarAccess, Predicate))
@copilot Instead of checking this on the translated SqlExpression, do it earlier: when a projection of the matching shape is created, try to apply this optimization, and if the query matches, remove the filter predicates and mark the projection as scalar (meaning that it cannot be undefined, so it can be translated to VALUE ...).
To make sure I build the right thing in one pass, here's the plan I'd follow. Please confirm or correct:
- Location: move the logic out of ApplyProjection (postprocessor) into the translation phase where the single nested-scalar projection mapping is created (CosmosProjectionBindingExpressionVisitor/TranslateSelect), so the matching-shape detection happens when the projection is built rather than by re-walking the final Predicate.
- Optimization when the shape matches: when the projection is a single scalar nested in an embedded object (a ScalarAccessExpression whose Object is an ObjectAccessExpression) and the predicate guarantees that path is defined, drop the now-redundant definedness-guard conjuncts (IS_DEFINED(path) and ancestor != null along the path chain) from the predicate, and mark the projection scalar so it stays SELECT VALUE ....
- Default: otherwise the matching-shape projection becomes an object projection (SELECT c[...]) so undefined surfaces consistently (the main fix).
Two things I want to confirm:
- For guards that imply definedness but aren't pure null/IS_DEFINED checks and so can't be removed (e.g. ...["Country"] = "USA" in Navigation_rewrite_on_owned_reference_projecting_scalar): keep the predicate as-is but still mark the projection scalar (VALUE)? That preserves the current optimal baseline.
- "remove the filter predicates": drop only the guard conjunct(s) on the projected path/ancestors and keep any unrelated conjuncts, correct?

If that matches your intent I'll implement it; the two new tests' baselines will lose their WHERE (replaced by VALUE-only filtering).
@copilot For guards that imply definedness, mark the projection scalar for consistency, even though the result from the query would be the same either way.
Right, drop only the guard conjunct(s) on the projected path/ancestors and keep any unrelated conjuncts. Don't drop the guard conjunct(s) if they affect the unrelated conjuncts.
Done in the latest commit. The optimization is now applied when the single nested-scalar projection is created (SelectExpression.ApplyProjection): TryMakeNestedScalarProjectionDefined walks the top-level AND-conjuncts and, when the projected path is guaranteed defined, drops the redundant definedness guards (IS_DEFINED(path) / path != null on the scalar or any ancestor object) and keeps the SELECT VALUE projection. Value comparisons that merely imply definedness (e.g. ...["Country"] = "USA") and any unrelated conjuncts are retained; otherwise the projection is demoted to an object projection. The two guarded-projection baselines now have their WHERE dropped accordingly.
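For readers following the thread, the conjunct walk described in this comment can be sketched roughly as below. This is a standalone toy model, not the code in the PR: the Cond records, the GuardRewriter class, and the dotted-string paths are all hypothetical stand-ins for EF Core's expression types. It also assumes that a pure guard on the scalar or on its direct parent object, or a value comparison on the scalar itself, is what counts as "guaranteed defined"; the real helper may decide this differently, and the "don't drop guards that affect unrelated conjuncts" rule is not modeled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Models: Where(x => x.Associate != null && x.Associate.NestedAssociate != null)
//         .Select(x => x.Associate.NestedAssociate.Id)
var predicate = new And(new NotNull("Associate"), new NotNull("Associate.NestedAssociate"));
var (keepValue, remaining) = GuardRewriter.Rewrite(predicate, "Associate.NestedAssociate.Id");
Console.WriteLine($"{keepValue}, {remaining.Count}"); // True, 0

// Hypothetical mini-model of a predicate; these are NOT EF Core's expression types.
public abstract record Cond;
public record IsDefined(string Path) : Cond;                 // IS_DEFINED(path)
public record NotNull(string Path) : Cond;                   // path != null
public record EqualsValue(string Path, string Value) : Cond; // path = "value"
public record And(Cond Left, Cond Right) : Cond;

public static class GuardRewriter
{
    // Flattens nested ANDs into their top-level conjuncts.
    public static IEnumerable<Cond> Conjuncts(Cond c) =>
        c is And a ? Conjuncts(a.Left).Concat(Conjuncts(a.Right)) : new[] { c };

    // True when 'guard' is the projected path itself or one of its ancestors.
    static bool OnPath(string guard, string projected) =>
        projected == guard || projected.StartsWith(guard + ".", StringComparison.Ordinal);

    // Returns whether the VALUE projection can be kept, plus the conjuncts that remain.
    public static (bool KeepValue, List<Cond> Remaining) Rewrite(Cond predicate, string projected)
    {
        var conjuncts = Conjuncts(predicate).ToList();
        var dot = projected.LastIndexOf('.');
        var parent = dot < 0 ? "" : projected[..dot];

        // Assumption of this sketch: what counts as a definedness guarantee.
        bool guaranteed = conjuncts.Any(c => c switch
        {
            IsDefined d => d.Path == projected || d.Path == parent,
            NotNull n => n.Path == projected || n.Path == parent,
            EqualsValue e => e.Path == projected,
            _ => false
        });
        if (!guaranteed)
            return (false, conjuncts); // demote to an object projection, predicate untouched

        // Drop only pure guards on the projected path or its ancestors;
        // value comparisons and unrelated conjuncts are retained.
        var remaining = conjuncts.Where(c => c switch
        {
            IsDefined d => !OnPath(d.Path, projected),
            NotNull n => !OnPath(n.Path, projected),
            _ => true
        }).ToList();
        return (true, remaining);
    }
}
```

In this model, a predicate consisting only of new EqualsValue("A.B.C", "USA") with projected path "A.B.C" keeps the VALUE projection and leaves the comparison in place, which mirrors the Navigation_rewrite_on_owned_reference_projecting_scalar case discussed above.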
… definedness guards Co-authored-by: AndriySvyryd <6539701+AndriySvyryd@users.noreply.github.com>
The Cosmos provider translated single-property anonymous/DTO/scalar projections of a nested property as SELECT VALUE prop, which causes Cosmos to silently drop documents where the path is undefined (e.g. an optional navigation is null). Multi-property projections instead throw "Nullable object must have a value" for undefined non-nullable values. Behavior should be consistent based on projection syntax, not vary with the number of selected properties.

After this change both forms behave the same: undefined surfaces consistently (throws for non-nullable, yields null for nullable).

Changes

- SelectExpression.ApplyProjection(bool clientProjection = false): when the projection is a single nested-scalar VALUE projection (a ScalarAccessExpression whose Object is an ObjectAccessExpression), demote it to an object projection, emitting SELECT c["a"]["b"] instead of SELECT VALUE c["a"]["b"], so the document is retained and undefined reaches the shaper.
- CosmosQueryTranslationPostprocessor: passes clientProjection: true so the demotion applies only to the top-level projection. Subqueries and collection projections keep VALUE semantics, which their shaping depends on.
- Non-nested scalar projections (x.Name, where Object is an ObjectReferenceExpression) are intentionally left as VALUE projections; they cannot be undefined-by-nesting.
- The SELECT VALUE projection is preserved, and the now-redundant guards are removed, when the predicate already guarantees the projected path cannot be undefined. The optimization is applied when the single nested-scalar projection is created (in ApplyProjection): a new TryMakeNestedScalarProjectionDefined helper walks the top-level AND-conjuncts of the Where predicate and, when the projected path is guaranteed defined, keeps the VALUE projection and rewrites the predicate. Pure definedness guards (IS_DEFINED(path) or path != null) on the scalar or any ancestor object along its path are dropped, since the VALUE projection already filters those rows; value comparisons that merely imply definedness (e.g. ...["Country"] = "USA") and any unrelated conjuncts are kept, with the projection still marked scalar for consistency. Navigation null-checks (x.Associate.NestedAssociate != null) compare a StructuralTypeProjectionExpression whose .Object is the ObjectAccessExpression, so the helper unwraps it. As a result, guarded queries like Where(x => x.Associate != null && x.Associate.NestedAssociate != null).Select(x => x.Associate.NestedAssociate.Id) and the EF.Functions.IsDefined equivalent emit SELECT VALUE c["Associate"]["NestedAssociate"]["Id"] with the redundant WHERE removed.

Tests
Updated Cosmos baselines/overrides where the generated SQL drops VALUE and where the new behavior throws instead of filtering: ComplexProperties/OwnedNavigations projection suites, OwnedQuery, JsonQuery, ComplexType, and ComplexTypeToJsonProperty. The issue's single-member anonymous case is covered by Non_nullable_property_through_optional_navigation, which now throws InvalidOperationException matching the relational baseline.

Added Select_nested_scalar_guarded_by_navigation_predicate_uses_VALUE and Select_nested_scalar_guarded_by_IsDefined_uses_VALUE to the OwnedNavigations and ComplexProperties projection suites, asserting that guarded queries keep the optimal SELECT VALUE projection with the redundant definedness guards dropped from the WHERE. The Navigation_rewrite_on_owned_reference_projecting_scalar baseline (guarded by a = "USA" value comparison) keeps its SELECT VALUE projection and its WHERE unchanged.
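To make the behavior difference behind this change concrete, here is a minimal in-memory sketch of how a VALUE projection and an object projection treat an undefined path. It does not call Cosmos or EF Core: CosmosModel, SelectValue, and SelectObject are hypothetical names, documents are plain System.Text.Json nodes, and a missing property stands in for undefined (JSON null is not distinguished from missing in this model, whereas Cosmos does distinguish them).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json.Nodes;

var docs = new List<JsonNode>
{
    JsonNode.Parse("{\"id\":\"1\",\"a\":{\"b\":7}}")!,
    JsonNode.Parse("{\"id\":\"2\"}")!, // "a" is missing, so a.b is undefined
};

// Models SELECT VALUE c["a"]["b"]: the second document silently disappears.
Console.WriteLine(CosmosModel.SelectValue(docs, "a", "b").Count);  // 1

// Models SELECT c["a"]["b"]: both documents produce a row.
Console.WriteLine(CosmosModel.SelectObject(docs, "a", "b").Count); // 2

try
{
    // A shaper reading a non-nullable int from the undefined row fails loudly.
    _ = CosmosModel.SelectObject(docs, "a", "b").Select(row => row.Value).ToList();
}
catch (InvalidOperationException e)
{
    Console.WriteLine(e.Message);
}

// Hypothetical model of the two projection shapes; not part of EF Core.
public static class CosmosModel
{
    // Walks the path; returns null as soon as a segment is missing.
    static JsonNode? Resolve(JsonNode doc, string[] path)
    {
        JsonNode? current = doc;
        foreach (var segment in path)
            current = current is JsonObject obj ? obj[segment] : null;
        return current;
    }

    // VALUE projection: rows whose path is undefined are omitted.
    public static List<int> SelectValue(IEnumerable<JsonNode> docs, params string[] path) =>
        docs.Select(d => Resolve(d, path))
            .Where(v => v is not null)
            .Select(v => v!.GetValue<int>())
            .ToList();

    // Object projection: every document yields a row; undefined surfaces as null.
    public static List<int?> SelectObject(IEnumerable<JsonNode> docs, params string[] path) =>
        docs.Select(d => Resolve(d, path)?.GetValue<int>())
            .ToList();
}
```

Reading the rows as int? instead of int yields null for the second document rather than throwing, which corresponds to the "throws for non-nullable, yields null for nullable" behavior described above.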