[SPARK-56949][SQL] DecimalAggregates drops evalMode when rewriting Sum/Average#56009
[SPARK-56949][SQL] DecimalAggregates drops evalMode when rewriting Sum/Average#56009yaooqinn wants to merge 1 commit into
Conversation
| } | ||
| case ae @ AggregateExpression(af, _, _, _, _) => af match { | ||
| case Sum(WidenedDecimalChild(inner, p, pPrime, s), _) | ||
| case s @ Sum(WidenedDecimalChild(inner, p, pPrime, s_scale), _) |
There was a problem hiding this comment.
Does this bug affect the above case Sum(e @ DecimalExpression(prec, scale), _) and case Average(e @ DecimalExpression(prec, scale), _) branches?
|
@yaooqinn, I think the issue is real, but is the example in description valid? |
02cc498 to
b95abfb
Compare
|
Thanks @peter-toth. Two updates:
Force-pushed to b95abfb. |
…m/Average ### What changes were proposed in this pull request? In `Optimizer.DecimalAggregates`, rewrite the 6 `Sum` / `Average` rewrites from the single-arg helper ctor (`Sum(child)` / `Average(child)`) to `original.copy(child = UnscaledValue(...))`, so sibling fields like `evalMode` are preserved instead of re-read from `SQLConf`. The 6 sites are 4 Aggregate arms (SUM/AVG x un-widened/widened peel) plus 2 Window arms (un-widened SUM and AVG); both Window arms are directly reachable because the input is a bare `AttributeReference`, not a hoisted composite Cast. The helper ctors themselves are retained — used by other call sites. ### Why are the changes needed? Latent bug since SPARK-38432 added `evalMode` to `Sum` / `Average`: the helper ctor re-reads `EvalMode` from `SQLConf`, silently dropping `EvalMode.TRY` from `try_sum` / `try_avg`. `Sum.shouldTrackIsEmpty = evalMode == TRY` then drops the `isEmpty` aggregation buffer column, so overflow no longer returns NULL. ### Does this PR introduce _any_ user-facing change? Yes. `try_sum` / `try_avg` on narrow-precision decimals now preserve `EvalMode.TRY` through the optimizer rewrite, so overflow returns NULL as documented instead of wrapping (`ansi=false`) or throwing (`ansi=true`). Affected surface: SUM `precision <= 8`, AVG `precision <= 7` (both Aggregate and Window arms). This is observable directly in the optimized plan — the rewritten Sum / Average node carries `evalMode = TRY` post-fix vs `evalMode = ANSI` (the SQLConf default, re-read by the helper ctor) pre-fix. ### How was this patch tested? 6 new unit tests in `DecimalAggregatesSuite` asserting `evalMode == TRY` post-rewrite for: un-widened Aggregate SUM, un-widened Aggregate AVG, widened-cast Aggregate SUM, widened-cast Aggregate AVG, un-widened Window SUM, un-widened Window AVG. - Pre-fix: 6/6 FAIL with `Expected evalMode=TRY post-rewrite, got ANSI` - Post-fix: 37/37 PASS (6 new + 31 existing, 0 regression) - Sibling regressions clean: `DecimalPrecisionSuite` + `AggregateOptimizeSuite` 20/20 PASS - `dev/lint-scala` clean **Backport candidates**: branch-3.5 and branch-4.0 carry the 4 helper-ctor sites (2 Aggregate + 2 Window) — verified by `git grep "aggregateFunction = (Sum|Average)\(UnscaledValue"`. The 2 widened-cast Aggregate sites are master-only (SPARK-56627). Backport patches would touch only the 4 shared sites. ### Was this patch authored or co-authored using generative AI tooling? Generated-by: Claude Opus 4.7 Closes #56009 from yaooqinn/users/kentyao/decimal-aggregate-evalmode-preserve. Authored-by: Kent Yao <yao@apache.org> Signed-off-by: Kent Yao <kentyao@microsoft.com> (cherry picked from commit 9732587) Signed-off-by: Kent Yao <kentyao@microsoft.com>
|
Merged to master and 4.x, thank you @peter-toth |
What changes were proposed in this pull request?
In
Optimizer.DecimalAggregates, rewrite the 6Sum/Averagerewritesfrom the single-arg helper ctor (
Sum(child)/Average(child)) tooriginal.copy(child = UnscaledValue(...)), so sibling fields likeevalModeare preserved instead of re-read fromSQLConf. The 6 sitesare 4 Aggregate arms (SUM/AVG x un-widened/widened peel) plus 2 Window
arms (un-widened SUM and AVG); both Window arms are directly reachable
because the input is a bare
AttributeReference, not a hoistedcomposite Cast.
The helper ctors themselves are retained — used by other call sites.
Why are the changes needed?
Latent bug since SPARK-38432 added
evalModetoSum/Average: thehelper ctor re-reads
EvalModefromSQLConf, silently droppingEvalMode.TRYfromtry_sum/try_avg.Sum.shouldTrackIsEmpty = evalMode == TRYthen drops theisEmptyaggregation buffer column, sooverflow no longer returns NULL.
Does this PR introduce any user-facing change?
Yes.
try_sum/try_avgon narrow-precision decimals now preserveEvalMode.TRYthrough the optimizer rewrite, so overflow returns NULLas documented instead of wrapping (
ansi=false) or throwing(
ansi=true). Affected surface: SUMprecision <= 8, AVGprecision <= 7(both Aggregate and Window arms).
This is observable directly in the optimized plan — the rewritten Sum /
Average node carries
evalMode = TRYpost-fix vsevalMode = ANSI(theSQLConf default, re-read by the helper ctor) pre-fix.
How was this patch tested?
6 new unit tests in
DecimalAggregatesSuiteassertingevalMode == TRYpost-rewrite for: un-widened Aggregate SUM, un-widened Aggregate AVG,
widened-cast Aggregate SUM, widened-cast Aggregate AVG, un-widened
Window SUM, un-widened Window AVG.
Expected evalMode=TRY post-rewrite, got ANSIDecimalPrecisionSuite+AggregateOptimizeSuite20/20 PASSdev/lint-scalacleanBackport candidates: branch-3.5 and branch-4.0 carry the 4 helper-ctor
sites (2 Aggregate + 2 Window) — verified by
git grep "aggregateFunction = (Sum|Average)\(UnscaledValue". The 2 widened-cast Aggregate sites aremaster-only (SPARK-56627). Backport patches would touch only the 4 shared
sites.
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Opus 4.7