Skip to content

[SPARK-56949][SQL] DecimalAggregates drops evalMode when rewriting Sum/Average#56009

Closed
yaooqinn wants to merge 1 commit into
apache:masterfrom
yaooqinn:users/kentyao/decimal-aggregate-evalmode-preserve
Closed

[SPARK-56949][SQL] DecimalAggregates drops evalMode when rewriting Sum/Average#56009
yaooqinn wants to merge 1 commit into
apache:masterfrom
yaooqinn:users/kentyao/decimal-aggregate-evalmode-preserve

Conversation

@yaooqinn
Copy link
Copy Markdown
Member

@yaooqinn yaooqinn commented May 20, 2026

What changes were proposed in this pull request?

In Optimizer.DecimalAggregates, rewrite the 6 Sum / Average rewrites
from the single-arg helper ctor (Sum(child) / Average(child)) to
original.copy(child = UnscaledValue(...)), so sibling fields like
evalMode are preserved instead of re-read from SQLConf. The 6 sites
are 4 Aggregate arms (SUM/AVG x un-widened/widened peel) plus 2 Window
arms (un-widened SUM and AVG); both Window arms are directly reachable
because the input is a bare AttributeReference, not a hoisted
composite Cast.

The helper ctors themselves are retained — used by other call sites.

Why are the changes needed?

Latent bug since SPARK-38432 added evalMode to Sum / Average: the
helper ctor re-reads EvalMode from SQLConf, silently dropping
EvalMode.TRY from try_sum / try_avg. Sum.shouldTrackIsEmpty = evalMode == TRY then drops the isEmpty aggregation buffer column, so
overflow no longer returns NULL.

Does this PR introduce any user-facing change?

Yes. try_sum / try_avg on narrow-precision decimals now preserve
EvalMode.TRY through the optimizer rewrite, so overflow returns NULL
as documented instead of wrapping (ansi=false) or throwing
(ansi=true). Affected surface: SUM precision <= 8, AVG precision <= 7
(both Aggregate and Window arms).

This is observable directly in the optimized plan — the rewritten Sum /
Average node carries evalMode = TRY post-fix vs evalMode = ANSI (the
SQLConf default, re-read by the helper ctor) pre-fix.

How was this patch tested?

6 new unit tests in DecimalAggregatesSuite asserting evalMode == TRY
post-rewrite for: un-widened Aggregate SUM, un-widened Aggregate AVG,
widened-cast Aggregate SUM, widened-cast Aggregate AVG, un-widened
Window SUM, un-widened Window AVG.

  • Pre-fix: 6/6 FAIL with Expected evalMode=TRY post-rewrite, got ANSI
  • Post-fix: 37/37 PASS (6 new + 31 existing, 0 regression)
  • Sibling regressions clean: DecimalPrecisionSuite +
    AggregateOptimizeSuite 20/20 PASS
  • dev/lint-scala clean

Backport candidates: branch-3.5 and branch-4.0 carry the 4 helper-ctor
sites (2 Aggregate + 2 Window) — verified by git grep "aggregateFunction = (Sum|Average)\(UnscaledValue". The 2 widened-cast Aggregate sites are
master-only (SPARK-56627). Backport patches would touch only the 4 shared
sites.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.7

}
case ae @ AggregateExpression(af, _, _, _, _) => af match {
case Sum(WidenedDecimalChild(inner, p, pPrime, s), _)
case s @ Sum(WidenedDecimalChild(inner, p, pPrime, s_scale), _)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this bug affect the above case Sum(e @ DecimalExpression(prec, scale), _) and case Average(e @ DecimalExpression(prec, scale), _) branches?

@peter-toth
Copy link
Copy Markdown
Contributor

@yaooqinn, I think the issue is real, but is the example in description valid?

scala> sql("SELECT try_sum(cast(id as decimal(7,2))) FROM range(10)").show
+---------------------------------+
|try_sum(CAST(id AS DECIMAL(7,2)))|
+---------------------------------+
|                            45.00|
+---------------------------------+

@yaooqinn yaooqinn force-pushed the users/kentyao/decimal-aggregate-evalmode-preserve branch from 02cc498 to b95abfb Compare May 20, 2026 12:23
@yaooqinn
Copy link
Copy Markdown
Member Author

yaooqinn commented May 20, 2026

Thanks @peter-toth. Two updates:

  1. Description: dropped the misleading 10-row overflow example; rewrote
    to frame the change as evalMode preserved through the rewrite
    (observable on the optimized plan).

  2. Scope: expanded 4 → 6 sites. The un-widened Window arms at
    L2589 / L2594 carry the same helper-ctor bug and are directly
    reachable (bare AttributeReference, no ExtractWindowExpressions hoist).
    Added 2 tests; pre-fix both fail with got ANSI, post-fix 37/37 pass.

Force-pushed to b95abfb.

@yaooqinn yaooqinn closed this in 9732587 May 20, 2026
yaooqinn added a commit that referenced this pull request May 20, 2026
…m/Average

### What changes were proposed in this pull request?

In `Optimizer.DecimalAggregates`, rewrite the 6 `Sum` / `Average` rewrites
from the single-arg helper ctor (`Sum(child)` / `Average(child)`) to
`original.copy(child = UnscaledValue(...))`, so sibling fields like
`evalMode` are preserved instead of re-read from `SQLConf`. The 6 sites
are 4 Aggregate arms (SUM/AVG x un-widened/widened peel) plus 2 Window
arms (un-widened SUM and AVG); both Window arms are directly reachable
because the input is a bare `AttributeReference`, not a hoisted
composite Cast.

The helper ctors themselves are retained — used by other call sites.

### Why are the changes needed?

Latent bug since SPARK-38432 added `evalMode` to `Sum` / `Average`: the
helper ctor re-reads `EvalMode` from `SQLConf`, silently dropping
`EvalMode.TRY` from `try_sum` / `try_avg`. `Sum.shouldTrackIsEmpty =
evalMode == TRY` then drops the `isEmpty` aggregation buffer column, so
overflow no longer returns NULL.

### Does this PR introduce _any_ user-facing change?

Yes. `try_sum` / `try_avg` on narrow-precision decimals now preserve
`EvalMode.TRY` through the optimizer rewrite, so overflow returns NULL
as documented instead of wrapping (`ansi=false`) or throwing
(`ansi=true`). Affected surface: SUM `precision <= 8`, AVG `precision <= 7`
(both Aggregate and Window arms).

This is observable directly in the optimized plan — the rewritten Sum /
Average node carries `evalMode = TRY` post-fix vs `evalMode = ANSI` (the
SQLConf default, re-read by the helper ctor) pre-fix.

### How was this patch tested?

6 new unit tests in `DecimalAggregatesSuite` asserting `evalMode == TRY`
post-rewrite for: un-widened Aggregate SUM, un-widened Aggregate AVG,
widened-cast Aggregate SUM, widened-cast Aggregate AVG, un-widened
Window SUM, un-widened Window AVG.

- Pre-fix: 6/6 FAIL with `Expected evalMode=TRY post-rewrite, got ANSI`
- Post-fix: 37/37 PASS (6 new + 31 existing, 0 regression)
- Sibling regressions clean: `DecimalPrecisionSuite` +
  `AggregateOptimizeSuite` 20/20 PASS
- `dev/lint-scala` clean

**Backport candidates**: branch-3.5 and branch-4.0 carry the 4 helper-ctor
sites (2 Aggregate + 2 Window) — verified by `git grep "aggregateFunction =
(Sum|Average)\(UnscaledValue"`. The 2 widened-cast Aggregate sites are
master-only (SPARK-56627). Backport patches would touch only the 4 shared
sites.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.7

Closes #56009 from yaooqinn/users/kentyao/decimal-aggregate-evalmode-preserve.

Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Kent Yao <kentyao@microsoft.com>
(cherry picked from commit 9732587)
Signed-off-by: Kent Yao <kentyao@microsoft.com>
@yaooqinn
Copy link
Copy Markdown
Member Author

Merged to master and 4.x, thank you @peter-toth

@yaooqinn yaooqinn deleted the users/kentyao/decimal-aggregate-evalmode-preserve branch May 20, 2026 18:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants