[SPARK-56935][SQL] Simplify GetArrayItem codegen and consolidate ElementAtUtils into ArrayExpressionUtils #55973

Open
gengliangwang wants to merge 1 commit into apache:master from gengliangwang:SPARK-56935-getarrayitem

What changes were proposed in this pull request?

Two related changes:

  1. Fold ElementAtUtils.resolveArrayIndex into the existing ArrayExpressionUtils.java, and remove ElementAtUtils.java. The per-expression naming chosen in SPARK-56916 didn't match the codebase's category-scoped utility-class convention (ArrayExpressionUtils, BitmapExpressionUtils, ExpressionImplUtils, ...) and there's now a natural home for any future array-expression ANSI helper.

  2. Refactor GetArrayItem's ANSI codegen and eval paths to use a new ArrayExpressionUtils.checkArrayIndex(int length, int index, QueryContext context) helper, mirroring how ElementAt uses resolveArrayIndex. The helper, used only on the ANSI path, throws invalidArrayIndexError for a negative or out-of-bounds index; otherwise it returns the validated 0-based position, so the caller can chain directly into arr.get(idx, dataType). The non-ANSI branch keeps its inline form because it must return null (not throw) on an out-of-bounds index.

Net effect: the existing per-expression ElementAtUtils.java is removed; the existing ArrayExpressionUtils.java grows two *ArrayIndex helpers used by ElementAt and GetArrayItem codegen + eval.
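
For readers who have not opened the diff, the contract described above can be sketched roughly as follows. This is an illustrative stand-alone sketch, not the code in this PR: the class name ArrayIndexSketch, the String stand-in for QueryContext, the use of ArrayIndexOutOfBoundsException in place of invalidArrayIndexError, and the getOrNull helper are all assumptions made so the example compiles without Spark on the classpath.

```java
// Hypothetical sketch of the ANSI index check; names other than
// checkArrayIndex are invented for illustration.
final class ArrayIndexSketch {
  private ArrayIndexSketch() {}

  // Stand-in for the exception Spark builds via invalidArrayIndexError.
  // The real helper takes a QueryContext; a String is assumed here.
  private static RuntimeException invalidArrayIndex(int index, int length, String context) {
    return new ArrayIndexOutOfBoundsException(
        "Index " + index + " is out of bounds for an array of " + length
            + " elements (" + context + ")");
  }

  // ANSI path: throw on a negative or out-of-bounds index, otherwise
  // return the same 0-based index so the caller can chain a read on it.
  public static int checkArrayIndex(int length, int index, String context) {
    if (index < 0 || index >= length) {
      throw invalidArrayIndex(index, length, context);
    }
    return index;
  }

  // Non-ANSI path, kept inline in the PR: an invalid index yields null
  // instead of an exception, which is why it cannot share the helper.
  public static Integer getOrNull(int[] arr, int index) {
    if (index < 0 || index >= arr.length) {
      return null;
    }
    return Integer.valueOf(arr[index]);
  }
}
```

Under these assumptions, an ANSI-mode read collapses to a single expression such as arr[ArrayIndexSketch.checkArrayIndex(arr.length, idx, "ctx")], which is the one-line shape the description refers to.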

Why are the changes needed?

Part of SPARK-56908 (umbrella). arr[idx] and element_at(arr, idx) share the same ANSI out-of-bound error shape; collapsing both into one-line helper calls keeps the codegen size small and avoids maintaining two parallel inline forms.
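
To make the comparison concrete: element_at is documented as 1-based, rejects index 0, and counts negative indices from the end of the array, whereas arr[idx] is 0-based. A resolver with that behavior might look like the sketch below. The signature and body are assumptions (this description names resolveArrayIndex but does not show it), the class name is invented, and plain JDK exceptions stand in for Spark's error builders.

```java
// Hypothetical sketch of 1-based index resolution for element_at in
// ANSI mode; not the implementation from SPARK-56916.
final class ElementAtIndexSketch {
  private ElementAtIndexSketch() {}

  // Maps a 1-based (or negative, from-the-end) index to a 0-based
  // position, throwing when the index is zero or out of bounds.
  public static int resolveArrayIndex(int length, int index, String context) {
    if (index == 0) {
      // element_at treats index 0 as an error regardless of array length.
      throw new IllegalArgumentException("Index 0 is invalid (" + context + ")");
    }
    // Compare against -length rather than using Math.abs, which
    // overflows for Integer.MIN_VALUE.
    if (index > length || index < -length) {
      throw new ArrayIndexOutOfBoundsException(
          "Index " + index + " is out of bounds for an array of " + length
              + " elements (" + context + ")");
    }
    // Positive indices shift down by one; negative ones count from the end.
    return index > 0 ? index - 1 : length + index;
  }
}
```

Both helpers end the same way, either by throwing or by handing back a 0-based position, which is what lets the two expressions share one call shape in generated code.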

Does this PR introduce any user-facing change?

No. The compiled behavior is identical; only the emitted Java source text changes.

How was this patch tested?

build/sbt "catalyst/testOnly *ComplexTypeSuite *CollectionExpressionsSuite"
build/sbt "sql/testOnly *QueryExecutionAnsiErrorsSuite"

All pass (83/83 catalyst, 21/21 sql).

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor 1.x

@gengliangwang gengliangwang requested review from cloud-fan and viirya May 19, 2026 05:09