[SPARK-56892][SQL] Bulk read optimization for Parquet DELTA_BINARY_PACKED decoding by iemejia · Pull Request #55919 · apache/spark

iemejia · 2026-05-16T19:42:32Z

What changes were proposed in this pull request?

Replace per-element lambda dispatch in readIntegers/readLongs with bulk paths that compute prefix sums in-place over the unpacked delta buffer and write via putInts/putLongs (backed by System.arraycopy on-heap).

Three changes in this PR:

1. Bulk read for INT32/INT64: readBulkIntegers and readBulkLongs replace the generic readValues() lambda-per-value path. A single loadMiniBlockBulk method handles block/mini-block loading and prefix-sum computation, then delegates the type-specific write to a BulkWriter callback (called once per mini-block, not per value).
2. Zero-allocation unsigned long encoding: Replace new BigInteger(Long.toUnsignedString(v)).toByteArray() (3 allocations per value: String + BigInteger + byte[]) with ByteBuffer.putLong into a reusable scratch buffer. The shared utility encodeUnsignedLongBigEndian is extracted into VectorizedReaderBase and applied to all call sites (VectorizedDeltaBinaryPackedReader, UnsignedLongUpdater, ParquetDictionary).
3. Benchmark fix: Add unsignedLongVec.reset() before readUnsignedLongs to prevent unbounded arrayData() growth across benchmark iterations, which caused an OOM.
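
For readers unfamiliar with the technique, the bulk path in the first item can be sketched roughly as below. This is a simplified illustration, not the code in this PR: the class DeltaBulkSketch, the BulkWriter signature, and decodeAll are hypothetical, bit-unpacking is assumed to have already happened, and a single minDelta is shared by all mini-blocks (the real format stores one per block). System.arraycopy stands in for putLongs.

```java
final class DeltaBulkSketch {
  /** Hypothetical callback, invoked once per mini-block rather than once per value. */
  interface BulkWriter {
    void write(int destOffset, long[] values, int count);
  }

  /**
   * Converts unpacked deltas into absolute values in place (a prefix sum),
   * then hands the whole run to the writer in one call.
   * Returns the last decoded value so the next mini-block can continue the sum.
   */
  static long decodeMiniBlock(long[] unpacked, int count, long minDelta,
                              long lastValue, int destOffset, BulkWriter writer) {
    long running = lastValue;
    for (int i = 0; i < count; i++) {
      running += minDelta + unpacked[i]; // stored delta is relative to minDelta
      unpacked[i] = running;             // overwrite the delta with the value
    }
    writer.write(destOffset, unpacked, count);
    return running;
  }

  /** Decodes firstValue plus all mini-blocks; mutates the miniBlocks arrays. */
  static long[] decodeAll(long firstValue, long minDelta, long[][] miniBlocks) {
    int total = 1;
    for (long[] mb : miniBlocks) {
      total += mb.length;
    }
    long[] out = new long[total];
    out[0] = firstValue;
    // One bulk copy per mini-block, standing in for a vector's putLongs.
    BulkWriter writer = (off, vals, n) -> System.arraycopy(vals, 0, out, off, n);
    long last = firstValue;
    int offset = 1;
    for (long[] mb : miniBlocks) {
      last = decodeMiniBlock(mb, mb.length, minDelta, last, offset, writer);
      offset += mb.length;
    }
    return out;
  }

  public static void main(String[] args) {
    long[] out = decodeAll(10L, 2L, new long[][] {{0, 1, 0}, {3, 0}});
    System.out.println(java.util.Arrays.toString(out));
  }
}
```

The point of the shape is that the inner loop contains only arithmetic on a primitive array, and the callback cost is paid once per mini-block instead of once per value.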

Why are the changes needed?

The DELTA_BINARY_PACKED decoder was 2-5x slower than PLAIN encoding for INT32/INT64 reads because of per-element lambda dispatch and the lack of bulk vector writes. The readUnsignedLongs path allocated 3 objects per value (12,288 allocations per 4096-row batch) because it converted each value with new BigInteger(Long.toUnsignedString(v)).toByteArray().
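
The difference between the two unsigned-long encodings can be illustrated with the sketch below. It is not the encodeUnsignedLongBigEndian implementation from this PR: the class and method names are hypothetical, and it assumes a fixed 9-byte layout (one leading zero byte followed by the 8 big-endian bytes of the long), which decodes to the same non-negative value but may differ in length from the minimal array that BigInteger.toByteArray() returns.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;

final class UnsignedLongEncodingSketch {
  /** Old approach: allocates a String, a BigInteger, and a byte[] per value. */
  static byte[] encodeViaBigInteger(long v) {
    return new BigInteger(Long.toUnsignedString(v)).toByteArray();
  }

  /**
   * Allocation-free approach: writes into a caller-owned 9-byte scratch buffer.
   * The leading zero byte keeps the two's-complement value non-negative even
   * when the high bit of v is set. ByteBuffer is big-endian by default.
   */
  static void encodeInto(long v, ByteBuffer scratch) {
    scratch.put(0, (byte) 0);
    scratch.putLong(1, v);
  }

  public static void main(String[] args) {
    ByteBuffer scratch = ByteBuffer.allocate(9); // reused for every value
    long[] samples = {0L, 1L, Long.MAX_VALUE, Long.MIN_VALUE, -1L};
    for (long v : samples) {
      encodeInto(v, scratch);
      BigInteger fromScratch = new BigInteger(scratch.array());
      BigInteger fromOldPath = new BigInteger(encodeViaBigInteger(v));
      if (!fromScratch.equals(fromOldPath)) {
        throw new AssertionError("mismatch for " + Long.toUnsignedString(v));
      }
    }
    System.out.println("both encodings decode to the same values");
  }
}
```

Because the scratch buffer is reused, the caller has to copy or consume its contents before encoding the next value.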

Benchmark results on the same machine (AMD EPYC 9V45, OpenJDK 25.0.3+9-LTS):

Benchmark	Baseline (M/s)	After (M/s)	Speedup
INT32 readIntegers, monotonic	644	873	1.4x
INT32 readIntegers, small-delta	466	553	1.2x
INT32 readIntegers, wide random	357	417	1.2x
INT64 readLongs, constant	316	879	2.8x
INT64 readLongs, monotonic	252	951	3.8x
INT64 readLongs, small-delta	216	587	2.7x
INT64 readLongs, wide random	163	313	1.9x
readUnsignedLongs	9.2	66	7.2x

Does this PR introduce any user-facing change?

No. This is a performance improvement to internal Parquet decoding. No API or behavior changes.

How was this patch tested?

Existing unit tests: ParquetDeltaEncodingInteger (13 tests), ParquetDeltaEncodingLong (13 tests), ParquetDeltaByteArrayEncodingSuite, ParquetDeltaLengthByteArrayEncodingSuite, ParquetVectorizedSuite (25 tests), ParquetIOSuite (unsigned Parquet logical types test) -- all pass.
Benchmark: VectorizedDeltaReaderBenchmark run before and after on the same machine with changes stashed/unstashed for fair comparison.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: OpenCode (Claude claude-opus-4.6)

…CKED decoding

Replace per-element lambda dispatch in readIntegers/readLongs with bulk paths that compute prefix sums in-place over the unpacked delta buffer and write via putInts/putLongs (backed by System.arraycopy on-heap).

Also optimize readUnsignedLongs by replacing BigInteger(Long.toUnsignedString(v)).toByteArray() with zero-allocation manual byte encoding using ByteBuffer.putLong. Extract the shared utility encodeUnsignedLongBigEndian into VectorizedReaderBase and apply it to all call sites (UnsignedLongUpdater, ParquetDictionary).

Fix benchmark OOM: add unsignedLongVec.reset() before readUnsignedLongs to prevent unbounded arrayData growth across iterations.

Same-machine results (AMD EPYC 9V45):
- INT32 readIntegers: 1.2-1.4x faster (monotonic/delta/wide patterns)
- INT64 readLongs: 1.9-3.8x faster (all patterns)
- readUnsignedLongs: 7.2x faster

iemejia force-pushed the SPARK-56892-delta-binary-packed-bulk-read branch from 9d8eb10 to 04a4f8e Compare May 16, 2026 22:46

iemejia wants to merge 1 commit into apache:master from iemejia:SPARK-56892-delta-binary-packed-bulk-read