
[core] Fix OOM when writing/compacting table with large records #7621

Open

yugan95 wants to merge 1 commit into apache:master from yugan95:record-0410

Conversation

Contributor

@yugan95 yugan95 commented Apr 10, 2026

Purpose

Linked issue: close #7620
Fix OOM when writing table with large records (100MB+) and many buckets (e.g. 256) due to unbounded buffer growth in sort, merge and compaction paths. Each bucket's writer independently holds its own sort buffer, merge channels, and compaction readers. When a large record inflates an internal reuse buffer, that bloated buffer is retained per-bucket, causing memory usage to quickly exceed available heap.

Heap dump analysis identified four independent root causes:

1. Sort path — RowHelper internal buffer never shrinks

RowHelper.reuseWriter grows its internal MemorySegment list for large records, but BinaryRowWriter.reset() only resets the cursor without releasing oversized segments. Additionally, InternalRowSerializer.serialize() can exit via EOFException (a normal signal when the sort buffer is full), skipping any cleanup of the bloated buffer.
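The difference between a plain `reset()` and a shrinking reset can be sketched as follows. This is a minimal illustration of the idea behind the fix, not Paimon's actual `RowHelper`: the segment size, the 4 MB threshold location, and all names here are assumptions. In the real change, `resetIfTooLarge()` is additionally invoked from a `finally` block so the `EOFException` exit path cannot skip it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a segment-backed reuse buffer whose reset() keeps
// oversized segments alive, versus resetIfTooLarge() which releases them.
public class ReusableBuffer {
    static final int SEGMENT_SIZE = 32 * 1024;              // 32 KB per segment (assumed)
    static final long SHRINK_THRESHOLD = 4L * 1024 * 1024;  // 4 MB threshold from the PR

    private final List<byte[]> segments = new ArrayList<>();
    private int cursor;

    long capacity() {
        return (long) segments.size() * SEGMENT_SIZE;
    }

    // Simulate serializing a record of numBytes: grow the segment list as needed.
    void write(int numBytes) {
        while (capacity() < (long) cursor + numBytes) {
            segments.add(new byte[SEGMENT_SIZE]);
        }
        cursor += numBytes;
    }

    // Original behavior: only the cursor is reset; a 100 MB record leaves
    // 100 MB of segments pinned in memory for the lifetime of the writer.
    void reset() {
        cursor = 0;
    }

    // Fixed behavior: reset the cursor AND drop segments beyond the threshold,
    // so one large record cannot permanently inflate this bucket's buffer.
    void resetIfTooLarge() {
        cursor = 0;
        while (capacity() > SHRINK_THRESHOLD) {
            segments.remove(segments.size() - 1);
        }
    }

    public static void main(String[] args) {
        ReusableBuffer buf = new ReusableBuffer();
        buf.write(100 * 1024 * 1024);   // one 100 MB record inflates the buffer
        buf.resetIfTooLarge();          // oversized segments are released
        System.out.println(buf.capacity());
    }
}
```

With 256 buckets, the difference between retaining and releasing a 100 MB buffer per writer is the difference between ~25 GB of pinned heap and a bounded ~1 GB.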

2. Merge path — BinaryRowSerializer.deserialize(reuse) only grows, never shrinks

Each merge channel holds a BinaryRow reuse instance. When a large record is deserialized, the backing MemorySegment grows to fit it but is never shrunk for subsequent small records. With max-num-file-handles (default 128) channels each retaining a 100MB+ buffer, memory usage explodes.
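The shrink-on-deserialize idea can be sketched like this. It is an illustrative reduction of the fix, not `BinaryRowSerializer`'s real code: the helper name, the single-array model, and the minimum allocation are assumptions; only the 4 MB threshold comes from the PR.

```java
// Hypothetical sketch: choose the buffer for the next deserialized record.
// Keep the reuse buffer only while it is both big enough and not bloated.
public class ShrinkOnReuse {
    static final int SHRINK_THRESHOLD = 4 * 1024 * 1024; // 4 MB, per the PR
    static final int MIN_ALLOC = 1024;                   // assumed minimum allocation

    static byte[] bufferFor(byte[] reuse, int recordSize) {
        if (reuse.length >= recordSize && reuse.length <= SHRINK_THRESHOLD) {
            return reuse; // small enough: keep reusing across records
        }
        // Either too small for this record, or bloated by an earlier large
        // record: reallocate at the actual size so the old buffer can be GC'd.
        return new byte[Math.max(recordSize, MIN_ALLOC)];
    }

    public static void main(String[] args) {
        byte[] bloated = new byte[100 * 1024 * 1024]; // left over from a 100 MB record
        byte[] next = bufferFor(bloated, 200);        // next record is small
        System.out.println(next.length);
    }
}
```

Without the shrink, each of the 128 default merge channels can independently hold such a bloated buffer, which is why memory explodes during external merge.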

3. Compaction read path — HeapBytesVector.reserveBytes() integer overflow

reserveBytes() computes the new capacity as newCapacity * 2 using plain int multiplication. Once newCapacity exceeds ~1.07 billion bytes, the product wraps past Integer.MAX_VALUE to a negative value, causing NegativeArraySizeException or silent data corruption.
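A minimal sketch of overflow-safe capacity growth, assuming names chosen here for illustration (the real fix lives inside `HeapBytesVector.reserveBytes()`). Note that doubling wraps negative regardless of whether it is written as `* 2` or `<< 1`; the cap at `MAX_ARRAY_SIZE` is what actually makes the growth safe:

```java
// Hypothetical sketch of the overflow-safe growth policy described in the PR.
public class SafeGrowth {
    // Largest array size safely allocatable on most JVMs (object header slack).
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    static int grownCapacity(int current, int required) {
        if (required > MAX_ARRAY_SIZE) {
            // Clear error instead of NegativeArraySizeException downstream.
            throw new OutOfMemoryError(
                    "Required capacity exceeds max array size: " + required);
        }
        int doubled = current << 1;
        // A wrapped (negative) or over-cap result is clamped to the maximum.
        if (doubled < 0 || doubled > MAX_ARRAY_SIZE) {
            doubled = MAX_ARRAY_SIZE;
        }
        return Math.max(doubled, required);
    }

    public static void main(String[] args) {
        // 1.2e9 * 2 would wrap negative; the clamp keeps growth valid.
        System.out.println(grownCapacity(1_200_000_000, 1_300_000_000));
    }
}
```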

4. Parquet write — statistics and page-size-check config not passed through

RowDataParquetBuilder does not pass through parquet.statistics.truncate.length, parquet.columnindex.truncate.length, parquet.page.size.row.check.min, and parquet.page.size.row.check.max. Without these, users cannot tune Parquet behavior for large-record scenarios, leading to multi-GB pages and bloated footers.
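The pass-through pattern can be sketched as a config lookup with a fallback. The four key names come from the PR; the helper name, the map standing in for Paimon's options object, and the default values below are illustrative assumptions, not verified Parquet defaults:

```java
import java.util.Map;

// Hypothetical sketch: read a user-supplied Parquet tuning option, falling
// back to a default when the key is absent from the table configuration.
public class ParquetTuning {
    static int orDefault(String value, int dflt) {
        return value == null ? dflt : Integer.parseInt(value);
    }

    public static void main(String[] args) {
        // Stand-in for the table's options; only one key is set by the user.
        Map<String, String> conf = Map.of("parquet.page.size.row.check.min", "10");
        int minRowCheck = orDefault(conf.get("parquet.page.size.row.check.min"), 100);
        int maxRowCheck = orDefault(conf.get("parquet.page.size.row.check.max"), 10_000);
        int statsTruncate =
                orDefault(conf.get("parquet.statistics.truncate.length"), Integer.MAX_VALUE);
        System.out.println(minRowCheck + " " + maxRowCheck + " " + statsTruncate);
    }
}
```

Lowering the page-size-check bounds matters for large records because Parquet only checks page size every N rows; with 100 MB records, the default check interval can produce multi-GB pages before the first check fires.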

Changes

  1. RowHelper: add resetIfTooLarge() — release internal buffer when segments exceed 4MB
  2. InternalRowSerializer: call resetIfTooLarge() in finally block of serialize() and serializeToPages() to handle EOFException exit path
  3. BinaryRowSerializer: add shrink logic in deserialize(reuse) — reallocate when existing buffer > 4MB threshold
  4. HeapBytesVector: use bit-shift (<< 1) instead of * 2, cap at MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8, throw clear error on overflow
  5. RowDataParquetBuilder: pass through statistics.truncate.length, columnindex.truncate.length, min-row-count-for-page-size-check, max-row-count-for-page-size-check from config

Tests

  • RowHelperTest — validates resetIfTooLarge() releases oversized buffers (> 4MB) and preserves small ones
  • BinaryRowSerializerShrinkTest — validates deserialize(reuse) shrinks oversized buffers and preserves small ones
  • HeapBytesVectorReserveBytesTest — validates overflow-safe reserveBytes() growth and data correctness

API and Format

N/A — no public API or format changes.

Documentation

N/A



Development

Successfully merging this pull request may close these issues.

[Bug] OOM when writing table with large records (100MB+) due to unbounded buffer growth in sort, merge and compaction paths
