ORC-2165: [C++] Fix bounds check for LZO stop command trailer #2621

Closed. ffacs wants to merge 1 commit into apache:main from ffacs:ORC-2165-lzo-stop-trailer

ffacs (Contributor) commented May 12, 2026

What changes were proposed in this pull request?

This PR fixes the C++ LZO decompressor stop command trailer validation. It now checks that two trailer bytes are available before reading them, and validates the trailer bytes explicitly.

A regression test was added for truncated LZO stop command trailers.

Why are the changes needed?

Malformed LZO-compressed ORC input can end immediately after the LZO stop command, or with only one trailer byte remaining. The previous validation could read two bytes before confirming that two bytes were available, causing an out-of-bounds read on truncated input.

The new check makes truncated LZO input fail cleanly with ParseError.
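
As an illustration only, the check-before-read ordering described above might look like the sketch below. It is not the code from this patch: `checkStopTrailer` and the local `ParseError` type are hypothetical stand-ins, and the expected trailer value of two zero bytes is an assumption based on the common LZO1X end-of-stream marker `0x11 0x00 0x00`.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for ORC's ParseError; the real class lives in the ORC headers.
struct ParseError : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Hypothetical helper: `input` points just past the stop command byte,
// `inputLimit` points one past the last readable byte of the buffer.
inline void checkStopTrailer(const unsigned char* input, const unsigned char* inputLimit) {
  assert(input <= inputLimit);  // precondition: the cursor never passes the limit

  // Compare the remaining length rather than forming `input + 2`, which could
  // itself point outside the buffer when the input is truncated.
  const std::size_t remaining = static_cast<std::size_t>(inputLimit - input);
  if (remaining < 2) {
    throw ParseError("LZO stop command trailer is truncated");
  }

  // Only now is it safe to read both bytes.
  // Assumption: a valid trailer is 0x00 0x00.
  if (input[0] != 0x00 || input[1] != 0x00) {
    throw ParseError("LZO stop command trailer is malformed");
  }
}
```

With this ordering, a buffer that ends right after the stop command, or with a single trailer byte left, is rejected before any trailer byte is dereferenced.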

How was this patch tested?

Ran:

  cmake --build build --target orc-test -j 8
  build/c++/test/orc-test '--gtest_filter=TestDecompression.testLzo*'

The LZO decompression tests passed.

Also ran a minimal AddressSanitizer harness against truncated LZO stop command inputs and confirmed there was no ASan report.

Was this patch authored or co-authored using generative AI tooling?

Yes. Generated with OpenAI Codex.

dongjoon-hyun (Member) left a comment

+1, LGTM. Thank you, @ffacs and @wgtmac.

dongjoon-hyun pushed a commit that referenced this pull request May 14, 2026
Closes #2621 from ffacs/ORC-2165-lzo-stop-trailer.

Authored-by: ffacs <ffacs520@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 473ac95)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

dongjoon-hyun (Member) commented

Merged to main/2.3 for Apache ORC 2.3.1.

dongjoon-hyun added this to the 2.3.1 milestone May 14, 2026