GH-50077: [C++][IPC] Avoid int64 overflow in ReadSparseCSXIndex #50038

Open
jmestwa-coder wants to merge 1 commit into apache:main from jmestwa-coder:ipc-sparse-csx-index-overflow
@jmestwa-coder jmestwa-coder commented May 25, 2026

Rationale for this change

ReadSparseCSXIndex validates the SparseTensor indices/indptr buffer sizes against the claimed shape using int64 products (non_zero_length * byte_width and (shape[axis] + 1) * byte_width). Both inputs come unchecked from the flatbuffer via GetSparseTensorMetadata, so a value near INT64_MAX overflows the signed product (UBSan-confirmed), wraps to a small value, and the size guard passes. The index Tensor is then built over a buffer smaller than its shape, enabling an out-of-bounds read.
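To illustrate the wrap-around described above, here is a minimal sketch, not the Arrow source. `WrappedProduct` and `NaiveGuardAccepts` are hypothetical names, and the wrap is modelled with unsigned arithmetic because the signed product it imitates is undefined behaviour in C++.

```cpp
#include <cstdint>

// Hypothetical helper: models what a wrapped int64 product looks like.
// Unsigned multiplication is used on purpose, since it wraps modulo 2^64
// with defined behaviour, whereas the signed product would be UB.
inline int64_t WrappedProduct(int64_t a, int64_t b) {
  return static_cast<int64_t>(static_cast<uint64_t>(a) *
                              static_cast<uint64_t>(b));
}

// Hypothetical stand-in for an unchecked size guard: returns true when a
// buffer of `buffer_size` bytes would be accepted for `non_zero_length`
// index entries of `byte_width` bytes each.
inline bool NaiveGuardAccepts(int64_t non_zero_length, int64_t byte_width,
                              int64_t buffer_size) {
  return buffer_size >= WrappedProduct(non_zero_length, byte_width);
}
```

With `non_zero_length = 2^61 + 1` and `byte_width = 8`, the true product is `2^64 + 8`, which wraps to `8`, so a 16-byte buffer passes this guard even though the claimed shape needs far more.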

What changes are included in this PR?

Compute the indices/indptr byte counts (and the shape[axis] + 1 term) with MultiplyWithOverflow/AddWithOverflow, the same checked helpers already used for this size math in tensor.cc, and return Status::Invalid when the computation overflows.
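The checked computation can be sketched roughly as follows. This is an illustration under stated assumptions rather than the PR's code: `AddOverflows`, `MulOverflows`, and `CheckedIndptrBytes` are hypothetical stand-ins built on the GCC/Clang overflow builtins, and they return a plain `bool` instead of an Arrow `Status`.

```cpp
#include <cstdint>

// Hypothetical stand-ins for checked helpers: each returns true on overflow.
// They rely on GCC/Clang builtins, which is an assumption of this sketch.
inline bool AddOverflows(int64_t a, int64_t b, int64_t* out) {
  return __builtin_add_overflow(a, b, out);
}

inline bool MulOverflows(int64_t a, int64_t b, int64_t* out) {
  return __builtin_mul_overflow(a, b, out);
}

// Computes (dim_size + 1) * byte_width, the minimum indptr size in bytes.
// Returns false if either step overflows; a caller would then reject the
// input (the PR returns Status::Invalid in that case).
inline bool CheckedIndptrBytes(int64_t dim_size, int64_t byte_width,
                               int64_t* out_bytes) {
  int64_t length = 0;
  if (AddOverflows(dim_size, 1, &length)) return false;
  return !MulOverflows(length, byte_width, out_bytes);
}
```

The point of the design is that an overflowing size is reported as an error before it is ever compared against the buffer size, so a wrapped value can never satisfy the guard.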

Are these changes tested?

Covered by the existing sparse tensor IPC round-trip tests; the change only adds an overflow guard on the existing validation path.

Are there any user-facing changes?

No.

Member

kou commented May 25, 2026

jmestwa-coder changed the title from "MINOR: [C++][IPC] Avoid int64 overflow in ReadSparseCSXIndex" to "GH-50077: [C++][IPC] Avoid int64 overflow in ReadSparseCSXIndex" on Jun 2, 2026
@jmestwa-coder
Author

Done. I opened #50077, retitled the PR to reference it, and restored the PR template in the description.

github-actions Bot commented Jun 2, 2026

⚠️ GitHub issue #50077 has been automatically assigned in GitHub to PR creator.

Contributor

Copilot AI left a comment

Pull request overview

This PR hardens the C++ IPC sparse tensor reader by preventing signed int64_t overflow when validating CSX (CSR/CSC) index buffer sizes in ReadSparseCSXIndex, closing an out-of-bounds read vector when parsing crafted IPC messages.

Changes:

  • Use internal::MultiplyWithOverflow to compute indices/indptr minimum byte sizes safely.
  • Use internal::AddWithOverflow to compute (shape[axis] + 1) safely for indptr length derivation.
  • Return Status::Invalid when overflow is detected during size validation.

@@ -2336,26 +2337,40 @@ Result<std::shared_ptr<SparseIndex>> ReadSparseCSXIndex(
/*allow_short_read=*/false));

std::vector<int64_t> indices_shape({non_zero_length});
3 participants