GH-22342: [C++] [Documentation] add discussion of Union.typeIds to Layout.rst #50097

Draft
thisisnic wants to merge 1 commit into apache:main from thisisnic:GH-22342-docs-uniontype

Conversation

Member

@thisisnic thisisnic commented Jun 4, 2026

Rationale for this change

An issue was

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

This PR includes breaking changes to public APIs. (If there are any breaking changes to public APIs, please explain which changes are breaking. If not, you can remove this.)

This PR contains a "Critical Fix". (If the changes fix either (a) a security vulnerability, (b) a bug that caused incorrect or invalid data to be produced, or (c) a bug that causes a crash (even when the API contract is upheld), please provide explanation. If not, you can remove this.)

Copilot AI review requested due to automatic review settings June 4, 2026 09:48

github-actions Bot commented Jun 4, 2026

⚠️ GitHub issue #22342 has been automatically assigned in GitHub to PR creator.

Contributor

Copilot AI left a comment

Pull request overview

This PR updates the Arrow format documentation to clarify how union type ids relate to union children and the on-wire “types buffer”, improving reader understanding of cases where type ids don’t match child array indices.

Changes:

  • Adds an explicit explanation that union child types have assigned type ids which may differ from child array indices.
  • Updates the dense union “Types buffer” description to reference the above mapping and avoid implying ids must be contiguous/zero-based.

Comment on lines +867 to +871
Each child type in a union has a type id (an 8-bit signed integer)
that identifies it. These type ids are not necessarily the same as the
index of the corresponding child array. For example, a union of two types
might assign type ids 5 and 7 rather than 0 and 1. The mapping from type
ids to child arrays is part of the union type definition.
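
As a rough illustration of the paragraph quoted above, the following Python sketch builds the lookup from type ids to child array indices for the 5 and 7 example. It is not Arrow's implementation: the helper name `build_type_id_lookup` is hypothetical, and the range check only enforces the 8-bit signed range mentioned in the text (a real implementation may restrict the ids further).

```python
def build_type_id_lookup(type_ids):
    """Map each type id to the index of its child array.

    `type_ids` lists the type id assigned to child 0, child 1, and so on.
    Hypothetical helper for illustration only.
    """
    lookup = {}
    for child_index, type_id in enumerate(type_ids):
        # Type ids are stored as 8-bit signed integers.
        if not -128 <= type_id <= 127:
            raise ValueError(f"type id {type_id} does not fit in int8")
        if type_id in lookup:
            raise ValueError(f"duplicate type id {type_id}")
        lookup[type_id] = child_index
    return lookup


# The example from the text: two children with type ids 5 and 7.
lookup = build_type_id_lookup([5, 7])
print(lookup[5], lookup[7])
```

Here type id 5 resolves to child index 0 and type id 7 to child index 1, so a reader must go through the lookup rather than treat the id as an index.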

Comment on lines +887 to +890
* Types buffer: A buffer of 8-bit signed integers, indicating the type
  id of each slot. Note that these type ids are not necessarily the
  same as the child array index (see above). A union with more than 128
  possible types can be modeled as a union of unions.
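
To make the role of the types buffer concrete, here is a minimal Python sketch of reading a dense union whose type ids are not child indices. The names `TYPE_IDS`, `CHILDREN`, and `decode_dense_union` are hypothetical, the data is invented for illustration, and plain Python lists stand in for the child arrays. It assumes the dense union layout also carries an offsets buffer giving each slot's position within its child array.

```python
from array import array

# Hypothetical union definition: child 0 has type id 5, child 1 has type id 7.
TYPE_IDS = [5, 7]
# Child arrays in definition order (plain lists stand in for Arrow arrays).
CHILDREN = [[1.5, 2.5], ["a", "b", "c"]]


def decode_dense_union(types, offsets):
    """Resolve every slot to its value via the type id -> child index mapping."""
    id_to_child = {type_id: i for i, type_id in enumerate(TYPE_IDS)}
    values = []
    for type_id, offset in zip(types, offsets):
        child = CHILDREN[id_to_child[type_id]]  # not CHILDREN[type_id]
        values.append(child[offset])
    return values


# Types buffer: one 8-bit signed type id per slot.
types = array("b", [7, 5, 7, 7, 5])
# Offsets buffer: position of each slot's value inside its child array.
offsets = array("i", [0, 0, 1, 2, 1])

decoded = decode_dense_union(types, offsets)
print(decoded)  # ['a', 1.5, 'b', 'c', 2.5]
```

Indexing `CHILDREN` directly with the type id would fail here, since there is no child 5 or 7, which is the confusion the documentation change is meant to prevent.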
union can have a value chosen from these types. The types are named
like a struct's fields, and the names are part of the type metadata.
