Skip to content

[Variant] extend shredded null handling for arrays#9599

Open
sdf-jkl wants to merge 13 commits intoapache:mainfrom
sdf-jkl:shredded-null-handling
Open

[Variant] extend shredded null handling for arrays#9599
sdf-jkl wants to merge 13 commits intoapache:mainfrom
sdf-jkl:shredded-null-handling

Conversation

@sdf-jkl
Copy link
Copy Markdown
Contributor

@sdf-jkl sdf-jkl commented Mar 21, 2026

Which issue does this PR close?

Rationale for this change

Check issue

What changes are included in this PR?

  • Added AppendNullMode enum supporting all semantics.
  • Replaced the bool logic to the new enum
  • Fix test outputs for List Array cases

Are these changes tested?

  • Added unit tests

Are there any user-facing changes?

@github-actions github-actions bot added the parquet-variant parquet-variant* crates label Mar 21, 2026
@sdf-jkl
Copy link
Copy Markdown
Contributor Author

sdf-jkl commented Mar 21, 2026

@scovich @klion26 @alamb 👀

Variant::List(list) => {
for element in list.iter() {
self.element_builder.append_value(element)?;
match element {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to change this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Append value would miss the shred_variant Null-handling semantics without it.

VariantToListArrowRowBuilder builds the Arrow typed array in an interesting way. It internally stores a StructArray with both value and typed_value fields, so both are kept.

Copy link
Copy Markdown
Contributor

@scovich scovich Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand.

We already know we're dealing with an array element, so what goes wrong if we just write Variant::Null? That would append the value and mark it as non-null, which seems to be exactly what append_to_null_row does for ANM::ArrayElement case?

        match self {
            Self::TopLevelVariant => nulls.append_null(),
=>          Self::ObjectField | Self::ArrayElement => nulls.append_non_null(),
        }
        match self {
            Self::TopLevelVariant | Self::ObjectField => value_builder.append_null(),
=>          Self::ArrayElement => value_builder.append_value(Variant::Null),
        }

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It almost seems like there's actually no need for the third enum variant because an array element builder should never experience an append_null call in the first place? If so, we only could rely on that structural invariant to handle variant arrays, and we only need a boolean to control whether append_null causes nulls.append_null (top-level) vs. nulls.append_non_null (object field), with an unconditional value_builder.append_null()?

But I wonder what the actual call stack/object hierarchy is in these various cases? Maybe I'm misunderstanding the layout and how the various builders relate to each other.

Copy link
Copy Markdown
Contributor Author

@sdf-jkl sdf-jkl Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a test showing an error with using just Variant::Null.

If we use variant_get(..., List<Int64>) and cast_options(safe: False) // strict casting, appended Variant::Null can't be safely cast to the List's inner type causing an Err.

By using append_null we avoid this error.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't file one yet. I didn't realize that this was a problem existing code until you pointed it out.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, we need to distinguish whether the element is Variant::Null or not because Variant::Null can't be cast to any type. Will it be better if we handle this logic in the following code (test whether the passed in is Variant::Null)

fn append_value(&mut self, $value: &Variant<'_, '_>) -> Result<bool> {
if let Some(v) = $value_transform {
self.builder.append_value(v);
Ok(true)
} else {
if !self.cast_options.safe {
// Unsafe casting: return error on conversion failure
return Err(ArrowError::CastError(format!(
"Failed to extract primitive of type {} from variant {:?} at path VariantPath([])",
$type_name,
$value
)));
}
// Safe casting: append null on conversion failure
self.builder.append_null();
Ok(false)

Copy link
Copy Markdown
Contributor

@scovich scovich Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I glanced at that same code yesterday and reached a similar conclusion. We either need to teach the $transform to be a tri-state (e.g. Result<Option<...>>) or the macro needs to handle Variant::Null separately, bypassing the $transform entirely. I don't know which one is better, would need to stare more at the code.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed here 7526ab0

typed_value_builder: PrimitiveVariantToArrowRowBuilder<'a>,
nulls: NullBufferBuilder,
top_level: bool,
null_mode: AppendNullMode,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a reasonable solution to move nulls into null_mode?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate, please?

Copy link
Copy Markdown
Contributor

@scovich scovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any suitable sample data from other variant-wielding engines that we could cross-check this change against? This stuff is super subtle and I'm nervous code review alone might not catch all potential issues. Even if this PR exactly replicates what the bug report says is the correct behavior, there's a non-zero chance the bug report is wrong, and/or that other engines have chosen to interpret the variant spec differently. If other engines do differ, that would be good to know ASAP, regardless of who's "right."

Copy link
Copy Markdown
Contributor

@scovich scovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I try to refresh my memories of how this is all supposed to work, I realize there aren't many unit tests for the top-level and object field cases -- at least not visible in the PR diff. Even tho there is a behavior change for top-level field handling.

Do we have test gaps here?

@sdf-jkl
Copy link
Copy Markdown
Contributor Author

sdf-jkl commented Mar 24, 2026

As I try to refresh my memories of how this is all supposed to work, I realize there aren't many unit tests for the top-level and object field cases -- at least not visible in the PR diff. Even tho there is a behavior change for top-level field handling.

Do we have test gaps here?

The only behavior change in this PR is TopLevelVariant handling in VariantToShreddedArrayVariantRowBuilder::append_null.

I added a focused unit test to cover that path (test_variant_get_list_like_unsafe_cast_preserves_null_elements).

@sdf-jkl
Copy link
Copy Markdown
Contributor Author

sdf-jkl commented Mar 24, 2026

Need #9612 to pass the tests

Copy link
Copy Markdown
Contributor

@scovich scovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either I'm confusing myself badly (very possible) or there's a serious type mismatch here. See comments.

Variant::List(list) => {
for element in list.iter() {
self.element_builder.append_value(element)?;
match element {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First: That example sounds wrong? Shouldn't a variant_get that hits Variant::Null always return NULL?
Variant::Null coerces to a NULL of any type, no? (unlike, say encountering Variant::Float when List<Int64> was requested). Maybe this is a special case symptom of #9606?

Second: But anyway, this specific code is unpacking list elements not the list itself?
What specific variant value and json path did you hit problems for?

Third: This code that unpacks list elements used to just append_value(value) (without ever knowing value = Variant::Null), which would produce a non-NULL value column entry containing Variant::Null. The new code specifically checks for Variant::Null and calls append_null(). But the element builder was initialized in ArrayElement mode, so append_null does the exact same thing -- produces a non-NULL value column entry containing Variant::Null. Based on that, I'm struggling to see how this change would alter behavior at all?

Variant::List(list) => {
for element in list.iter() {
self.element_builder.append_value(element)?;
match element {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait... this is going from shredded variant to some kind of ListArray, in response to variant_get(..., List<Int64>). That operation shouldn't produce shredded variant as output, so we shouldn't need to constrict it with a NullValue in the first place?? It should produce an actual list.

If we want to extract a shredded list as output, we'd need to pass variant_get(..., List<Struct>) where the list is annotated as variant extension type and the struct has value and typed_value fields of the appropriate types?

@sdf-jkl
Copy link
Copy Markdown
Contributor Author

sdf-jkl commented Mar 24, 2026

First: That example sounds wrong? Shouldn't a variant_get that hits Variant::Null always return NULL?
Variant::Null coerces to a NULL of any type, no? (unlike, say encountering Variant::Float when List was requested).

Not in strict casting. In strict casting any cast that returns None for non-Null target types errors out.

Maybe we could add Variant::Null to other types cast support #9563, since it should be just None. OR add a special case for input Nulls.

Second: But anyway, this specific code is unpacking list elements not the list itself?
What specific variant value and json path did you hit problems for?

That list would error on trying to cast Null list element to Int64.

Third: This code that unpacks list elements used to just append_value(value) (without ever knowing value = Variant::Null), which would produce a non-NULL value column entry containing Variant::Null. The new code specifically checks for Variant::Null and calls append_null(). But the element builder was initialized in ArrayElement mode, so append_null does the exact same thing -- produces a non-NULL value column entry containing Variant::Null. Based on that, I'm struggling to see how this change would alter behavior at all?

It does end up appending the same Variant::Null, but it skips casting in append_value. That casting to a non-Null type from Null is what causing the error.


Oh wait... this is going from shredded variant to some kind of ListArray, in response to variant_get(..., List). That operation shouldn't produce shredded variant as output, so we shouldn't need to constrict it with a NullValue in the first place?? It should produce an actual list.

If we want to extract a shredded list as output, we'd need to pass variant_get(..., List) where the list is annotated as variant extension type and the struct has value and typed_value fields of the appropriate types?

Earlier I mentioned that:

VariantToListArrowRowBuilder builds the Arrow typed array in an interesting way. It internally stores a StructArray with both value and typed_value fields, so both are kept.

I think that currently its sole purpose is to be used by variant_get, but it's not useful to convert a VariantArray to Arrow.

If we want to extract a shredded list as output, we'd need to pass variant_get(..., List) where the list is annotated as variant extension type and the struct has value and typed_value fields of the appropriate types?

I think you're right and it should be a separate behavior of that variant_to_arrow kernel.

@scovich
Copy link
Copy Markdown
Contributor

scovich commented Mar 24, 2026

Second: But anyway, this specific code is unpacking list elements not the list itself?
What specific variant value and json path did you hit problems for?

That list would error on trying to cast Null list element to Int64.

Variant::Null should coerce to any nullable type (as NULL). Just like JSON null coerces to any nullable type (as NULL) during parsing.

@sdf-jkl
Copy link
Copy Markdown
Contributor Author

sdf-jkl commented Mar 24, 2026

Right, but the way safe casting is implemented in variant_to_arrow returns an Err when a cast yields None. That’s what’s causing the issue.

fn append_value(&mut self, $value: &Variant<'_, '_>) -> Result<bool> {
if let Some(v) = $value_transform {
self.builder.append_value(v);
Ok(true)
} else {
if !self.cast_options.safe {
// Unsafe casting: return error on conversion failure
return Err(ArrowError::CastError(format!(
"Failed to extract primitive of type {} from variant {:?} at path VariantPath([])",
$type_name,
$value
)));
}
// Safe casting: append null on conversion failure
self.builder.append_null();
Ok(false)

@scovich
Copy link
Copy Markdown
Contributor

scovich commented Mar 24, 2026

Right, but the way safe casting is implemented in variant_to_arrow returns an Err when a cast yields None. That’s what’s causing the issue.

fn append_value(&mut self, $value: &Variant<'_, '_>) -> Result<bool> {
if let Some(v) = $value_transform {
self.builder.append_value(v);
Ok(true)
} else {
if !self.cast_options.safe {
// Unsafe casting: return error on conversion failure
return Err(ArrowError::CastError(format!(
"Failed to extract primitive of type {} from variant {:?} at path VariantPath([])",
$type_name,
$value
)));
}
// Safe casting: append null on conversion failure
self.builder.append_null();
Ok(false)

Ah, that's awkward. Doesn't distinguish None due to Variant::Null vs. None due to incompatible other type...

@sdf-jkl
Copy link
Copy Markdown
Contributor Author

sdf-jkl commented Mar 25, 2026

@scovich fixed it here - d63bb38 and removed that match arm as it's now redundant.

Copy link
Copy Markdown
Member

@klion26 klion26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@sdf-jkl
Copy link
Copy Markdown
Contributor Author

sdf-jkl commented Mar 27, 2026

Actually, I think we can make it even cleaner than treat_invalid_as_null.

@sdf-jkl
Copy link
Copy Markdown
Contributor Author

sdf-jkl commented Mar 27, 2026

@scovich check this caa45dd!

Copy link
Copy Markdown
Contributor

@scovich scovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the current code is correct. Some thoughts about readability.

Comment on lines +706 to +711
const NULL_VALUES: [NullValue; 3] = [
NullValue::TopLevelVariant,
NullValue::ObjectField,
NullValue::ArrayElement,
];

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arrow-avro uses strum_macros crate, whose EnumIter seems relevant? Are we willing to take a dep on that crate tho?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks unnecessary for extra 5loc in tests mod.

Copy link
Copy Markdown
Contributor

@scovich scovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

parquet-variant parquet-variant* crates

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Variant] extend shredded null handling for arrays

3 participants