Describe the bug
Miri detects a Stacked Borrows violation in datafusion/datasource-parquet/src/opener.rs at line 1258:
self.decoder.push_ranges(ranges, data)?;
A Unique retag (mutable borrow) of the decoder is created, then invalidated by a SharedReadOnly retag at an .await point in the unfold-based stream. When the future resumes, the original Unique tag no longer exists in the borrow stack.
This was found while testing DataFusion 54.0 with Apache DataFusion Comet. Comet runs Miri in CI and caught this: CI run. DataFusion does not currently run Miri in its own CI.
To Reproduce
Run Miri against any test that exercises the PushDecoderStreamState parquet stream path. In our case it was triggered by a test that calls scan.execute(...).collect().await on a parquet scan.
Expected behavior
No undefined behavior under Miri's Stacked Borrows model.
Miri output
error: Undefined Behavior: trying to retag from <28128923> for SharedReadWrite permission at alloc8513101[0x8], but that tag does not exist in the borrow stack for this location
    --> datafusion/datasource-parquet/src/opener.rs:1258:25
     |
1258 |                         self.decoder.push_ranges(ranges, data)?;
     |                         ^^^^^^^^^^^^ this error occurs as part of two-phase retag at alloc8513101[0x8..0x20]
     |
     = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
     = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <28128923> was created by a Unique retag at offsets [0x8..0x20]
help: <28128923> was later invalidated at offsets [0x0..0x138] by a SharedReadOnly retag
Additional context
The full backtrace points to PushDecoderStreamState::transition -> RowGroupsPrunedParquetOpen::build_stream closure -> futures::stream::Unfold poll. The aliasing violation occurs because the decoder is mutably re-borrowed across a yield point in the unfold stream.
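For discussion, here is a minimal std-only sketch of one shape that keeps each mutable reborrow of the decoder inside a single step: the state is moved into the step by value and handed back, which is also the closure shape futures::stream::unfold expects. The Decoder, StreamState, and drive names below are hypothetical stand-ins, the push_ranges signature is simplified, and the loop is a synchronous analogue of the stream. This is not DataFusion's code, and whether this shape resolves the Miri report in the real opener has not been tested.

```rust
// Hypothetical stand-in for the parquet push decoder; not the real API.
#[derive(Debug, Default)]
struct Decoder {
    buffered: Vec<u8>,
}

impl Decoder {
    // Simplified analogue of the `push_ranges` call in the report: needs `&mut self`.
    fn push_ranges(&mut self, data: &[u8]) -> Result<(), String> {
        if data.is_empty() {
            return Err("empty range".to_string());
        }
        self.buffered.extend_from_slice(data);
        Ok(())
    }
}

// Hypothetical stream state that owns both the decoder and the pending input.
struct StreamState {
    decoder: Decoder,
    pending: Vec<Vec<u8>>,
}

impl StreamState {
    // Consumes `self` and returns it alongside the produced item, so the
    // `&mut self.decoder` reborrow starts and ends inside this call.
    fn transition(mut self) -> Result<Option<(usize, StreamState)>, String> {
        // `pop` takes chunks from the end of `pending`.
        let Some(chunk) = self.pending.pop() else {
            return Ok(None);
        };
        self.decoder.push_ranges(&chunk)?;
        let total = self.decoder.buffered.len();
        Ok(Some((total, self)))
    }
}

// Synchronous analogue of an unfold loop: the state is moved into each step
// and moved back out, never borrowed across step boundaries.
fn drive(mut state: StreamState) -> Result<Vec<usize>, String> {
    let mut out = Vec::new();
    while let Some((item, next)) = state.transition()? {
        out.push(item);
        state = next;
    }
    Ok(out)
}
```

Calling drive with a StreamState whose pending list holds a few byte chunks yields the running buffered length after each step, and an empty chunk surfaces as an Err from the step that hit it.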