[improvement](query) Align olap scan schema with storage keys#64413
[improvement](query) Align olap scan schema with storage keys#64413BiteTheDDDDt wants to merge 17 commits into
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
### What problem does this PR solve?
Issue Number: N/A
Related PR: N/A
Problem Summary: Storage expression pushdown previously rebound slot refs to the expanded reader schema by mutating expression column ids inside segment iterators. That violates the assumption that pushed-down expressions are immutable and can produce surprising behavior when the same expression objects are reused. This change keeps the full scan schema unchanged, adds a project schema aligned with the final projected columns, and evaluates pushed-down expressions against a temporary project block. Slot ordinal to tablet column id mapping now uses the project schema, so no expression mutation is required.
### Release note
None
### Check List (For Author)
- Test: Manual test
- ./build.sh --be
- Behavior changed: No
- Does this need documentation: No
### What problem does this PR solve? Issue Number: None Related PR: apache#64010 Problem Summary: Address review comments for avoiding storage expression column id rebinding. The project-schema column-id to block-position map was initialized in multiple places, and project-block construction still ran on ordinary scan paths where the project schema matched the reader schema. Virtual column materialization also rebuilt the project block for each expression. This change centralizes schema block mapping initialization, reuses the original block when the project schema is identical to the reader schema, reuses the projected block during virtual column materialization, and keeps expression exceptions converted to Status at the virtual-column evaluation boundary. ### Release note None ### Check List (For Author) - Test: Unit Test - ./build.sh --be - build-support/run-clang-tidy.sh attempted; it reported pre-existing/analysis diagnostics in segment_iterator.cpp, segment_iterator.h, and be/src/util/jni-util.h unrelated to the changed logic. - Behavior changed: No - Does this need documentation: No
### What problem does this PR solve? Issue Number: N/A Related PR: apache#64010 Problem Summary: Review feedback pointed out that the segment iterator should work from an explicit project schema, and that the project block helper should fill caller-owned storage instead of returning a selected block pointer. Follow-up coverage showed that direct RowsetReader and Segment::new_iterator callers do not pass the TabletReader-expanded project schema fields. The reader now keeps the SegmentIterator project_columns invariant, while BetaRowsetReader maps direct callers to return_columns and Segment::new_iterator maps direct callers to schema column IDs. The project block helper remains caller-owned and the ANN source column ordinal translation is documented. ### Release note None ### Check List (For Author) - Test: Unit Test - build-support/clang-format.sh - git diff --check - ./run-be-ut.sh --run --filter=TestDeltaWriterClusterKey.vec_sequence_col - Behavior changed: No - Does this need documentation: No
Issue Number: N/A
Related PR: N/A
Problem Summary: Avoid maintaining separate execution and storage schemas in the BE segment iterator for pushed-down storage expressions. The planner expands AGG-key and non-MOW unique-key olap scan tuples with missing storage key columns and marks the extra key slots in thrift. The scan node projects back to the original output tuple, while BE maps the marked slots to tablet column ids and, in direct reader mode, fills those storage-only key columns without reading them. Merge and aggregation reader paths still read real key columns.
None
- Test: Manual test
- ./build.sh --fe
- ./build.sh --be
- git diff --check
- build-support/clang-format.sh
- build-support/run-clang-tidy.sh --build-dir be/build_Release --base 23e21f4 (attempted; failed on existing file-level clang-tidy diagnostics before producing a clean changed-line gate)
- Behavior changed: No
- Does this need documentation: No
b0ca37e to
bbdb261
Compare
### What problem does this PR solve? Issue Number: None Related PR: apache#64413 Problem Summary: SegmentIterator no longer needs to maintain a second Schema object for the expression slot layout after the FE-side scan schema is expanded to match storage keys. This change removes the project schema state and replaces it with a lightweight expression ordinal to storage column id mapping. The mapping still preserves the old FE compatibility path when TabletReader expands return columns before segment iteration, while the common aligned case is normalized to the storage schema pointer so per-batch checks stay O(1). ### Release note None ### Check List (For Author) - Test: Manual test - git diff --check - ./build.sh --be - build-support/run-clang-tidy.sh (fails on existing diagnostics in touched Doris headers/functions, including jni-util.h static_assert diagnostics and pre-existing complexity/modernize warnings) - Behavior changed: No - Does this need documentation: No
### What problem does this PR solve? Issue Number: close #xxx Related PR: apache#64413 Problem Summary: FE now aligns scan slots to the storage schema and marks filled key columns explicitly. SegmentIterator no longer needs to keep a separate expression-column layout for pushed-down expressions, so this change removes the expression layout fallback from StorageReadOptions, RowsetReaderContext, and SegmentIterator. Pushed-down expression, ANN, index, score, and virtual-column paths now use the scan schema column ids directly. ### Release note None ### Check List (For Author) - Test: Manual test - ./build.sh --be - git diff --check - build-support/run-clang-tidy.sh --base HEAD (fails on pre-existing diagnostics in touched files, including jni-util.h static_assert(false), deprecated headers, and existing function complexity warnings) - Behavior changed: No - Does this need documentation: No
### What problem does this PR solve? Issue Number: None Related PR: apache#64413 Problem Summary: FE-filled key column schema keeps scan output aligned with the storage schema, so BE no longer needs to carry virtual-column block index/type mappings through scan and storage readers. This removes those mapping fields from OlapScanner, TabletReader, RowsetReaderContext, StorageReadOptions, and SegmentIterator, derives virtual column type from its expression context, routes common expression filtering through VExprContext::execute_conjuncts, and materializes virtual columns through VExprContext::execute. ### Release note None ### Check List (For Author) - Test: Manual test - build-support/clang-format.sh - git diff --check - ./build.sh --be - build-support/run-clang-tidy.sh --base HEAD (stopped after noisy existing full-file/header diagnostics; fixed the reported changed-line use-after-move in vcollect_iterator.cpp) - Behavior changed: No - Does this need documentation: No
### What problem does this PR solve? Issue Number: None Related PR: apache#64413 Problem Summary: FE-filled key columns are carried in the scan node thrift only when non-empty, and BE no longer keeps a separate LocalState copy of those filled key slot ids. OlapScanner reads the thrift field directly while initializing return columns and still records filled storage column ids for direct-mode readers to avoid reading FE-added key columns when storage merge is unnecessary. ### Release note None ### Check List (For Author) - Test: Manual test - build-support/clang-format.sh - git diff --check - ./build.sh --be --fe - build-support/run-clang-tidy.sh --base HEAD (failed on existing full-function/header/JNI static_assert diagnostics in the touched translation units) - Behavior changed: No - Does this need documentation: No
### What problem does this PR solve? Issue Number: None Related PR: apache#64413 Problem Summary: After aligning scan slots with the storage key schema, the scan path still kept a separate filled-key scanner cache and a dedicated SegmentIterator helper for filled columns. This removes the redundant scanner member, records filled key columns directly in ReaderParams only for direct reads, and folds filled-column default generation into the generic prune path. Other BlockReader and SegmentIterator mapping state remains because it still maps expanded storage-read blocks, delete-predicate columns, virtual columns, and expression columns back to caller blocks. ### Release note None ### Check List (For Author) - Test: Manual test - build-support/clang-format.sh be/src/exec/scan/olap_scanner.cpp be/src/exec/scan/olap_scanner.h be/src/storage/segment/segment_iterator.cpp be/src/storage/segment/segment_iterator.h - git diff --check - ./build.sh --be - Behavior changed: No - Does this need documentation: No
|
run buildall |
|
/review |
There was a problem hiding this comment.
I found two blocking correctness issues and am requesting changes.
Critical checkpoint conclusions:
- Goal/test: the PR aims to align OLAP scan schemas with storage keys for AGG and non-MOW UNIQUE tables, but the current implementation has correctness regressions in lazy materialization and common-expression filtering. I did not find tests covering those edge cases.
- Scope/focus: the PR diff is focused on scan-schema alignment and related BE reader cleanup.
- Concurrency/lifecycle/config: no new thread, lock, special lifecycle, or config-item risk was found in the PR diff.
- Compatibility/FE-BE variables: the new optional thrift field is directionally compatible and the main FE/BE send path is wired, but the FE projection remapping conflicts with the lazy-materialized scan path.
- Parallel paths/conditions: direct/non-direct scan paths were reviewed; the lazy-materialization path and common-expression pushdown path need fixes before this can be considered correct.
- Testing/result files:
git diff --check FETCH_HEAD..HEADpassed. Full Doris tests were not run in this review runner. Please add coverage for non-MOW UNIQUE scans with filled keys plus TopN lazy materialization, and for common expr pushdown when column 0 is not materialized before expression evaluation. - Data correctness/version/delete bitmap/memory/error handling: no visible-version, delete-bitmap, transaction, or memory-accounting issue was found in the touched code, but the two execution-path issues below can silently produce wrong query results. Status handling generally remains explicit.
- Observability/performance: no additional observability requirement was found; the intended performance optimization is reasonable after the correctness issues are addressed.
User focus: .code-review.2nSCsc/review_focus.txt contained no additional focus points.
| common_ctxs.push_back(ctx.get()); | ||
| } | ||
| _output_index_result_column(common_ctxs, _sel_rowid_idx.data(), _selected_size, block); | ||
| RETURN_IF_ERROR(_execute_common_expr(_sel_rowid_idx.data(), _selected_size, block)); |
There was a problem hiding this comment.
This drops rows for common-expression predicates when the first block column has not been materialized. _process_common_expr() can still be called with selected_size > 0 while block->rows() == 0: for example, when the common expr reads a later column and column 0 is neither a predicate/common-expr column nor otherwise output before this point. The removed code immediately above this hunk installed a temporary column 0 exactly to give the block selected_size rows. In release builds this new code builds an empty filter from block->rows(), _evaluate_common_expr_filter(..., selected_size, filter) returns 0, and the whole batch is silently filtered out. Please preserve a selected_size-sized row-count column, or otherwise make the conjunct execution/filtering operate on selected_size rows instead of block->rows().
| context.addScanNode(olapScanNode, olapScan); | ||
|
|
||
| translateRuntimeFilter(olapScan, olapScanNode, context); | ||
| if (!storageAlignedScanSlots.filledKeyExprIds.isEmpty()) { |
There was a problem hiding this comment.
This projection remaps every output ExprId in PlanTranslatorContext to the new projection tuple slot ids. That breaks visitPhysicalLazyMaterializeOlapScan: after computePhysicalOlapScan(lazyScan.getScan(), context) returns, it computes scanIds with context.findSlotRef(...) and then prunes olapScanNode.getTupleDesc() (the original scan tuple) using those ids. When a filled storage key is added for a non-MOW UNIQUE scan and another value column is lazy-materialized, the ids in scanIds are projection tuple ids, so the real scan tuple slots are removed while the project list still points at the old scan slots. Please avoid this projection/remap for the lazy-materialized scan path, or preserve/use the original scan tuple slot mapping when pruning.
Cloud UT Coverage ReportIncrement line coverage Increment coverage report
|
FE UT Coverage ReportIncrement line coverage |
TPC-DS: Total hot run time: 158309 ms |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
FE Regression Coverage ReportIncrement line coverage |
What problem does this PR solve?
Issue Number: N/A
Related PR: N/A
Problem Summary:
This is an alternative to keeping separate storage and execution schemas in the BE segment iterator for storage expression pushdown.
For AGG key tables and non-MOW unique key tables, the planner now expands the OLAP scan tuple so the scan schema starts with the storage key columns in storage order. Missing key slots are marked in thrift as FE-filled key slots. The scan node projects back to the original execution output after storage filters and runtime filters are translated.
On the BE side, the marked key slots are mapped to tablet column ids. Direct reader paths can fill those storage-only key columns without reading their data pages, while merge and aggregation reader paths still keep real key columns for storage semantics. Since FE now provides the storage-aligned key prefix, the non-direct scanner no longer rebuilds the return column list as "all keys plus value outputs"; it only checks the key prefix and keeps the existing sequence-column expansion.
Release note
None
Check List (For Author)
build-support/clang-format.shgit diff --check./build.sh --fe./build.sh --bebuild-support/run-clang-tidy.sh --build-dir be/build_Release --base 23e21f44f0080e25deff5dcdda2e61af3efc9480was attempted, but this checkout reports existing file-level clang-tidy diagnostics in the touched storage files before producing a clean changed-line gate.