[refactor](be) Update column nullable semantics#64407
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
Issue Number: N/A
Related PR: N/A
Problem Summary: `IColumn::is_nullable()` only identified physical `ColumnNullable`, while callers also need nullable semantics for `ColumnConst(ColumnNullable)`. The old split between semantic nullable checks and physical nullable checks made call sites ambiguous. This change removes `is_concrete_nullable()`, makes `ColumnConst` report nullable when its nested data column is nullable, and migrates physical-column branches to `is_column_nullable()` or `check_and_get_column<ColumnNullable>()`. Const nullable join keys and row-copy paths are materialized or unwrapped explicitly so null maps are propagated correctly.
None
- Test: Unit Test / Regression test / Build / Style
- `./run-be-ut.sh --run --filter=ColumnConstTest.* -j 90`
- `./run-be-ut.sh --run --filter=ColumnConstTest.*:RuntimeFilterWrapperTest.TestBitMap:HashJoinProbeOperatorTest.*:HashJoinBuildSinkTest.*:NLJAppendProbeDataWithNullTest.*:VIcebergDeleteSinkTest.*:IcebergDeleteFileReaderHelperTest.* -j 90`
- `./build.sh --be --fe`
- `PATH=/mnt/disk6/common/ldb_toolchain_toucan/bin:$PATH build-support/check-format.sh`
- `./run-regression-test.sh --run -f regression-test/suites/correctness_p0/test_runtime_filter_outer_join_nullable_side.groovy`
- `./run-regression-test.sh --run -f regression-test/suites/correctness_p0/test_null_equal.groovy`
- `./run-regression-test.sh --run -f regression-test/suites/correctness_p0/test_outer_join_with_null_value.groovy`
- `./run-regression-test.sh --run -f regression-test/suites/query_p0/sql_functions/cast_function/test_nullable_functions.groovy`
- `./run-regression-test.sh --run -f regression-test/suites/query_p0/sql_functions/array_functions/test_array_functions_by_literal.groovy`
- `./run-regression-test.sh --run -f regression-test/suites/query_p0/sql_functions/json_functions/test_json_extract.groovy`
- `./run-regression-test.sh --run -f regression-test/suites/query_p0/aggregate/map_agg_v1.groovy`
- `./run-regression-test.sh --run -f regression-test/suites/dictionary_p0/test_dict_get_nullable.groovy`
- `./run-regression-test.sh --run -f regression-test/suites/schema_change_p0/test_alter_table_column_nullable.groovy`
- Behavior changed: Yes, internal column nullability predicate now treats const nullable as nullable; SQL behavior is intended to stay unchanged
- Does this need documentation: No
|
run buildall |
|
/review |
There was a problem hiding this comment.
Requesting changes. The PR goal is valid, but the nullable semantic flip is not complete enough to be safe yet.
Critical checkpoints:
- Goal alignment: Partially met. The change makes Const(Nullable) visible to semantic nullable checks, but some physical-nullable assumptions remain.
- Correctness/regressions: Blocking issues found. One parser path copies a one-row const null map as row-aligned, and at least one existing function family now casts ColumnConst as ColumnNullable.
- Tests: Existing tests cover many nullable paths, but they do not cover mixed vector plus const-nullable arguments for these failures. I did not run the full suite locally;
git diff --checkpassed. The observed BE UT macOS GitHub check failed before tests due a JDK 25 vs JDK 17 runner mismatch, so I did not treat that as a product regression. - Scope/compatibility: The blast radius is broad because
IColumn::is_nullable()is a core virtual method. Remaining unmodified callers must be audited before changing this contract. - Concurrency/lifecycle/locks: No new concurrency or lifecycle issue found in the reviewed BE paths.
- Config/FE-BE/protocol/security: No config, protocol, or security-boundary change found.
- Storage/txn/write paths: No transaction or write-consistency issue found beyond the nullable-physical-column checks reviewed in storage readers/schema change.
- Parallel/old-new paths: Hash join and nested loop join changes appear intentional, but parallel existing function paths are not all migrated.
- Performance/observability: No standalone performance or observability blocker found.
- User focus: No additional user-provided focus points were present.
TPC-H: Total hot run time: 29398 ms |
TPC-DS: Total hot run time: 169311 ms |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
/review |
|
run buildall |
There was a problem hiding this comment.
I found two blocking correctness issues in dictionary loading. Both are separate dictionary-specific instances of the new ColumnConst(ColumnNullable) semantics and are not duplicates of the existing broad ColumnConst::is_nullable() review thread.
Critical checkpoints:
- Goal and behavior: The PR aims to make const nullable columns semantically nullable and migrate physical-column checks. That migration is incomplete in dictionary load paths, causing NULL keys/values to be mishandled.
- Correctness and edge cases: Constant nullable dictionary keys can be retained instead of skipped, and constant nullable attribute values can lose their null map and become non-null defaults.
- Scope and consistency: Most touched sites move toward explicit physical checks or const-aware null-map helpers, but the dictionary paths need the same const-aware handling.
- Concurrency/lifecycle/config/storage compatibility: I did not find new issues in those areas for the reviewed changes.
- Tests: The current coverage does not appear to exercise const-nullable dictionary keys or values; please add targeted coverage for these cases.
- User focus: No additional user-provided review focus was present in
.code-review.i1G8lP/review_focus.txt.
TPC-H: Total hot run time: 29454 ms |
TPC-DS: Total hot run time: 168469 ms |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
PR approved by at least one committer and no changes requested. |
What problem does this PR solve?
Issue Number: N/A
Related PR: N/A
Problem Summary:
IColumn::is_nullable()only identified physicalColumnNullable, while callers also need nullable semantics forColumnConst(ColumnNullable). The old split between semantic nullable checks and physical nullable checks made call sites ambiguous. This change removesis_concrete_nullable(), makesColumnConstreport nullable when its nested data column is nullable, and migrates physical-column branches tois_column_nullable()orcheck_and_get_column<ColumnNullable>(). Const nullable join keys and row-copy paths are materialized or unwrapped explicitly so null maps are propagated correctly.Release note
None
Check List (For Author)
./run-be-ut.sh --run --filter=ColumnConstTest.* -j 90./run-be-ut.sh --run --filter=ColumnConstTest.*:RuntimeFilterWrapperTest.TestBitMap:HashJoinProbeOperatorTest.*:HashJoinBuildSinkTest.*:NLJAppendProbeDataWithNullTest.*:VIcebergDeleteSinkTest.*:IcebergDeleteFileReaderHelperTest.* -j 90./build.sh --be --fePATH=/mnt/disk6/common/ldb_toolchain_toucan/bin:$PATH build-support/check-format.sh./run-regression-test.sh --run -f regression-test/suites/correctness_p0/test_runtime_filter_outer_join_nullable_side.groovy./run-regression-test.sh --run -f regression-test/suites/correctness_p0/test_null_equal.groovy./run-regression-test.sh --run -f regression-test/suites/correctness_p0/test_outer_join_with_null_value.groovy./run-regression-test.sh --run -f regression-test/suites/query_p0/sql_functions/cast_function/test_nullable_functions.groovy./run-regression-test.sh --run -f regression-test/suites/query_p0/sql_functions/array_functions/test_array_functions_by_literal.groovy./run-regression-test.sh --run -f regression-test/suites/query_p0/sql_functions/json_functions/test_json_extract.groovy./run-regression-test.sh --run -f regression-test/suites/query_p0/aggregate/map_agg_v1.groovy./run-regression-test.sh --run -f regression-test/suites/dictionary_p0/test_dict_get_nullable.groovy./run-regression-test.sh --run -f regression-test/suites/schema_change_p0/test_alter_table_column_nullable.groovyWhat problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)