Enable logical planning for UPDATE ... FROM and preserve joined assignment qualifiers #21530

Open
kosiew wants to merge 7 commits into apache:main from kosiew:update-01-19950

@kosiew kosiew commented Apr 10, 2026

Which issue does this PR close?


Rationale for this change

Previously, UPDATE ... FROM statements were rejected early in the SQL planner, preventing even valid single-target update cases from reaching logical planning. This blocked progress toward full support for joined updates and made it difficult to incrementally implement the feature.

Additionally, the existing assignment extraction logic stripped column qualifiers unconditionally, which caused incorrect expressions when updates referenced joined sources (e.g. SET t1.a = t2.b).

This PR enables logical planning for UPDATE ... FROM and introduces a more precise analysis of the input plan so that assignment expressions and filters are handled correctly depending on whether joins are present.
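
To make the qualifier problem concrete, here is a minimal, self-contained sketch of the rule described above. It does not use DataFusion's types: `Column`, `Assignment`, and `normalize_assignment` are hypothetical stand-ins, and the `has_joins` flag is assumed to come from some earlier analysis of the input plan.

```rust
/// Toy column reference: an optional table qualifier plus a column name.
/// (Hypothetical type for illustration, not DataFusion's `Column`.)
#[derive(Debug, Clone, PartialEq)]
struct Column {
    relation: Option<String>,
    name: String,
}

impl Column {
    fn new(relation: Option<&str>, name: &str) -> Self {
        Column {
            relation: relation.map(str::to_string),
            name: name.to_string(),
        }
    }
}

/// One `SET target = source` pair, reduced to a plain column source.
#[derive(Debug, Clone, PartialEq)]
struct Assignment {
    target: String,
    source: Column,
}

/// Assumed rule, modeled on the PR description: strip the qualifier only
/// when the input has no joins. With joins present, `t2.b` must stay
/// qualified so it cannot be confused with a same-named target column.
fn normalize_assignment(mut assignment: Assignment, has_joins: bool) -> Assignment {
    if !has_joins {
        assignment.source.relation = None;
    }
    assignment
}
```

With unconditional stripping, `SET a = t2.b` would collapse to `SET a = b`, which reads as a reference to the target's own `b` whenever both tables have a column with that name.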


What changes are included in this PR?

  • Planner enablement

    • Removed the early rejection of UPDATE ... FROM in the SQL layer.
    • Logical planner now produces joined update plans (e.g. via Cross Join + Filter).
  • Fail-closed physical planning

    • Added a guard in the physical planner to reject joined updates (UPDATE ... FROM) until execution support is implemented.
    • Ensures correctness by preventing unsafe execution paths.
  • DML input analysis

    • Introduced DmlInputAnalysis to:

      • Detect whether the input contains joins
      • Track valid target table references and aliases
    • Added analyze_dml_input and analyze_target_branch helpers.

  • Filter extraction improvements

    • Reworked filter extraction to use analysis-driven target reference tracking.
    • Ensures only predicates relevant to the target table are pushed down.
  • Assignment extraction overhaul

    • Refactored assignment extraction into:

      • find_update_projection
      • append_update_assignments
      • extract_update_assignments_with_analysis
    • Preserves qualifiers for joined-source expressions.

    • Strips qualifiers only for single-table updates to maintain provider compatibility.

    • Improved identity-assignment detection to respect table references.

  • SQL planning tests

    • Added logical plan tests verifying:

      • UPDATE ... FROM with joins
      • Alias handling
      • EXPLAIN UPDATE ... FROM
  • Physical planner and DML tests

    • Updated tests to assert fail-closed behavior for joined updates.

    • Added coverage for:

      • joined assignment preservation (t2.col)
      • alias-qualified references
      • same-name column disambiguation
      • self-joins
      • single-table qualifier stripping
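
The DML input analysis listed above can be pictured with a toy plan type. This is only a sketch under stated assumptions: the `Plan` enum, the field names on `DmlInputAnalysis`, and both function signatures are invented for illustration (only the names `DmlInputAnalysis`, `analyze_dml_input`, and `analyze_target_branch` come from this PR), and the sketch assumes the update target sits on the left side of the cross join.

```rust
use std::collections::HashSet;

/// Toy logical plan, just enough to model an UPDATE input.
/// (Hypothetical; not DataFusion's `LogicalPlan`.)
#[derive(Debug)]
enum Plan {
    TableScan { table: String },
    SubqueryAlias { alias: String, input: Box<Plan> },
    Filter { input: Box<Plan> },
    CrossJoin { left: Box<Plan>, right: Box<Plan> },
}

/// Result of the analysis: whether a join was seen, and which names
/// (table names and aliases) refer to the update target.
#[derive(Debug, Default)]
struct DmlInputAnalysis {
    has_joins: bool,
    target_refs: HashSet<String>,
}

fn analyze_dml_input(plan: &Plan) -> DmlInputAnalysis {
    let mut analysis = DmlInputAnalysis::default();
    analyze_target_branch(plan, &mut analysis);
    analysis
}

/// Walks only the branch that leads to the update target.
fn analyze_target_branch(plan: &Plan, analysis: &mut DmlInputAnalysis) {
    match plan {
        Plan::TableScan { table } => {
            analysis.target_refs.insert(table.clone());
        }
        Plan::SubqueryAlias { alias, input } => {
            analysis.target_refs.insert(alias.clone());
            analyze_target_branch(input, analysis);
        }
        Plan::Filter { input } => analyze_target_branch(input, analysis),
        Plan::CrossJoin { left, .. } => {
            analysis.has_joins = true;
            // Assumption: the target is the left input. The right (source)
            // branch is deliberately not visited, so a self-join alias such
            // as `src` is never recorded as a target reference.
            analyze_target_branch(left, analysis);
        }
    }
}
```

For a self-join shaped like `UPDATE t1 SET a = src.a FROM t1 AS src`, this walk records `t1` as a target reference and leaves `src` out, which is the distinction the self-join regression test relies on.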

Are these changes tested?

Yes.

This PR adds new tests and updates existing ones across both the SQL and physical planning layers:

  • Logical plan tests confirm that UPDATE ... FROM is now accepted and produces the expected plan structure.

  • Physical planner tests verify that execution still fails with a clear error (UPDATE ... FROM is not supported).

  • Unit tests validate assignment extraction behavior for:

    • joined sources
    • aliases
    • column name conflicts
    • self-joins
    • single-table updates

These tests ensure both correctness and safe incremental rollout of the feature.
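
As a rough illustration of why the self-join and column-name-conflict cases matter, the sketch below shows one way identity-assignment detection could take table references into account. `ColumnRef` and `is_identity_assignment` are hypothetical names, and treating an unqualified source column as belonging to the target is an assumption of this sketch, not something the PR states.

```rust
use std::collections::HashSet;

/// Toy source-side column reference (hypothetical type).
#[derive(Debug)]
struct ColumnRef {
    relation: Option<String>,
    name: String,
}

/// Returns true when `SET target_column = source` leaves the column
/// unchanged. A matching column name alone is not enough: the source
/// must also resolve to the target table (or one of its aliases).
fn is_identity_assignment(
    target_column: &str,
    source: &ColumnRef,
    target_refs: &HashSet<String>,
) -> bool {
    if source.name != target_column {
        return false;
    }
    match &source.relation {
        // Assumption: an unqualified column refers to the target table.
        None => true,
        Some(relation) => target_refs.contains(relation),
    }
}
```

Under these assumptions, `SET a = t1.a` on target `t1` is an identity assignment, while `SET a = src.a` in a self-join is a real assignment even though both sides name column `a`.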


Are there any user-facing changes?

Yes.

  • UPDATE ... FROM is no longer rejected during SQL planning.

  • Users can now run EXPLAIN UPDATE ... FROM and inspect logical plans.

  • Execution of such queries still fails with:

    This feature is not implemented: UPDATE ... FROM is not supported
    

This provides improved visibility while maintaining safe behavior until full execution support is implemented.
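
The fail-closed behavior can be sketched as a guard that runs before any execution plan is built. The `PlanError` type and `plan_update` function below are hypothetical stand-ins for the physical planner; only the error text is taken from this PR.

```rust
use std::fmt;

/// Toy error type whose display format mirrors the message quoted above.
/// (Hypothetical; not DataFusion's error type.)
#[derive(Debug, PartialEq)]
enum PlanError {
    NotImplemented(String),
}

impl fmt::Display for PlanError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PlanError::NotImplemented(msg) => {
                write!(f, "This feature is not implemented: {}", msg)
            }
        }
    }
}

/// Hypothetical physical-planning entry point for UPDATE. Joined updates
/// are rejected up front, so no partially supported execution path runs.
fn plan_update(has_joins: bool) -> Result<&'static str, PlanError> {
    if has_joins {
        return Err(PlanError::NotImplemented(
            "UPDATE ... FROM is not supported".to_string(),
        ));
    }
    // Placeholder for the existing single-table path.
    Ok("single-table update plan")
}
```

Placing the check at the start of physical planning means logical planning and EXPLAIN can succeed while execution stays blocked.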


LLM-generated code disclosure

This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.

kosiew added 7 commits April 10, 2026 18:08
  • Restore support for single-source UPDATE ... FROM in the planner
    by removing the rejection of early joined update plans. Move the
    safety block to the physical planner to ensure joined updates are
    safeguarded. Add test coverage for logical shapes and mock
    schemas, and update execution regression tests to confirm
    successful planning while maintaining fail-closed behavior.

  • Ensure EXPLAIN UPDATE ... FROM fails during SQL planning,
    instead of misleadingly passing to physical_plan_error.
    Maintain the physical-planner guard for direct execution
    failures. Update joined assignment extraction to preserve
    source references and avoid misclassifying columns in
    single-table updates. Add regression tests in
    sql_integration.rs and unit tests in physical_planner.rs.

  • Remove the EXPLAIN-time UPDATE ... FROM rejection in
    statement.rs to allow the SQL planner to expose the joined
    logical plan. Adjust regression test in sql_integration.rs
    to assert the Explain -> Dml(Update) plan shape.

    Consolidate duplicated projection-walking logic in
    physical_planner.rs by using a shared helper function
    for extract_update_assignments(). This simplifies
    identity-check and qualifier-normalization rules.

  • Adjust update.slt to ensure both EXPLAIN UPDATE ... FROM
    cases account for successful logical planning in addition
    to the existing physical-planner rejection. Align Utf8View
    cast with reports from sqllogictest in the filter for
    better consistency.

  • Update the alias collection logic to only traverse the update
    target branch, preventing self-join source aliases from being
    confused with target aliases. Add a regression test ensuring
    the correct assignment of src.a in the UPDATE statement
    for improved accuracy in query execution.

  • Consolidate duplicated joined-update and target-alias walks in
    physical_planner.rs by implementing a shared analyze_dml_input(...)
    helper. Update the filter and assignment extraction to utilize this
    common metadata. In sql_integration.rs, encapsulate the t1/t2
    setup within a local UpdatePlanningContextProvider for new
    joined-update planner tests, eliminating unnecessary table names
    from the shared mock catalog in common/mod.rs.

  • Simplify physical planner by reusing DmlInputAnalysis and
    centralizing projection lookup. Streamline assignment extraction
    with iterators. Reduce duplication in SQL planning setup by
    introducing a shared helper and improve context provider to
    reuse stored schemas for efficiency. Enhance test scaffolding
    with shared update schema and new assertion utilities.

@github-actions bot added the sql (SQL Planner), core (Core DataFusion crate), and sqllogictest (SQL Logic Tests (.slt)) labels Apr 10, 2026
@kosiew kosiew marked this pull request as ready for review April 10, 2026 10:33