Skip to content

Add explain support for analytics engine path#5275

Merged
dai-chen merged 12 commits intoopensearch-project:feature/mustang-ppl-integrationfrom
ahkcs:pr/mustang-explain-observability
Mar 27, 2026
Merged

Add explain support for analytics engine path#5275
dai-chen merged 12 commits intoopensearch-project:feature/mustang-ppl-integrationfrom
ahkcs:pr/mustang-explain-observability

Conversation

@ahkcs
Copy link
Copy Markdown
Collaborator

@ahkcs ahkcs commented Mar 26, 2026

Summary

Adds explain support for the analytics engine path (#5247). This is the second PR building on #5267 (query routing and execution handoff).

This PR covers Step 3 (explain)

Step 3: Explain support [Done]

Return logical RelNode plan via _plugins/_ppl/_explain, include physical plan if Analytics engine provides QueryPlanExecutor.explain() API.

  • AnalyticsExecutionEngine.explain(): Returns logical plan via RelOptUtil.toString() with mode-aware SqlExplainLevel (SIMPLE/STANDARD/COST). Physical and extended plans are null since the analytics engine is not yet available.

  • RestUnifiedQueryAction.explain(): New entry point for explain requests on the analytics path. Formats response via JsonResponseFormatter with normalizeLf().

  • TransportPPLQueryAction routing: Checks isExplainRequest() for analytics indices and routes to unifiedQueryHandler.explain() instead of execute().

  • AnalyticsExplainIT: Dedicated explain test class following CalciteExplainIT pattern. Compares actual explain JSON against expected output files in resources/expectedOutput/analytics/. Covers: simple scan, project, filter+project, aggregation, sort (with collation propagation), eval.

Explain response format

{
  "calcite": {
    "logical": "LogicalSystemLimit(...)\n  LogicalProject(ts=[$0], message=[$2])\n    LogicalFilter(condition=[=($1, 200)])\n      LogicalTableScan(table=[[opensearch, parquet_logs]])\n"
  }
}

Physical plan will be populated when the analytics engine provides a QueryPlanExecutor.explain() API.

Changed Files

File Change
core/.../analytics/AnalyticsExecutionEngine.java Added explain(RelNode, ExplainMode, ...) method
plugin/.../rest/RestUnifiedQueryAction.java Added explain() entry point and createExplainListener()
plugin/.../transport/TransportPPLQueryAction.java Route explain requests for analytics indices
integ-test/.../ppl/AnalyticsExplainIT.java New: 6 explain tests with expected output file comparison
integ-test/.../resources/expectedOutput/analytics/*.json New: 6 expected explain output files

Test plan

  • 2 unit tests -- explain logical plan, explain error propagation
  • 5 integration tests in AnalyticsPPLIT -- explain response structure, project+scan, filter, aggregation, sort
  • 6 integration tests in AnalyticsExplainIT -- JSON comparison against expected output files (scan, project, filter+project, aggregation, sort, eval)

opensearch-project#5247)

Add explain support for the analytics engine path:

- AnalyticsExecutionEngine.explain(RelNode, ExplainMode, ...): returns
  logical plan via RelOptUtil.toString() with mode-aware SqlExplainLevel
  (SIMPLE/STANDARD/COST). Physical and extended plans are null since the
  analytics engine is not yet available.

- RestUnifiedQueryAction.explain(): new entry point for explain requests,
  delegates to AnalyticsExecutionEngine.explain() with ExplainMode.STANDARD.
  Response formatted via JsonResponseFormatter with normalizeLf().

- TransportPPLQueryAction: routes explain requests for analytics indices
  to unifiedQueryHandler.explain() instead of unifiedQueryHandler.execute().

Integration tests (AnalyticsPPLIT):
- testExplainResponseStructure: verifies JSON shape (calcite.logical,
  calcite.physical=null, calcite.extended=null)
- testExplainProjectAndScan: LogicalProject + LogicalTableScan + table name
- testExplainFilterPlan: LogicalFilter with condition value
- testExplainAggregationPlan: LogicalAggregate with COUNT()
- testExplainSortPlan: LogicalSort

Unit tests (AnalyticsExecutionEngineTest):
- explainRelNode_returnsLogicalPlan
- explainRelNode_errorPropagation

Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

@ahkcs ahkcs added PPL Piped processing language enhancement New feature or request labels Mar 26, 2026
@ahkcs ahkcs changed the title [Mustang] Add explain support for analytics engine path Add explain support for analytics engine path Mar 26, 2026
Add dedicated explain integration test class following the CalciteExplainIT
pattern: each test compares actual explain JSON against expected output
files in resources/expectedOutput/analytics/. Tests cover simple scan,
project, filter+project, aggregation, sort (with collation propagation
to LogicalSystemLimit), and eval.

Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

Use a real Calcite LogicalValues node instead of a mock so
RelOptUtil.toString() produces a deterministic plan. Assert the exact
expected logical plan string instead of just checking non-null.

Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

Explain is thoroughly covered by AnalyticsExplainIT with expected output
file comparison (6 tests). Remove redundant explain unit tests and
unused captureExplainListener helper. Execute unit tests unchanged.

Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

Explain is covered by AnalyticsExplainIT with expected output file
comparison (6 tests). Remove redundant explain tests and
extractLogicalPlan helper from AnalyticsPPLIT.

Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

plan, planContext, createQueryListener(queryType, planTime, listener));
if (isExplain) {
analyticsEngine.explain(
plan, ExplainMode.STANDARD, planContext, createExplainListener(listener));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExplainMode.STANDARD is hardcoded here, so the user's ?mode= parameter is ignored. pplRequest.mode() (PPLQueryRequest.java:103) already parses the request mode — pass it through instead of the constant so that simple, cost, and extended requests produce the correct SqlExplainLevel.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Now using pplRequest.mode() instead of hardcoded ExplainMode.STANDARD

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note down this as part of contract between analytics engine. Either they can provide explain API (e.g., QueryPlanExecutor.explain()) or we only show logical plan for this execution path and document it somewhere. Thanks!

- Add YAML format support to createExplainListener() by checking
  pplRequest.getFormat(), matching TransportPPLQueryAction pattern
- Switch AnalyticsExplainIT from explainQueryToString (JSON) to
  explainQueryYaml (YAML) with assertYamlEqualsIgnoreId
- Replace JSON expected files with YAML expected files
- YAML serializer omits null fields (physical/extended), so expected
  files only contain the logical plan

Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

Pass pplRequest.mode() to analyticsEngine.explain() so the user's
?mode= parameter (simple, standard, cost, extended) is respected.

Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

Only apply normalizeLf() for YAML format, return response directly for
JSON format. Matches TransportPPLQueryAction.createExplainResponseListener()
which uses normalizeLf for YAML but returns response as-is for JSON.

Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

@ahkcs ahkcs requested a review from penghuo March 27, 2026 18:36
Encapsulate the ExplainMode to SqlExplainLevel mapping in the enum
itself instead of repeating ternary logic in each execution engine.
AnalyticsExecutionEngine now uses mode.toExplainLevel() directly.

Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

Remove LOG.info calls and Logger field. The YAML comparison assertion
already provides clear output on failure.

Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

…in formatting

Remove duplicate createExplainListener from RestUnifiedQueryAction.
explain() now returns ExplainResponse via ResponseListener, and
TransportPPLQueryAction wraps it with its existing
createExplainResponseListener for format-aware (JSON/YAML) formatting.
This avoids duplicating the format selection logic.

Signed-off-by: Kai Huang <kaihuang@amazon.com>
Signed-off-by: Kai Huang <ahkcs@amazon.com>
@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

@ahkcs ahkcs requested a review from dai-chen March 27, 2026 20:54
Copy link
Copy Markdown
Collaborator

@dai-chen dai-chen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes!

@dai-chen dai-chen merged commit 40cb73b into opensearch-project:feature/mustang-ppl-integration Mar 27, 2026
36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request PPL Piped processing language

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants