Remove DatetimeOutputCastRule on the analytics-engine route (#5420)#5454
Remove DatetimeOutputCastRule on the analytics-engine route (#5420)#5454mengweieric wants to merge 1 commit into
Conversation
PR Reviewer Guide 🔍(Review updated until commit cf16199)Here are some key observations to aid the review process:
|
…ch-project#5420) ## Problem On the analytics-engine route, every datetime root column is wrapped in `CAST(<DATE/TIME/TIMESTAMP> AS VARCHAR)` by `DatetimeOutputCastRule`. A matching `DatetimeOutputCastRewriter` on the engine side then translates that cast into DataFusion's `to_char` extension. Whenever the rewriter's format string and the PPL formatter disagree (e.g. trailing `Z`, `T` separator), users see wire-format divergence — issue opensearch-project#5420. ## Solution Drop the cast rule and let the engine return real datetime cells. The PPL response pipeline already handles datetime → string conversion natively at the formatter layer: - `AnalyticsExecutionEngine.convertRows` feeds engine cells (`LocalDateTime` / `LocalDate` / `LocalTime`) through `ExprValueUtils.fromObjectValue`, which produces `ExprTimestampValue` / `ExprDateValue` / `ExprTimeValue`. - Their `value()` methods render the documented PPL space-separated format (`yyyy-MM-dd HH:mm:ss[.SSSSSSSSS]` etc.). The companion change in `opensearch-project/OpenSearch` removes the matching `DatetimeOutputCastRewriter` and its callsites. ## Changes - Delete `DatetimeOutputCastRule`. - Drop the unused `UdtMapping.isDatetimeType` helper. - Rewrite the four cast-shape assertions in `DatetimeExtensionTest` to assert the post-removal RelNode (no `CAST(... VARCHAR)` wrapper, schema reports `DATE` / `TIME` / `TIMESTAMP`). Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
bfc54ca to
cf16199
Compare
|
Persistent review updated to latest commit cf16199 |
dai-chen
left a comment
There was a problem hiding this comment.
Thanks for the fix! Please check the CI failure.
Problem
On the analytics-engine route, every datetime root column is wrapped in
CAST(<DATE/TIME/TIMESTAMP> AS VARCHAR)byDatetimeOutputCastRule. A matchingDatetimeOutputCastRewriteron the engine side then translates that cast into DataFusion'sto_charextension. Whenever the rewriter's format string and the PPL formatter disagree (e.g. trailingZ,Tseparator), users see wire-format divergence — issue #5420.Solution
Drop the cast rule and let the engine return real datetime cells. The PPL response pipeline already handles datetime → string conversion natively at the formatter layer:
AnalyticsExecutionEngine.convertRowsfeeds engine cells (LocalDateTime/LocalDate/LocalTime) throughExprValueUtils.fromObjectValue, which producesExprTimestampValue/ExprDateValue/ExprTimeValue.value()methods render the documented PPL space-separated format (yyyy-MM-dd HH:mm:ss[.SSSSSSSSS]etc.).The companion change in opensearch-project/OpenSearch removes the matching
DatetimeOutputCastRewriterand its callsites.Changes
DatetimeOutputCastRule.UdtMapping.isDatetimeTypehelper.DatetimeExtensionTestto assert the post-removal RelNode (noCAST(... VARCHAR)wrapper, schema reportsDATE/TIME/TIMESTAMP).Test plan
:api:test --tests DatetimeExtensionTest— 8/8 pass:api:spotlessJavaCheck:integ-test:integTestRemoteagainst force-routed AE clusterFixes #5420