
feat: Support Spark expression window_time#3732

Open
0lai0 wants to merge 3 commits into apache:main from 0lai0:expression_window_time

Conversation


0lai0 (Contributor) commented Mar 19, 2026

Which issue does this PR close?

Closes #3138

Rationale for this change

Comet previously did not support the Spark PreciseTimestampConversion expression.
This expression is not called directly by users, but is generated internally by Spark's Analyzer when it rewrites window_time(window_column) into a combination of GetStructField, Subtract, and PreciseTimestampConversion. Since Comet already supports GetStructField and Subtract but not PreciseTimestampConversion, queries using window_time would fall back to JVM execution.
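As a concrete illustration, a hypothetical query of the following shape is the kind that previously fell back to JVM execution (the table and column names here are invented for illustration, not taken from the PR):

```scala
// Hypothetical sketch: the inline table t(ts, v) is made up for illustration.
// Spark's Analyzer rewrites window_time(window) into roughly
//   PreciseTimestampConversion(Subtract(..., 1), LongType, TimestampType)
// over GetStructField(window, "end"), i.e. window.end minus one microsecond.
// With this PR, every piece of that rewritten expression can be serialized
// to Comet's native engine, so the projection no longer falls back to the JVM.
val df = spark.sql("""
  SELECT window, window_time(window) AS wt, count(*) AS cnt
  FROM VALUES (timestamp'2023-01-01 12:15:00', 1) AS t(ts, v)
  GROUP BY window(ts, '5 minutes')
""")
df.show()
```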

What changes are included in this PR?

This change adds a Serde handler for PreciseTimestampConversion that reuses the existing Cast protobuf and Rust implementation. Since TimestampType (microseconds) and LongType share the same 64-bit memory layout in Arrow, a simple cast suffices without any new native code paths.

  • datetime.scala: Added the CometPreciseTimestampConversion handler
  • QueryPlanSerde.scala: Registered the handler in the temporalExpressions map
  • window_time.sql: Added a window_time test
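A minimal sketch of what such a handler could look like (the serde trait, method signature, and helper names below are assumptions for illustration only; the actual code lives in the PR's datetime.scala):

```scala
// Sketch only: the serde API shown here is assumed, not copied from the diff.
// The idea from the PR description: PreciseTimestampConversion can be serialized
// by reusing the existing Cast protobuf, because TimestampType (microseconds)
// and LongType share the same 64-bit memory layout in Arrow.
object CometPreciseTimestampConversion
    extends CometExpressionSerde[PreciseTimestampConversion] {

  override def convert(
      expr: PreciseTimestampConversion,
      inputs: Seq[Attribute],
      binding: Boolean): Option[ExprOuterClass.Expr] = {
    // Delegate to the existing cast serde: the conversion is a bit-preserving
    // reinterpretation between timestamp-micros and long, so no new native
    // code path is required.
    exprToProtoInternal(Cast(expr.child, expr.dataType), inputs, binding)
  }
}
```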

How are these changes tested?

Added a Scala unit test for window_time expression support in CometExpressionSuite

./mvnw test -Dsuites="org.apache.comet.CometExpressionSuite" -Dscala.useZincServer=false

0lai0 changed the title from "feat: Support Spark expression window time" to "feat: Support Spark expression window_time" on Mar 19, 2026
comphead (Contributor) left a comment

Thanks @0lai0.
Triggering CI. Please put the SQL tests in separate .sql files, similar to array_append.sql.

"(cast('2023-01-01 12:15:00' as timestamp), 3)")

// basic window_time with aggregation
checkSparkAnswer(
A project Member left a comment

checkSparkAnswer isn't checking that the query is actually running in Comet. However, it would be better to add SQL-based tests instead, as @comphead pointed out.

0lai0 (Author) commented Mar 20, 2026

Thanks @comphead and @andygrove for the review and for pointing this out. I'll add SQL tests in separate .sql files and remove the previous test. By the way, I discovered that using setcast is problematic, so I have updated it accordingly.


Development

Successfully merging this pull request may close these issues.

[Feature] Support Spark expression: window_time

3 participants