feat: Support TimestampNs and TimestampTzNs` in bucket transform#1150
feat: Support TimestampNs and TimestampTzNs` in bucket transform#1150liurenjie1024 merged 5 commits intoapache:mainfrom
TimestampNs and TimestampTzNs` in bucket transform#1150Conversation
liurenjie1024
left a comment
There was a problem hiding this comment.
Thanks @jonathanc-n for this pr, LGTM!
| .downcast_ref::<arrow_array::TimestampMicrosecondArray>() | ||
| .unwrap() | ||
| .unary(|v| self.bucket_timestamp(v)), | ||
| DataType::Time64(TimeUnit::Nanosecond) => input |
There was a problem hiding this comment.
in pyiceberg, TimestampNanoType in converted to micros precision before bucketing. This is to ensure that TimestampType and TimestampNanoType are transformed to the same value
I think we should add a test to ensure this here too
There was a problem hiding this comment.
Good catch, It wasn't performing this conversion. The test is added
|
@Fokko FYI new transform function we can use for apache/iceberg-python#1833 |
54d001b
| .as_any() | ||
| .downcast_ref::<arrow_array::Time64NanosecondArray>() | ||
| .unwrap() | ||
| .unary(|v| self.bucket_time(v / 1000)), |
There was a problem hiding this comment.
there was an issue with rounding when performing division for the hour transform (#1146)
but since we're dividing by 1000, i dont think the same issue applies here
| .transform_literal(&Datum::timestamp_nanos(ns_value)) | ||
| .unwrap() | ||
| .unwrap(), | ||
| Datum::int(79) |
There was a problem hiding this comment.
hm, since we're using the same time (1510871468000000) here
should the bucket value for bucket(100) transform be the same for this test and the test above (test_timestamptz_literal)?
There was a problem hiding this comment.
Yes I applied the conversion to the transform_literal as well
ZENOTME
left a comment
There was a problem hiding this comment.
LGTM! Thanks @jonathanc-n
kevinjqliu
left a comment
There was a problem hiding this comment.
LGTM thanks for adding the additional checks
liurenjie1024
left a comment
There was a problem hiding this comment.
Thanks @jonathanc-n for fixing this, and @kevinjqliu for the review
Which issue does this PR close?
What changes are included in this PR?
Add bucket transforms for
TimestampNsand TimestampTzNs`Are these changes tested?
Unit tests