Feature Request / Improvement
When engines such as Daft read from the Table object (see scan_iceberg), it would be great if PyIceberg handled time travel transparently.
For example, to query an Iceberg table at a specific commit or timestamp, we can use PyIceberg to time travel to the particular snapshot-id or timestamp and then pass the result into the engine.
There are several options to achieve this:
- Construct a Table object with the metadata of a specific Snapshot. Maybe a function like Table.as_of(snapshot_id/timestamp) -> Table. This will make time travel transparent to the engine.
- Pass the Snapshot object to the engine. The function Table.snapshot_by_id -> Snapshot already exists, and the Snapshot it returns represents a specific Iceberg commit. The engine will need to be able to read from both Snapshot and Table.
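
To make the first option concrete, here is a rough sketch of how a Table.as_of could behave. It uses small stand-in Table and Snapshot classes, not PyIceberg's real ones, and as_of, its keyword arguments, and the field names are all assumptions for illustration. Resolving a timestamp against the full snapshot list is also a simplification; a real implementation would probably consult the table's snapshot log instead.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Snapshot:
    """Stand-in for an Iceberg snapshot (not pyiceberg's class)."""
    snapshot_id: int
    timestamp_ms: int


@dataclass(frozen=True)
class Table:
    """Stand-in table handle holding only what the sketch needs."""
    snapshots: Tuple[Snapshot, ...]
    current_snapshot_id: Optional[int] = None

    def snapshot_by_id(self, snapshot_id: int) -> Optional[Snapshot]:
        for snapshot in self.snapshots:
            if snapshot.snapshot_id == snapshot_id:
                return snapshot
        return None

    def current_snapshot(self) -> Optional[Snapshot]:
        if self.current_snapshot_id is None:
            return None
        return self.snapshot_by_id(self.current_snapshot_id)

    def as_of(
        self,
        *,
        snapshot_id: Optional[int] = None,
        timestamp_ms: Optional[int] = None,
    ) -> Table:
        """Hypothetical API: return a copy of the table pinned to one snapshot."""
        if (snapshot_id is None) == (timestamp_ms is None):
            raise ValueError("pass exactly one of snapshot_id or timestamp_ms")
        if snapshot_id is not None:
            target = self.snapshot_by_id(snapshot_id)
        else:
            # Latest snapshot committed at or before the requested time.
            candidates = [s for s in self.snapshots if s.timestamp_ms <= timestamp_ms]
            target = max(candidates, key=lambda s: s.timestamp_ms, default=None)
        if target is None:
            raise ValueError("no matching snapshot")
        # The original table is left untouched; only the copy is pinned.
        return replace(self, current_snapshot_id=target.snapshot_id)


table = Table(
    snapshots=(Snapshot(1, 100), Snapshot(2, 200), Snapshot(3, 300)),
    current_snapshot_id=3,
)
pinned = table.as_of(timestamp_ms=250)
```

With this shape, an engine that only knows how to read a Table would see pinned.current_snapshot() as the snapshot committed at timestamp 200, without any time-travel logic of its own.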
Happy to explore other options as well.