geotiff: add bbox= to open_geotiff for geographic-space windowed reads#2556
Conversation
* ``open_geotiff(..., bbox=(x_min, y_min, x_max, y_max))`` resolves the geographic bbox to a pixel ``window=`` via the existing ``_read_geo_info`` + ``_extent_to_window`` helpers. The resolution is a header-only metadata read; the result is forwarded through the existing backend dispatch, so eager / dask / GPU / VRT all share one surface without learning a new kwarg. * Mutually exclusive with ``window=`` (raises ValueError when both are passed). Requires a georeferenced, axis-aligned source; rotated affines must be cleared via ``allow_rotated=True`` first. * The ``PixelSafetyLimitError`` recovery hint (added in #2554) gains a fourth bullet recommending ``bbox=`` for geographic sub-region reads. * ``bbox`` slots into ``_CANONICAL_ORDER`` immediately after ``window`` so the two sub-region kwargs stay grouped on the public surface. Closes #2555.
brendancol
left a comment
There was a problem hiding this comment.
PR Review: geotiff: add bbox= to open_geotiff for geographic-space windowed reads
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
-
xrspatial/geotiff/__init__.py:399. NaN coordinates inbboxslip past thex_min >= x_maxandy_min >= y_maxguards becauseNaN >= NaNis False. They then land in_extent_to_window, where the affine math produces NaN row/col values and the integer cast fails with a confusingValueError. Add annp.isfinitecheck after unpacking and raise a clear bbox-malformed error. -
Missing coverage:
overview_leveltogether withbbox=. The helper takesoverview_leveland forwards it to_read_geo_info, which selects the overview IFD whose transform has differentpixel_width/pixel_heightthan the base. A test confirming thatopen_geotiff(path, bbox=..., overview_level=1)produces a pixel window appropriate for the overview's transform (not the base IFD's) would catch a future refactor that drops the plumbing.
Nits (optional improvements)
-
xrspatial/geotiff/__init__.py:418. The rotated-affine rejection branch (geo_info.transform.rotated_affine is not None) is untested. Constructing a rotated GeoTIFF takes a few lines viawrite(..., geo_transform=GeoTransform(rotated_affine=...)). Optional because the branch is straightforward and the message is clear. -
xrspatial/geotiff/tests/read/test_bbox_2555.py:39. The fixture usesnp.arange(...).reshape(100, 100)for pixel values, which is fine. Consider asserting the bbox slice's values match the equivalent slice of the full read (you already do this for shape and coords; pixel-value parity catches row/col swaps too).
What looks good
- Resolution happens once at the dispatcher (
open_geotiff), so all four backends (eager / dask / GPU / VRT) reuse the existingwindow=path without learning a new kwarg. _read_geo_infois the right primitive: O(1) memory, handles local / BytesIO / HTTP / fsspec uniformly.- Mutual exclusion with
window=is explicit and raises before any I/O. - The rotated-affine path routes users toward the existing
allow_rotated=Truerecovery rather than silently producing wrong pixel windows. _CANONICAL_ORDERis updated in the same PR with a comment explaining whybboxsits next towindow.- The safety hint's bullet list stays parallel and readable; the
chunks=line still appears last.
Checklist
- Algorithm matches reference (rasterio
windows.from_boundssemantics). Clamp-to-extent matches. - All implemented backends produce consistent results. Single dispatcher resolution.
- NaN handling. See Suggestion #1.
- Edge cases covered for arity / type / inverted axes / non-overlap; not for NaN or overview_level.
- No premature materialization or unnecessary copies.
- README feature matrix updated. N/A.
- Docstrings present and accurate.
* Reject non-finite bbox coordinates in _bbox_to_window with a clear
"must contain finite coordinates" error. NaN previously slipped past
the ordering check and surfaced as a confusing integer-cast failure
inside _extent_to_window.
* Reorder the rotated-affine vs no-georef guards. allow_rotated=True
drops the rotation AND sets has_georef=False, so checking has_georef
first hid the more specific rotated-affine message; now the rotated
path errors first with guidance to use window= for pixel-space
windowing.
* Tests:
- NaN bbox rejected on each axis (4 cases) and for +/-inf.
- Rotated-affine file (built via the existing test_crs writer)
rejected with the targeted message after allow_rotated=True.
- bbox + overview_level=N round-trip on a COG with 2x/4x reductions
confirms each successive level shrinks the bbox slice
monotonically, pinning the overview-aware resolution.
- Pixel-value parity vs the equivalent slice of the full read so a
row/col swap in the resolver would surface.
brendancol
left a comment
There was a problem hiding this comment.
Follow-up review pass (commit 9c2933b)
Disposition
- Suggestion #1 (NaN check): Fixed.
_bbox_to_windownow rejects non-finite bbox coordinates with a "must contain finite coordinates" error before the ordering check. - Suggestion #2 (overview_level coverage): Fixed. New
test_bbox_with_overview_level_uses_overview_transformbuilds a 256x256 COG with 2x/4x reductions and confirms each successive level shrinks the bbox slice monotonically. Pins the overview-aware resolution against a future refactor that drops the plumbing. - Nit #1 (rotated-affine coverage): Fixed. New
test_bbox_rejects_rotated_affineuses the existingtest_crs._write_rotated_tiffhelper to build a rotated source and asserts the targeted "rotated affine" error. - Nit #2 (pixel-value parity): Fixed.
test_bbox_matches_equivalent_pixel_windownow also assertsarr.values == full.values[r0:r1, c0:c1], so a row/col swap in the resolver would surface.
Incidental fix uncovered while writing the rotated test
allow_rotated=True drops the rotation AND sets has_georef=False on the resulting GeoInfo. The original guard order (has_georef first) meant rotated-dropped files raised the generic "no GeoTIFF tags" error instead of the rotated-specific message. Reordered so the rotated_affine is not None check runs first; the user now gets guidance to fall back to window= for pixel-space windowing.
Tests
xrspatial/geotiff/tests/read/test_bbox_2555.py: 15 passed.- Full
xrspatial/geotiff/tests/suite: 6009 passed, 81 skipped, 1 xfailed.
Summary
open_geotiff(..., bbox=(x_min, y_min, x_max, y_max))reads a sub-region by geographic coordinates in the file's CRS.window=via the existing_read_geo_info+_extent_to_windowhelpers. That makes the conversion one header-only metadata read, forwarded through the existing backend dispatch with no new plumbing.window=. Requires a georeferenced, axis-aligned source.PixelSafetyLimitErrorrecovery hint (added in geotiff: include GB allocation estimate in max_pixels error message #2554) gains a fourth bullet:Read a geographic sub-region with bbox=(x_min, y_min, x_max, y_max) in the file's CRS.Closes #2555.
Backend coverage
CPU dispatcher change. numpy / cupy / dask+numpy / dask+cupy / VRT all reuse the existing
window=surface sincebboxis resolved at theopen_geotiffentry. No backend-specific code paths added.Test plan
test_bbox_2555.pycovers: bbox vs window equivalence, full-bbox round-trip, extent clamping, mutual exclusion withwindow=, malformedbbox(wrong arity, non-sequence, inverted x/y), out-of-extent rejection, and missing-georef rejection.bbox=line appears alongside the existingwindow=andchunks=lines.bboxslots into_CANONICAL_ORDERimmediately afterwindow; the canonical-order signature test passes.xrspatial/geotiff/tests/suite: 6005 passed, 81 skipped, 1 xfailed.