Skip to content

geotiff: add bbox= to open_geotiff for geographic-space windowed reads#2556

Merged
brendancol merged 3 commits into
mainfrom
issue-2555
May 28, 2026
Merged

geotiff: add bbox= to open_geotiff for geographic-space windowed reads#2556
brendancol merged 3 commits into
mainfrom
issue-2555

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

  • open_geotiff(..., bbox=(x_min, y_min, x_max, y_max)) reads a sub-region by geographic coordinates in the file's CRS.
  • Resolves to a pixel window= via the existing _read_geo_info + _extent_to_window helpers. That makes the conversion one header-only metadata read, forwarded through the existing backend dispatch with no new plumbing.
  • Mutually exclusive with window=. Requires a georeferenced, axis-aligned source.
  • The PixelSafetyLimitError recovery hint (added in geotiff: include GB allocation estimate in max_pixels error message #2554) gains a fourth bullet: Read a geographic sub-region with bbox=(x_min, y_min, x_max, y_max) in the file's CRS.

Closes #2555.

Backend coverage

CPU dispatcher change. numpy / cupy / dask+numpy / dask+cupy / VRT all reuse the existing window= surface since bbox is resolved at the open_geotiff entry. No backend-specific code paths added.

Test plan

  • New test_bbox_2555.py covers: bbox vs window equivalence, full-bbox round-trip, extent clamping, mutual exclusion with window=, malformed bbox (wrong arity, non-sequence, inverted x/y), out-of-extent rejection, and missing-georef rejection.
  • The safety-error recovery hint test asserts the new bbox= line appears alongside the existing window= and chunks= lines.
  • bbox slots into _CANONICAL_ORDER immediately after window; the canonical-order signature test passes.
  • Full xrspatial/geotiff/tests/ suite: 6005 passed, 81 skipped, 1 xfailed.

* ``open_geotiff(..., bbox=(x_min, y_min, x_max, y_max))`` resolves the
  geographic bbox to a pixel ``window=`` via the existing
  ``_read_geo_info`` + ``_extent_to_window`` helpers. The resolution is
  a header-only metadata read; the result is forwarded through the
  existing backend dispatch, so eager / dask / GPU / VRT all share one
  surface without learning a new kwarg.
* Mutually exclusive with ``window=`` (raises ValueError when both
  are passed). Requires a georeferenced, axis-aligned source; rotated
  affines must be cleared via ``allow_rotated=True`` first.
* The ``PixelSafetyLimitError`` recovery hint (added in #2554) gains a
  fourth bullet recommending ``bbox=`` for geographic sub-region reads.
* ``bbox`` slots into ``_CANONICAL_ORDER`` immediately after ``window``
  so the two sub-region kwargs stay grouped on the public surface.

Closes #2555.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 28, 2026
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: geotiff: add bbox= to open_geotiff for geographic-space windowed reads

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

  • xrspatial/geotiff/__init__.py:399. NaN coordinates in bbox slip past the x_min >= x_max and y_min >= y_max guards because NaN >= NaN is False. They then land in _extent_to_window, where the affine math produces NaN row/col values and the integer cast fails with a confusing ValueError. Add an np.isfinite check after unpacking and raise a clear bbox-malformed error.

  • Missing coverage: overview_level together with bbox=. The helper takes overview_level and forwards it to _read_geo_info, which selects the overview IFD whose transform has different pixel_width / pixel_height than the base. A test confirming that open_geotiff(path, bbox=..., overview_level=1) produces a pixel window appropriate for the overview's transform (not the base IFD's) would catch a future refactor that drops the plumbing.

Nits (optional improvements)

  • xrspatial/geotiff/__init__.py:418. The rotated-affine rejection branch (geo_info.transform.rotated_affine is not None) is untested. Constructing a rotated GeoTIFF takes a few lines via write(..., geo_transform=GeoTransform(rotated_affine=...)). Optional because the branch is straightforward and the message is clear.

  • xrspatial/geotiff/tests/read/test_bbox_2555.py:39. The fixture uses np.arange(...).reshape(100, 100) for pixel values, which is fine. Consider asserting the bbox slice's values match the equivalent slice of the full read (you already do this for shape and coords; pixel-value parity catches row/col swaps too).

What looks good

  • Resolution happens once at the dispatcher (open_geotiff), so all four backends (eager / dask / GPU / VRT) reuse the existing window= path without learning a new kwarg.
  • _read_geo_info is the right primitive: O(1) memory, handles local / BytesIO / HTTP / fsspec uniformly.
  • Mutual exclusion with window= is explicit and raises before any I/O.
  • The rotated-affine path routes users toward the existing allow_rotated=True recovery rather than silently producing wrong pixel windows.
  • _CANONICAL_ORDER is updated in the same PR with a comment explaining why bbox sits next to window.
  • The safety hint's bullet list stays parallel and readable; the chunks= line still appears last.

Checklist

  • Algorithm matches reference (rasterio windows.from_bounds semantics). Clamp-to-extent matches.
  • All implemented backends produce consistent results. Single dispatcher resolution.
  • NaN handling. See Suggestion #1.
  • Edge cases covered for arity / type / inverted axes / non-overlap; not for NaN or overview_level.
  • No premature materialization or unnecessary copies.
  • README feature matrix updated. N/A.
  • Docstrings present and accurate.

* Reject non-finite bbox coordinates in _bbox_to_window with a clear
  "must contain finite coordinates" error. NaN previously slipped past
  the ordering check and surfaced as a confusing integer-cast failure
  inside _extent_to_window.
* Reorder the rotated-affine vs no-georef guards. allow_rotated=True
  drops the rotation AND sets has_georef=False, so checking has_georef
  first hid the more specific rotated-affine message; now the rotated
  path errors first with guidance to use window= for pixel-space
  windowing.
* Tests:
  - NaN bbox rejected on each axis (4 cases) and for +/-inf.
  - Rotated-affine file (built via the existing test_crs writer)
    rejected with the targeted message after allow_rotated=True.
  - bbox + overview_level=N round-trip on a COG with 2x/4x reductions
    confirms each successive level shrinks the bbox slice
    monotonically, pinning the overview-aware resolution.
  - Pixel-value parity vs the equivalent slice of the full read so a
    row/col swap in the resolver would surface.
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review pass (commit 9c2933b)

Disposition

  • Suggestion #1 (NaN check): Fixed. _bbox_to_window now rejects non-finite bbox coordinates with a "must contain finite coordinates" error before the ordering check.
  • Suggestion #2 (overview_level coverage): Fixed. New test_bbox_with_overview_level_uses_overview_transform builds a 256x256 COG with 2x/4x reductions and confirms each successive level shrinks the bbox slice monotonically. Pins the overview-aware resolution against a future refactor that drops the plumbing.
  • Nit #1 (rotated-affine coverage): Fixed. New test_bbox_rejects_rotated_affine uses the existing test_crs._write_rotated_tiff helper to build a rotated source and asserts the targeted "rotated affine" error.
  • Nit #2 (pixel-value parity): Fixed. test_bbox_matches_equivalent_pixel_window now also asserts arr.values == full.values[r0:r1, c0:c1], so a row/col swap in the resolver would surface.

Incidental fix uncovered while writing the rotated test

allow_rotated=True drops the rotation AND sets has_georef=False on the resulting GeoInfo. The original guard order (has_georef first) meant rotated-dropped files raised the generic "no GeoTIFF tags" error instead of the rotated-specific message. Reordered so the rotated_affine is not None check runs first; the user now gets guidance to fall back to window= for pixel-space windowing.

Tests

  • xrspatial/geotiff/tests/read/test_bbox_2555.py: 15 passed.
  • Full xrspatial/geotiff/tests/ suite: 6009 passed, 81 skipped, 1 xfailed.

@brendancol brendancol merged commit da96ab1 into main May 28, 2026
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

geotiff: add bbox= parameter to open_geotiff for geographic-space window reads

1 participant