Reason or Problem
open_geotiff(window=...) only accepts a pixel window (row_start, col_start, row_stop, col_stop). Most geospatial users think in data-space coordinates: a bounding box in the file's CRS (x_min, y_min, x_max, y_max). Doing the conversion by hand means reading the file's transform, running the affine math, and clamping to file bounds. We already do exactly that internally via _extent_to_window in xrspatial/geotiff/_attrs.py:912, but it is not exposed on the public API.
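For illustration, the manual conversion described above looks roughly like the following. This is a minimal sketch, not the body of _extent_to_window: the helper name bbox_to_window and its parameters are hypothetical, and it assumes a north-up, axis-aligned transform given as a top-left origin plus positive pixel sizes.

```python
import math


def bbox_to_window(origin_x, origin_y, pixel_width, pixel_height,
                   height, width, bbox):
    """Convert (x_min, y_min, x_max, y_max) to a pixel window.

    Hypothetical helper. Assumes (origin_x, origin_y) is the top-left
    corner of the raster, pixel sizes are positive, and y decreases as
    the row index increases (north-up, no rotation).
    Returns (row_start, col_start, row_stop, col_stop).
    """
    x_min, y_min, x_max, y_max = bbox

    # Affine math: columns grow with x, rows grow as y shrinks.
    col_start = math.floor((x_min - origin_x) / pixel_width)
    col_stop = math.ceil((x_max - origin_x) / pixel_width)
    row_start = math.floor((origin_y - y_max) / pixel_height)
    row_stop = math.ceil((origin_y - y_min) / pixel_height)

    # Clamp to the file bounds.
    row_start = max(0, min(height, row_start))
    row_stop = max(0, min(height, row_stop))
    col_start = max(0, min(width, col_start))
    col_stop = max(0, min(width, col_stop))

    if row_stop <= row_start or col_stop <= col_start:
        raise ValueError("bbox does not intersect the raster extent")
    return (row_start, col_start, row_stop, col_stop)
```

Every caller who wants a geographic sub-region currently has to carry some version of this, including the clamping and the empty-intersection check.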
A related gap: the PixelSafetyLimitError recovery hint that #2553 added mentions window= and chunks= but not a geographic bbox option. Once bbox= is on open_geotiff, the hint should mention it too so users with a known area of interest reach for the right kwarg.
Proposal
Add bbox=(x_min, y_min, x_max, y_max) to open_geotiff:
- Geographic coordinates in the file's CRS.
- Mutually exclusive with window=. Passing both raises ValueError.
- Resolved internally via _read_geo_info(source, overview_level=...) + _extent_to_window(geo_info.transform, h, w, y_min, y_max, x_min, x_max). The result is forwarded to the existing backend dispatch as window=.
- Requires the source to be georeferenced. A file without georef, or with a rotated affine that has not been cleared via allow_rotated, raises a clear ValueError naming the limitation.
Then update the PixelSafetyLimitError recovery hint in xrspatial/geotiff/_layout.py:_recovery_hint to add a fourth bullet:
* Read a geographic sub-region with bbox=(x_min, y_min, x_max, y_max).
Design:
open_geotiff runs the bbox->window conversion before fanning out to backends. The dispatcher is the right place since it already coerces the source path once and dispatches once. Backends (read_geotiff_dask, read_geotiff_gpu, read_vrt, read_to_array) keep their existing window= surface; no new plumbing.
The conversion uses _read_geo_info, which already supports local files, BytesIO, HTTP, and fsspec URIs via a header-only read. For HTTP and cloud sources this is a small range request, not a full download, and the bytes are typically cached for the subsequent main read.
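The dispatch order described in this design can be sketched as follows. This is an assumption-laden outline, not the actual open_geotiff code: resolve_window and open_sketch are hypothetical names, and to_window stands in for the _read_geo_info + _extent_to_window step, whose real signatures are not reproduced here.

```python
def resolve_window(window=None, bbox=None, *, to_window=None):
    """Return the pixel window to forward to a backend, or None.

    Hypothetical helper. `to_window` is a stand-in callable that maps
    (x_min, y_min, x_max, y_max) to a pixel window.
    """
    if window is not None and bbox is not None:
        raise ValueError("bbox= and window= are mutually exclusive; pass only one")
    if bbox is None:
        # Existing window= callers are passed through untouched.
        return window
    x_min, y_min, x_max, y_max = bbox
    return to_window(x_min, y_min, x_max, y_max)


def open_sketch(source, *, window=None, bbox=None, backend, to_window):
    """Resolve bbox once in the dispatcher, then hand window= to the backend."""
    resolved = resolve_window(window, bbox, to_window=to_window)
    return backend(source, window=resolved)
```

Because the conversion finishes before dispatch, the backend only ever sees window=, which is why no backend signature needs to change.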
Usage:
arr = open_geotiff(
    "s3://bucket/large-dem.tif",
    bbox=(-122.5, 37.6, -122.3, 37.8),  # San Francisco area, WGS84
)
Value: Removes the most common workaround in user code (manual affine math). Mirrors how rasterio's windows.from_bounds is used.
Stakeholders and Impacts
Anyone reading sub-regions of large GeoTIFFs by geographic coordinates. The new parameter is opt-in; existing window= callers see no behaviour change. The error hint added in #2554 expands by one line.
Drawbacks
The bbox->window resolution requires a metadata read before the main read. _read_geo_info is O(1) memory and fast for local files; for HTTP it adds one range request that is usually cached for the subsequent read.
Rotated transforms are rejected unless allow_rotated=True has cleared the rotation, since _extent_to_window assumes an axis-aligned grid. This is the same restriction the existing read path applies.
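A check of this kind might look like the sketch below. The function name require_axis_aligned is hypothetical and this is not the library's existing guard; it assumes the common affine convention x = a*col + b*row + c, y = d*col + e*row + f, where b and d carry any rotation or shear.

```python
def require_axis_aligned(a, b, d, e, tol=0.0):
    """Raise ValueError unless the affine is axis-aligned.

    Hypothetical helper. Coefficients follow
        x = a * col + b * row + c
        y = d * col + e * row + f
    so a non-zero b or d means the grid is rotated or sheared.
    """
    if abs(b) > tol or abs(d) > tol:
        raise ValueError(
            "bbox= requires an axis-aligned transform; "
            f"got rotation/shear terms b={b!r}, d={d!r}"
        )
    if a == 0 or e == 0:
        raise ValueError("transform has a zero pixel size")
```

With a rotated grid, a rectangle in CRS space does not map to a rectangle of rows and columns, so a simple floor/ceil window would silently cover the wrong pixels. Raising early avoids that.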
Alternatives
Exposing _extent_to_window as public API would work but pushes the pixel math onto every caller. The whole point of adding bbox= is to remove that step.