From dd1b7f3f991ee2742a1c54b9dbf7451aaddc0915 Mon Sep 17 00:00:00 2001 From: Brendan Collins Date: Mon, 25 May 2026 20:24:47 -0700 Subject: [PATCH 1/3] geotiff tests: consolidate release-gate registry Fold the 13 ``test_release_gate_*.py`` files (and the unmarked tests that lived alongside them) into a single registry at ``xrspatial/geotiff/tests/release_gates/test_stable_features.py``. The release engineer now has one audit point: ``pytest -m release_gate`` selects exactly this file and no others. Every marked test keeps ``@pytest.mark.release_gate``; tests that previously lived in a ``test_release_gate_*.py`` file but lacked the marker (overview sidecar metadata, rotated negative cases, the #2321 meta-gates) pick it up here. Helper-function name collisions across the source files (``_write_known_good`` in four files, ``_make_data_array`` in two) are resolved with section prefixes so the consolidation does not introduce cross-section coupling. ``docs/source/reference/release_gate_geotiff.rst`` and the smaller ``docs/source/reference/geotiff.rst`` references now point at the consolidated module. The per-feature grouping in the doc is preserved in prose even though there is now a single underlying file. PR 10 of 11 in epic #2390. Closes #2403. 
--- docs/source/reference/geotiff.rst | 6 +- .../source/reference/release_gate_geotiff.rst | 79 +- xrspatial/geotiff/tests/CLUSTER_AUDIT_PR10.md | 98 + .../geotiff/tests/release_gates/__init__.py | 8 + .../release_gates/test_stable_features.py | 2593 +++++++++++++++++ .../test_release_contract_parity_2389.py | 13 +- .../geotiff/tests/test_release_gate_2321.py | 221 -- .../tests/test_release_gate_attrs_contract.py | 154 - ...test_release_gate_codec_round_trip_2341.py | 375 --- .../geotiff/tests/test_release_gate_codecs.py | 132 - .../geotiff/tests/test_release_gate_cog.py | 160 - .../tests/test_release_gate_dask_parity.py | 150 - ...est_release_gate_eager_dask_parity_2341.py | 317 -- .../tests/test_release_gate_local_read.py | 177 -- .../tests/test_release_gate_local_write.py | 156 - .../tests/test_release_gate_negative_2341.py | 500 ---- ...ase_gate_overview_sidecar_metadata_2341.py | 468 --- .../tests/test_release_gate_windowed_read.py | 162 - .../test_release_gate_windowed_reads_2341.py | 473 --- 19 files changed, 2756 insertions(+), 3486 deletions(-) create mode 100644 xrspatial/geotiff/tests/CLUSTER_AUDIT_PR10.md create mode 100644 xrspatial/geotiff/tests/release_gates/__init__.py create mode 100644 xrspatial/geotiff/tests/release_gates/test_stable_features.py delete mode 100644 xrspatial/geotiff/tests/test_release_gate_2321.py delete mode 100644 xrspatial/geotiff/tests/test_release_gate_attrs_contract.py delete mode 100644 xrspatial/geotiff/tests/test_release_gate_codec_round_trip_2341.py delete mode 100644 xrspatial/geotiff/tests/test_release_gate_codecs.py delete mode 100644 xrspatial/geotiff/tests/test_release_gate_cog.py delete mode 100644 xrspatial/geotiff/tests/test_release_gate_dask_parity.py delete mode 100644 xrspatial/geotiff/tests/test_release_gate_eager_dask_parity_2341.py delete mode 100644 xrspatial/geotiff/tests/test_release_gate_local_read.py delete mode 100644 xrspatial/geotiff/tests/test_release_gate_local_write.py delete mode 100644 
xrspatial/geotiff/tests/test_release_gate_negative_2341.py delete mode 100644 xrspatial/geotiff/tests/test_release_gate_overview_sidecar_metadata_2341.py delete mode 100644 xrspatial/geotiff/tests/test_release_gate_windowed_read.py delete mode 100644 xrspatial/geotiff/tests/test_release_gate_windowed_reads_2341.py diff --git a/docs/source/reference/geotiff.rst b/docs/source/reference/geotiff.rst index d6511587..eff1c0c8 100644 --- a/docs/source/reference/geotiff.rst +++ b/docs/source/reference/geotiff.rst @@ -134,7 +134,8 @@ silently emit a mislabeled raster: * Rotated read without ``allow_rotated=True`` -- raises across eager, dask, and windowed paths - (``xrspatial/geotiff/tests/test_release_gate_negative_2341.py``). + (``xrspatial/geotiff/tests/release_gates/test_stable_features.py``, + ``Negative cases`` section). * Rotated write without ``drop_rotation=True`` -- raises ``ValueError`` (``xrspatial/geotiff/tests/test_to_geotiff_drop_rotation_2216.py``). * Rotated or skewed source inside a VRT -- raises at parse @@ -537,7 +538,8 @@ regression test that locks the behaviour. 
``xrspatial/geotiff/tests/test_unsupported_features_2349.py`` (``test_mixed_per_source_nodata_rejected``) * - Rotated read without ``allow_rotated=True`` - - ``xrspatial/geotiff/tests/test_release_gate_negative_2341.py``, + - ``xrspatial/geotiff/tests/release_gates/test_stable_features.py`` + (``Negative cases`` section), ``xrspatial/geotiff/tests/test_rotated_typed_error_2267.py`` * - Rotated write without ``drop_rotation=True`` - ``xrspatial/geotiff/tests/test_to_geotiff_drop_rotation_2216.py``, diff --git a/docs/source/reference/release_gate_geotiff.rst b/docs/source/reference/release_gate_geotiff.rst index b616af39..4f71c863 100644 --- a/docs/source/reference/release_gate_geotiff.rst +++ b/docs/source/reference/release_gate_geotiff.rst @@ -44,24 +44,29 @@ shortest invocation is: pytest xrspatial/geotiff/tests/ -To run only the release-gate-tagged subset that backs this checklist (the -files named ``test_release_gate_*.py``), use: +To run only the release-gate-tagged subset that backs this checklist, use +the ``release_gate`` marker. Every gate test lives under a single registry +file (``xrspatial/geotiff/tests/release_gates/test_stable_features.py``) +so the selector picks the registry up exactly: .. code-block:: bash - pytest xrspatial/geotiff/tests/ -k release_gate + pytest xrspatial/geotiff/tests/release_gates/ -m release_gate + +The ``-m release_gate`` selector also works from the wider tests root and +returns the same set of tests. GPU rows live behind the standard CUDA fixtures. They auto-skip when ``cupy`` or a CUDA device is unavailable, so the same command runs on a CPU-only host. The GPU rows are tagged ``experimental``; their skip is not a release blocker (see the decision rule below). -The cross-cutting meta-gates (``test_release_gate_2321.py``, -``test_release_gate_negative_2341.py``, -``test_supported_features_tiers_2137.py``) are part of the same suite. 
They -fail if a row in this checklist names a feature key that is missing from -:data:`xrspatial.geotiff.SUPPORTED_FEATURES` or a test file that does not -exist. +The cross-cutting meta-gates (the consolidated +``xrspatial/geotiff/tests/release_gates/test_stable_features.py`` plus +``test_supported_features_tiers_2137.py``) are part of the same suite. +They fail if a row in this checklist names a feature key that is missing +from :data:`xrspatial.geotiff.SUPPORTED_FEATURES` or a test file that +does not exist. Handling skipped rows --------------------- @@ -77,12 +82,13 @@ A skipped row is not the same as a passing row. Before signing off: Treat ``ImportError``, ``ModuleNotFoundError``, or environment-error skips inside the gate suite as failures unless the row is already tagged ``experimental``. -* The ``xfail`` rows in ``test_release_gate_negative_2341.py`` are - intentional pins for follow-up work (see that file's docstring). A - newly-passing ``xfail`` is also a signal: the linked follow-up has - landed, the row should be re-tiered in this PR, and the - ``xfail`` marker on the test should be removed in the same commit - so the gate cannot silently regress. +* The ``xfail`` rows inside the ``Negative cases`` section of + ``release_gates/test_stable_features.py`` are intentional pins for + follow-up work (see the in-file section docstring). A newly-passing + ``xfail`` is also a signal: the linked follow-up has landed, the row + should be re-tiered in this PR, and the ``xfail`` marker on the test + should be removed in the same commit so the gate cannot silently + regress. Promote / demote decision rule ------------------------------ @@ -175,7 +181,8 @@ Local GeoTIFF read and write drift), and the canonical non-transform release attrs unchanged. Covered for both ``open_geotiff(window=...)`` and ``read_geotiff_dask(window=...)``. 
- - ``xrspatial/geotiff/tests/test_release_gate_windowed_reads_2341.py`` + - ``xrspatial/geotiff/tests/release_gates/test_stable_features.py`` + (windowed-reads section) - `#2341`_ * - ``reader.dask`` - stable @@ -194,7 +201,8 @@ Local GeoTIFF read and write four scenarios: integer-nodata, float-NaN-nodata, MinIsWhite, and the ``mask_nodata=False`` raw-sentinel branch of the nodata lifecycle. - - ``xrspatial/geotiff/tests/test_release_gate_eager_dask_parity_2341.py`` + - ``xrspatial/geotiff/tests/release_gates/test_stable_features.py`` + (eager / dask full parity section) - `#2341`_ * - ``writer.local_file`` - stable @@ -239,7 +247,8 @@ Local GeoTIFF read and write write / read / write / read cycle preserves byte-exact pixels (NaN-aware for float) and the canonical release attrs. See the cited test for the codec, dtype, and attr-key matrix. - - ``xrspatial/geotiff/tests/test_release_gate_codec_round_trip_2341.py`` + - ``xrspatial/geotiff/tests/release_gates/test_stable_features.py`` + (codec round-trip section) - `#2341`_ * - Codec ``lerc`` / ``jpeg2000`` / ``j2k`` / ``lz4`` - experimental @@ -346,7 +355,7 @@ HTTP / fsspec reads ``XRSPATIAL_GEOTIFF_ALLOW_PRIVATE_HOSTS=1`` is set. - ``xrspatial/geotiff/tests/test_ssrf_hardening_1664.py``, ``xrspatial/geotiff/tests/test_dns_rebinding_pin_issue_1846.py``, - ``xrspatial/geotiff/tests/test_release_gate_2321.py`` + ``xrspatial/geotiff/tests/release_gates/test_stable_features.py`` (HTTP SSRF presence gate) - `#2344`_ * - ``reader.http_cog`` -- per-tile byte-count cap @@ -563,7 +572,8 @@ VRT supported subset - stable - At least one regression test exists for every promised VRT behaviour (this row is a meta-gate on the rows above). 
- - ``xrspatial/geotiff/tests/test_release_gate_2321.py`` + - ``xrspatial/geotiff/tests/release_gates/test_stable_features.py`` + (VRT presence meta-gate) - `#2321`_ * - ``write_vrt`` - advanced @@ -598,7 +608,8 @@ Sidecar and overview interactions ``masked_nodata``; ``transform`` scales pixel size by the level factor with the origin preserved. Covered through the eager and dask read paths. - - ``xrspatial/geotiff/tests/test_release_gate_overview_sidecar_metadata_2341.py`` + - ``xrspatial/geotiff/tests/release_gates/test_stable_features.py`` + (overview / sidecar metadata section) - `#2341`_ * - Remote sidecar byte order - stable @@ -702,18 +713,20 @@ These gates are not tier rows but they back the rest of the checklist. ``test_backend_pixel_parity_matrix_1813.py`` -- cross-backend pixel and metadata parity across the 4 read backends (numpy, cupy, dask+numpy, dask+cupy) on the golden corpus. Owning epic: `#2341`_. -* ``test_release_gate_2321.py`` -- meta-gate that asserts every promised - VRT behaviour in this checklist resolves to a real test file and a real - ``SUPPORTED_FEATURES`` entry. Owning epic: `#2321`_. -* ``xrspatial/geotiff/tests/test_release_gate_negative_2341.py`` -- - negative cross-cutting gate. Pins that ambiguous metadata fails closed - at every promised read entry point: conflicting CRS between header and - ``.aux.xml`` PAM sidecar (xfail until PAM sidecar support lands), - integer nodata sentinel that cannot be honoured on a float-promoted - raster (xfail against ``#1774`` follow-up), rotated transform without - ``allow_rotated=True`` uniformly across eager / dask / windowed paths, - and mixed-tier VRT children when stable-only is requested (xfail - against epic `#2342`_). Owning epic: `#2341`_. 
+* ``xrspatial/geotiff/tests/release_gates/test_stable_features.py`` + (``Cross-cutting meta-gates`` section) -- meta-gate that asserts every + promised VRT behaviour in this checklist resolves to a real test file + and a real ``SUPPORTED_FEATURES`` entry. Owning epic: `#2321`_. +* ``xrspatial/geotiff/tests/release_gates/test_stable_features.py`` + (``Negative cases`` section) -- negative cross-cutting gate. Pins that + ambiguous metadata fails closed at every promised read entry point: + conflicting CRS between header and ``.aux.xml`` PAM sidecar (xfail + until PAM sidecar support lands), integer nodata sentinel that cannot + be honoured on a float-promoted raster (xfail against ``#1774`` + follow-up), rotated transform without ``allow_rotated=True`` uniformly + across eager / dask / windowed paths, and mixed-tier VRT children when + stable-only is requested (xfail against epic `#2342`_). Owning epic: + `#2341`_. Owning epics ============ diff --git a/xrspatial/geotiff/tests/CLUSTER_AUDIT_PR10.md b/xrspatial/geotiff/tests/CLUSTER_AUDIT_PR10.md new file mode 100644 index 00000000..d5f643b8 --- /dev/null +++ b/xrspatial/geotiff/tests/CLUSTER_AUDIT_PR10.md @@ -0,0 +1,98 @@ +# CLUSTER_AUDIT_PR10 — Release-gate registry + +This audit table maps every test currently living in a +`test_release_gate_*.py` file under `xrspatial/geotiff/tests/` to its new +home inside the single consolidated registry, +`release_gates/test_stable_features.py`. Deleted before merge per the +epic protocol (see `xarray-contrib/xarray-spatial#2390`). + +## Inputs + +13 source files, 159 tests collected, 134 of those previously carried +`@pytest.mark.release_gate`. The remaining 25 lived in +`test_release_gate_*.py` files but did not carry the marker; the epic +specifies all such tests fold in and pick up the marker. 
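The 25 unmarked tests counted above could be found mechanically. The following is a minimal sketch, assuming a standard-library `ast` scan rather than whatever method the audit actually used. `unmarked_tests` and `SAMPLE` are hypothetical names, not part of the repository, and the scan only looks at top-level functions, so it ignores a module-level `pytestmark` and tests defined inside classes.

```python
import ast


def unmarked_tests(source: str, marker: str = "release_gate") -> list[str]:
    """Return top-level ``test_*`` functions lacking ``@pytest.mark.<marker>``."""
    missing = []
    for node in ast.parse(source).body:
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if not node.name.startswith("test_"):
            continue
        has_marker = False
        for deco in node.decorator_list:
            # Unwrap parametrised forms such as ``@pytest.mark.xfail(...)``.
            target = deco.func if isinstance(deco, ast.Call) else deco
            if isinstance(target, ast.Attribute) and target.attr == marker:
                has_marker = True
        if not has_marker:
            missing.append(node.name)
    return missing


# Hypothetical module text standing in for one of the folded source files.
SAMPLE = '''
import pytest

@pytest.mark.release_gate
def test_marked():
    pass

def test_unmarked():
    pass

def _helper():
    pass
'''

print(unmarked_tests(SAMPLE))  # -> ['test_unmarked']
```

Running a scan like this over each old file before the fold, and over the registry after it, would give a quick cross-check of the 134 / 25 split.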
+ +## Mapping + +| Old file:test | New `release_gates/test_stable_features.py::test_id` | Notes | +|---|---|---| +| `test_release_gate_local_read.py::test_release_gate_local_read_pixels` | `test_release_gate_local_read_pixels` | unchanged | +| `test_release_gate_local_read.py::test_release_gate_local_read_crs` | `test_release_gate_local_read_crs` | unchanged | +| `test_release_gate_local_read.py::test_release_gate_local_read_transform` | `test_release_gate_local_read_transform` | unchanged | +| `test_release_gate_local_read.py::test_release_gate_local_read_nodata` | `test_release_gate_local_read_nodata` | unchanged | +| `test_release_gate_local_write.py::test_release_gate_local_write_round_trips_pixels` | `test_release_gate_local_write_round_trips_pixels` | unchanged | +| `test_release_gate_local_write.py::test_release_gate_local_write_preserves_crs` | `test_release_gate_local_write_preserves_crs` | unchanged | +| `test_release_gate_local_write.py::test_release_gate_local_write_preserves_transform` | `test_release_gate_local_write_preserves_transform` | unchanged | +| `test_release_gate_local_write.py::test_release_gate_local_write_preserves_nodata` | `test_release_gate_local_write_preserves_nodata` | unchanged | +| `test_release_gate_codecs.py::test_release_gate_codec_round_trip_uint16[codec]` (5) | `test_release_gate_codec_round_trip_uint16[codec]` | parametrized over `STABLE_LOSSLESS_CODECS` | +| `test_release_gate_codecs.py::test_release_gate_codec_round_trip_float32[codec]` (5) | `test_release_gate_codec_round_trip_float32[codec]` | parametrized over `STABLE_LOSSLESS_CODECS` | +| `test_release_gate_codecs.py::test_release_gate_codec_stable_set_matches_supported_features` | `test_release_gate_codec_stable_set_matches_supported_features` | unchanged | +| `test_release_gate_cog.py::test_release_gate_cog_round_trips_pixels[codec]` (5) | `test_release_gate_cog_round_trips_pixels[codec]` | shared `STABLE_LOSSLESS_CODECS` constant, no cross-file import | +| 
`test_release_gate_cog.py::test_release_gate_cog_preserves_crs_transform[codec]` (5) | `test_release_gate_cog_preserves_crs_transform[codec]` | unchanged | +| `test_release_gate_cog.py::test_release_gate_cog_preserves_nodata[codec]` (5) | `test_release_gate_cog_preserves_nodata[codec]` | unchanged | +| `test_release_gate_windowed_read.py::test_release_gate_windowed_read_returns_subset` | `test_release_gate_windowed_read_returns_subset` | unchanged | +| `test_release_gate_windowed_read.py::test_release_gate_windowed_read_preserves_crs` | `test_release_gate_windowed_read_preserves_crs` | unchanged | +| `test_release_gate_windowed_read.py::test_release_gate_windowed_read_shifts_transform_origin` | `test_release_gate_windowed_read_shifts_transform_origin` | unchanged | +| `test_release_gate_windowed_read.py::test_release_gate_windowed_read_full_extent_matches_unwindowed` | `test_release_gate_windowed_read_full_extent_matches_unwindowed` | unchanged | +| `test_release_gate_dask_parity.py::test_release_gate_dask_read_matches_eager_pixels` | `test_release_gate_dask_read_matches_eager_pixels` | unchanged | +| `test_release_gate_dask_parity.py::test_release_gate_dask_read_matches_eager_attrs` | `test_release_gate_dask_read_matches_eager_attrs` | unchanged | +| `test_release_gate_dask_parity.py::test_release_gate_dask_read_is_lazy` | `test_release_gate_dask_read_is_lazy` | unchanged | +| `test_release_gate_eager_dask_parity_2341.py::test_release_gate_eager_dask_full_parity[fixture]` (4) | `test_release_gate_eager_dask_full_parity[fixture]` | corpus list preserved | +| `test_release_gate_eager_dask_parity_2341.py::test_release_gate_corpus_is_non_empty` | `test_release_gate_corpus_is_non_empty` | now carries `@pytest.mark.release_gate` (previously unmarked despite living in a `test_release_gate_*.py` file) | +| `test_release_gate_attrs_contract.py::test_release_gate_attrs_canonical_keys_present` | `test_release_gate_attrs_canonical_keys_present` | unchanged | +| 
`test_release_gate_attrs_contract.py::test_release_gate_attrs_georef_status_full` | `test_release_gate_attrs_georef_status_full` | unchanged | +| `test_release_gate_attrs_contract.py::test_release_gate_attrs_contract_version_is_int` | `test_release_gate_attrs_contract_version_is_int` | unchanged | +| `test_release_gate_attrs_contract.py::test_release_gate_attrs_round_trip_preserves_crs_transform_nodata` | `test_release_gate_attrs_round_trip_preserves_crs_transform_nodata` | unchanged | +| `test_release_gate_codec_round_trip_2341.py::test_release_gate_codec_round_trip[codec-dtype]` (20) | `test_release_gate_codec_round_trip[codec-dtype]` | unchanged | +| `test_release_gate_codec_round_trip_2341.py::test_release_gate_codec_round_trip_stable_set_matches_supported_features` | `test_release_gate_codec_round_trip_stable_set_matches_supported_features` | unchanged | +| `test_release_gate_overview_sidecar_metadata_2341.py::test_cog_internal_overview_metadata_survives[reader]` (2) | `test_release_gate_cog_internal_overview_metadata_survives[reader]` | renamed for the `release_gate_` test-name prefix; now carries `@pytest.mark.release_gate` | +| `test_release_gate_overview_sidecar_metadata_2341.py::test_cog_internal_overview_transform_scales[reader]` (2) | `test_release_gate_cog_internal_overview_transform_scales[reader]` | renamed; marker added | +| `test_release_gate_overview_sidecar_metadata_2341.py::test_cog_internal_overview_shape_matches_factors` | `test_release_gate_cog_internal_overview_shape_matches_factors` | renamed; marker added | +| `test_release_gate_overview_sidecar_metadata_2341.py::test_sidecar_overview_metadata_survives[reader]` (2) | `test_release_gate_sidecar_overview_metadata_survives[reader]` | renamed; marker added | +| `test_release_gate_overview_sidecar_metadata_2341.py::test_sidecar_overview_transform_scales[reader]` (2) | `test_release_gate_sidecar_overview_transform_scales[reader]` | renamed; marker added | +| 
`test_release_gate_overview_sidecar_metadata_2341.py::test_sidecar_overview_shape_matches_factors` | `test_release_gate_sidecar_overview_shape_matches_factors` | renamed; marker added | +| `test_release_gate_overview_sidecar_metadata_2341.py::test_internal_vs_sidecar_metadata_agree[reader]` (2) | `test_release_gate_internal_vs_sidecar_metadata_agree[reader]` | renamed; marker added | +| `test_release_gate_windowed_reads_2341.py::test_release_gate_windowed_read_shape[r-corpus-window]` (16) | `test_release_gate_windowed_read_shape[r-corpus-window]` | corpus fixture renamed to `_wsp_corpus_file`; same IDs | +| `test_release_gate_windowed_reads_2341.py::test_release_gate_windowed_read_coords_slice[r-corpus-window]` (16) | `test_release_gate_windowed_read_coords_slice[r-corpus-window]` | same | +| `test_release_gate_windowed_reads_2341.py::test_release_gate_windowed_read_transform_shifted[r-corpus-window]` (16) | `test_release_gate_windowed_read_transform_shifted[r-corpus-window]` | same | +| `test_release_gate_windowed_reads_2341.py::test_release_gate_windowed_read_canonical_attrs_unchanged[r-corpus-window]` (16) | `test_release_gate_windowed_read_canonical_attrs_unchanged[r-corpus-window]` | same | +| `test_release_gate_negative_2341.py::test_release_gate_negative_conflicting_aux_xml_crs` | `test_release_gate_negative_conflicting_aux_xml_crs` | unchanged; remains `xfail strict=False` | +| `test_release_gate_negative_2341.py::test_release_gate_negative_integer_nodata_float_promoted` | `test_release_gate_negative_integer_nodata_float_promoted` | unchanged | +| `test_release_gate_negative_2341.py::test_release_gate_negative_rotated_eager` | `test_release_gate_negative_rotated_eager` | now carries `@pytest.mark.release_gate` (previously unmarked) | +| `test_release_gate_negative_2341.py::test_release_gate_negative_rotated_dask` | `test_release_gate_negative_rotated_dask` | marker added | +| `test_release_gate_negative_2341.py::test_release_gate_negative_rotated_windowed` 
| `test_release_gate_negative_rotated_windowed` | marker added | +| `test_release_gate_negative_2341.py::test_release_gate_negative_rotated_gpu` | `test_release_gate_negative_rotated_gpu` | marker added; `requires_gpu` imported from `_helpers.markers` instead of the slim conftest re-export | +| `test_release_gate_negative_2341.py::test_release_gate_negative_mixed_tier_vrt_children` | `test_release_gate_negative_mixed_tier_vrt_children` | unchanged | +| `test_release_gate_2321.py::test_release_gate_cites_only_existing_test_files` | `test_release_gate_cites_only_existing_test_files` | now carries `@pytest.mark.release_gate`; self-reference path updated to `release_gates/test_stable_features.py` | +| `test_release_gate_2321.py::test_release_gate_lists_every_promised_supported_feature` | `test_release_gate_lists_every_promised_supported_feature` | marker added | +| `test_release_gate_2321.py::test_release_gate_http_ssrf_rejects_loopback` | `test_release_gate_http_ssrf_rejects_loopback` | marker added | +| `test_release_gate_2321.py::test_release_gate_http_ssrf_rejects_loopback_uppercase_scheme` | `test_release_gate_http_ssrf_rejects_loopback_uppercase_scheme` | marker added; xfail kept | +| `test_release_gate_2321.py::test_release_gate_vrt_rows_point_at_real_test_functions` | `test_release_gate_vrt_rows_point_at_real_test_functions` | marker added | + +## Helper-function collisions + +Four source files defined `_write_known_good` and two defined +`_make_data_array`. Helpers carry section prefixes in the consolidated +file (`_local_read_write_known_good`, `_local_write_make_data_array`, +`_dask_parity_write_known_good`, `_attrs_write_known_good`, +`_cog_make_data_array`, etc.) so the consolidation does not introduce +cross-section coupling. + +## Drops / dismissals + +None. Every test from every folded file moved. The `release_gate` +marker now covers all 159 tests rather than the previous 134. 
+ +## Verification + +``` +pytest xrspatial/geotiff/tests/release_gates/ -v -m release_gate +# 155 passed, 3 xfailed, 1 xpassed +pytest xrspatial/geotiff/tests/ -m release_gate -v +# same 159 tests selected -- no other file carries the marker now +``` + +`-m release_gate` from the wider tests root resolves to the single +registry file. Deletion: this file is removed in the final commit +before merge. diff --git a/xrspatial/geotiff/tests/release_gates/__init__.py b/xrspatial/geotiff/tests/release_gates/__init__.py new file mode 100644 index 00000000..672afe58 --- /dev/null +++ b/xrspatial/geotiff/tests/release_gates/__init__.py @@ -0,0 +1,8 @@ +"""Release-gate test registry. + +Every ``@pytest.mark.release_gate`` test for the GeoTIFF surface lives +under this package. The release process's audit point is one module, +``test_stable_features.py``. To execute the union of every promise the +release notes are allowed to make, run +``pytest xrspatial/geotiff/tests/release_gates/ -m release_gate``. +""" diff --git a/xrspatial/geotiff/tests/release_gates/test_stable_features.py b/xrspatial/geotiff/tests/release_gates/test_stable_features.py new file mode 100644 index 00000000..0cdd8258 --- /dev/null +++ b/xrspatial/geotiff/tests/release_gates/test_stable_features.py @@ -0,0 +1,2593 @@ +"""Single-file registry of every ``@pytest.mark.release_gate`` test. + +The release process needs one audit point for "what does CI run when +validating release readiness". This module is that point: every test +that backs a row in ``docs/source/reference/release_gate_geotiff.rst`` +lives here. 
+ +Organisation +============ + +Tests are grouped by feature area: + +* ``Local read`` (``reader.local_file``, stable) +* ``Local write`` (``writer.local_file``, stable) +* ``Codecs`` (the stable lossless codec set) +* ``COG`` (``writer.cog`` / ``reader.local_cog``, stable) +* ``Windowed read`` (``reader.windowed``, stable) +* ``Dask parity`` (``reader.dask``, stable) +* ``Eager / dask full parity`` (epic #2341, four corpus scenarios) +* ``Attrs contract`` (canonical attrs round-trip, stable) +* ``Codec round-trip`` (cartesian stable codec x dtype, epic #2341) +* ``Overview / sidecar metadata`` (overview level attrs survive) +* ``Windowed reads -- shifted-transform parity`` (epic #2341) +* ``Negative cases`` (ambiguous metadata fails closed) +* ``Cross-cutting meta-gates`` (#2321 -- checklist parity) + +Each section's helper functions are private to that section
+(``_<section>_<helper>``) so the consolidation does not introduce +cross-section coupling. + +History +======= + +This file replaces the per-cluster ``test_release_gate_*.py`` files that +used to live alongside it. Filenames carrying issue numbers +(``_2321``, ``_2341``, ...) are gone -- the git log is the audit trail +for which PR introduced each gate. The ``release_gate`` marker is the +single signal a release engineer keys on: + +.. code-block:: bash + + pytest xrspatial/geotiff/tests/release_gates/ -m release_gate +""" +from __future__ import annotations + +import importlib.util +import os +import re +import struct +import uuid +from pathlib import Path +from typing import Any + +import numpy as np +import pytest +import xarray as xr + +from xrspatial.geotiff import ( + SUPPORTED_FEATURES, + UnsafeURLError, + open_geotiff, + read_geotiff_dask, + to_geotiff, +) +from xrspatial.geotiff._compression import ( + COMPRESSION_DEFLATE, + COMPRESSION_LZW, + COMPRESSION_NONE, + COMPRESSION_PACKBITS, + COMPRESSION_ZSTD, +) +from xrspatial.geotiff._errors import ( + GeoTIFFAmbiguousMetadataError, + RotatedTransformError, +) +from xrspatial.geotiff._geotags import GeoTransform +from xrspatial.geotiff._header import parse_header, parse_ifd +from xrspatial.geotiff._writer import write +from xrspatial.geotiff.tests._helpers.markers import requires_gpu + +# =========================================================================== # +# Shared codec constants # +# =========================================================================== # +# +# The stable lossless codec set is named here once so the codec, COG, and +# codec-round-trip sections below all share it. Keeps the cross-check +# against ``SUPPORTED_FEATURES`` in one place too. 
+ +STABLE_LOSSLESS_CODECS = ("none", "deflate", "lzw", "packbits", "zstd") + +_CODEC_TO_TIFF_TAG = { + "none": COMPRESSION_NONE, + "deflate": COMPRESSION_DEFLATE, + "lzw": COMPRESSION_LZW, + "zstd": COMPRESSION_ZSTD, + "packbits": COMPRESSION_PACKBITS, +} + + +# =========================================================================== # +# Section: Local read (reader.local_file, stable) # +# =========================================================================== # +# +# The most basic promise the GeoTIFF module makes to a user: ``open_geotiff`` +# reads a local GeoTIFF and the result carries the pixels, the CRS, the +# transform, and the nodata sentinel from the file. + +# A tiny axis-aligned grid is enough to lock the contract. The distinctive +# per-row pattern means a single-axis drift in the writer or reader still +# fails the equality check. +_LOCAL_READ_PIXELS = np.array( + [ + [10.0, 20.0, 30.0, 40.0], + [11.0, 21.0, 31.0, 41.0], + [12.0, 22.0, 32.0, 42.0], + [13.0, 23.0, 33.0, 43.0], + ], + dtype=np.float32, +) +_LOCAL_READ_EPSG = 3857 +_LOCAL_READ_ORIGIN_X = 500000.0 +_LOCAL_READ_ORIGIN_Y = 4000000.0 +_LOCAL_READ_PIXEL_W = 30.0 +_LOCAL_READ_PIXEL_H = -30.0 +_LOCAL_READ_EXPECTED_TRANSFORM = ( + _LOCAL_READ_PIXEL_W, 0.0, _LOCAL_READ_ORIGIN_X, + 0.0, _LOCAL_READ_PIXEL_H, _LOCAL_READ_ORIGIN_Y, +) + + +def _local_read_write_known_good( + path: str, *, nodata: float | None = None, +) -> None: + """Write a known-good GeoTIFF using the explicit ``geo_transform`` path.""" + gt = GeoTransform( + origin_x=_LOCAL_READ_ORIGIN_X, + origin_y=_LOCAL_READ_ORIGIN_Y, + pixel_width=_LOCAL_READ_PIXEL_W, + pixel_height=_LOCAL_READ_PIXEL_H, + ) + write( + _LOCAL_READ_PIXELS, + path, + geo_transform=gt, + crs_epsg=_LOCAL_READ_EPSG, + nodata=nodata, + compression="none", + tiled=False, + ) + + +@pytest.mark.release_gate +def test_release_gate_local_read_pixels(tmp_path) -> None: + """Pixel bytes survive the read.""" + path = str(tmp_path / "release_gate_local_read.tif") + 
_local_read_write_known_good(path) + + da = open_geotiff(path) + assert da.dtype == np.float32, ( + f"release gate: local read promoted dtype to {da.dtype!r}; the " + "release contract is that float32 stays float32 unless a " + "nodata sentinel forces promotion" + ) + np.testing.assert_array_equal( + np.asarray(da.values), + _LOCAL_READ_PIXELS, + err_msg=( + "release gate: local read returned different pixels than the " + "writer emitted; the byte-for-byte round trip is the most " + "basic promise the release notes make" + ), + ) + + +@pytest.mark.release_gate +def test_release_gate_local_read_crs(tmp_path) -> None: + """``attrs['crs']`` round-trips as the source EPSG.""" + path = str(tmp_path / "release_gate_local_read_crs.tif") + _local_read_write_known_good(path) + + da = open_geotiff(path) + crs = da.attrs.get("crs") + assert crs is not None, ( + "release gate: local read dropped ``attrs['crs']``; the release " + "contract promises that an EPSG-coded source surfaces its CRS" + ) + assert int(crs) == _LOCAL_READ_EPSG, ( + f"release gate: ``attrs['crs']`` drifted from {_LOCAL_READ_EPSG} " + f"to {crs!r}; this changes the release notes contract for " + "``reader.local_file``" + ) + + +@pytest.mark.release_gate +def test_release_gate_local_read_transform(tmp_path) -> None: + """``attrs['transform']`` is the 6-tuple GeoTransform the file carried.""" + path = str(tmp_path / "release_gate_local_read_transform.tif") + _local_read_write_known_good(path) + + da = open_geotiff(path) + transform = da.attrs.get("transform") + assert transform is not None, ( + "release gate: local read dropped ``attrs['transform']``; the " + "release contract promises a 6-tuple GeoTransform on every " + "georeferenced read" + ) + assert len(transform) == 6, ( + f"release gate: transform tuple is no longer length 6: " + f"{transform!r}; release notes promise the rasterio-style 6-tuple" + ) + for got, want in zip(transform, _LOCAL_READ_EXPECTED_TRANSFORM): + # Floats compared at float 
precision because the writer encodes the + # transform as doubles in the GeoTIFF tags. + assert got == pytest.approx(want, abs=1e-12, rel=1e-12), ( + f"release gate: transform tuple drifted: got {transform!r} " + f"want {_LOCAL_READ_EXPECTED_TRANSFORM!r}" + ) + + +@pytest.mark.release_gate +def test_release_gate_local_read_nodata(tmp_path) -> None: + """``attrs['nodata']`` reflects the on-disk sentinel.""" + path = str(tmp_path / "release_gate_local_read_nodata.tif") + sentinel = -9999.0 + _local_read_write_known_good(path, nodata=sentinel) + + da = open_geotiff(path) + nodata = da.attrs.get("nodata") + assert nodata is not None, ( + "release gate: declared nodata sentinel was dropped on read; " + "the release contract promises that a declared sentinel " + "surfaces in ``attrs['nodata']``" + ) + assert float(nodata) == pytest.approx(sentinel, abs=0.0), ( + f"release gate: ``attrs['nodata']`` drifted from {sentinel} to " + f"{nodata!r}" + ) + + +# =========================================================================== # +# Section: Local write (writer.local_file, stable) # +# =========================================================================== # + +def _local_write_make_data_array( + *, nodata: float | None = None, +) -> xr.DataArray: + """Build a small DataArray with explicit y/x coords for the writer gate.""" + pixels = np.array( + [ + [1.0, 2.0, 3.0, 4.0], + [5.0, 6.0, 7.0, 8.0], + [9.0, 10.0, 11.0, 12.0], + [13.0, 14.0, 15.0, 16.0], + ], + dtype=np.float32, + ) + y = np.array([3999985.0, 3999955.0, 3999925.0, 3999895.0]) + x = np.array([500015.0, 500045.0, 500075.0, 500105.0]) + attrs: dict = {"crs": 32610} + if nodata is not None: + attrs["nodata"] = nodata + return xr.DataArray( + pixels, + dims=("y", "x"), + coords={"y": y, "x": x}, + attrs=attrs, + ) + + +@pytest.mark.release_gate +def test_release_gate_local_write_round_trips_pixels(tmp_path) -> None: + """``to_geotiff`` writes a file that reads back bit-exact.""" + da = 
_local_write_make_data_array() + path = str(tmp_path / "release_gate_local_write_pixels.tif") + to_geotiff(da, path, compression="none", tiled=False) + + out = open_geotiff(path) + assert out.dtype == np.float32, ( + f"release gate: write -> read flipped dtype to {out.dtype!r}; " + "the release contract promises float32 stays float32 absent a " + "nodata sentinel" + ) + np.testing.assert_array_equal( + np.asarray(out.values), + np.asarray(da.values), + err_msg=( + "release gate: write -> read changed pixel values; " + "to_geotiff is promised to be lossless for the default " + "'none' codec" + ), + ) + + +@pytest.mark.release_gate +def test_release_gate_local_write_preserves_crs(tmp_path) -> None: + """The CRS survives the write -> read round trip.""" + da = _local_write_make_data_array() + path = str(tmp_path / "release_gate_local_write_crs.tif") + to_geotiff(da, path, compression="none", tiled=False) + + out = open_geotiff(path) + crs = out.attrs.get("crs") + assert crs is not None, ( + "release gate: write -> read dropped ``attrs['crs']``; the " + "release contract requires the CRS to survive" + ) + assert int(crs) == 32610, ( + f"release gate: ``attrs['crs']`` drifted from 32610 to {crs!r}" + ) + + +@pytest.mark.release_gate +def test_release_gate_local_write_preserves_transform(tmp_path) -> None: + """The GeoTransform survives the write -> read round trip.""" + da = _local_write_make_data_array() + path = str(tmp_path / "release_gate_local_write_transform.tif") + to_geotiff(da, path, compression="none", tiled=False) + + out = open_geotiff(path) + transform = out.attrs.get("transform") + assert transform is not None, ( + "release gate: write -> read dropped ``attrs['transform']``; " + "the release contract requires the GeoTransform to survive" + ) + assert len(transform) == 6, ( + f"release gate: transform tuple is no longer length 6: " + f"{transform!r}" + ) + assert transform[0] == pytest.approx(30.0, abs=1e-9), ( + f"release gate: pixel_width drifted: 
{transform!r}" + ) + assert transform[4] == pytest.approx(-30.0, abs=1e-9), ( + f"release gate: pixel_height sign or magnitude drifted: " + f"{transform!r}" + ) + assert transform[1] == 0.0 and transform[3] == 0.0, ( + f"release gate: shear terms appeared in axis-aligned write: " + f"{transform!r}" + ) + + +@pytest.mark.release_gate +def test_release_gate_local_write_preserves_nodata(tmp_path) -> None: + """A declared nodata sentinel survives the write -> read round trip.""" + sentinel = -9999.0 + da = _local_write_make_data_array(nodata=sentinel) + path = str(tmp_path / "release_gate_local_write_nodata.tif") + to_geotiff(da, path, compression="none", tiled=False, nodata=sentinel) + + out = open_geotiff(path) + nodata = out.attrs.get("nodata") + assert nodata is not None, ( + "release gate: declared nodata was dropped on write -> read; " + "the release contract promises the sentinel survives" + ) + assert float(nodata) == pytest.approx(sentinel, abs=0.0), ( + f"release gate: ``attrs['nodata']`` drifted from {sentinel} to " + f"{nodata!r}" + ) + + +# =========================================================================== # +# Section: Codecs -- stable lossless set # +# =========================================================================== # +# +# The release contract names a specific set of codecs as ``stable``: +# ``none``, ``deflate``, ``lzw``, ``packbits``, ``zstd``. Every one must +# round-trip pixels byte-for-byte on integer and float dtypes. 
+ +def _codecs_gt() -> GeoTransform: + return GeoTransform( + origin_x=500000.0, + origin_y=4000000.0, + pixel_width=30.0, + pixel_height=-30.0, + ) + + +@pytest.mark.release_gate +@pytest.mark.parametrize("codec", STABLE_LOSSLESS_CODECS) +def test_release_gate_codec_round_trip_uint16(tmp_path, codec) -> None: + """Integer pixel bytes survive every stable lossless codec.""" + arr = np.arange(64, dtype=np.uint16).reshape(8, 8) + path = str(tmp_path / f"release_gate_codec_{codec}_uint16.tif") + write( + arr, + path, + geo_transform=_codecs_gt(), + crs_epsg=32610, + compression=codec, + tiled=False, + ) + + out = open_geotiff(path) + assert out.dtype == np.uint16, ( + f"release gate: codec {codec!r} promoted uint16 to {out.dtype!r}; " + "the lossless contract is that integer dtypes survive every " + "stable codec" + ) + np.testing.assert_array_equal( + np.asarray(out.values), + arr, + err_msg=( + f"release gate: codec {codec!r} did not round-trip uint16 " + "pixels byte-for-byte; the release contract names this codec " + "as lossless" + ), + ) + + +@pytest.mark.release_gate +@pytest.mark.parametrize("codec", STABLE_LOSSLESS_CODECS) +def test_release_gate_codec_round_trip_float32(tmp_path, codec) -> None: + """Float pixel bytes survive every stable lossless codec.""" + arr = np.linspace(-100.0, 100.0, 64, dtype=np.float32).reshape(8, 8) + path = str(tmp_path / f"release_gate_codec_{codec}_float32.tif") + write( + arr, + path, + geo_transform=_codecs_gt(), + crs_epsg=32610, + compression=codec, + tiled=False, + ) + + out = open_geotiff(path) + assert out.dtype == np.float32, ( + f"release gate: codec {codec!r} promoted float32 to " + f"{out.dtype!r}" + ) + np.testing.assert_array_equal( + np.asarray(out.values), + arr, + err_msg=( + f"release gate: codec {codec!r} did not round-trip float32 " + "pixels byte-for-byte; the release contract names this codec " + "as lossless" + ), + ) + + +@pytest.mark.release_gate +def 
test_release_gate_codec_stable_set_matches_supported_features() -> None: + """The stable codec list matches ``SUPPORTED_FEATURES``. + + If a codec is promoted into ``stable`` (or demoted out) in + :data:`xrspatial.geotiff.SUPPORTED_FEATURES` without updating this + file, the release gate is out of sync with the runtime contract. + Fail loudly here so the PR that changes the tier also updates the + gate. + """ + stable_from_constant = { + key.split(".", 1)[1] + for key, tier in SUPPORTED_FEATURES.items() + if key.startswith("codec.") and tier == "stable" + } + assert stable_from_constant == set(STABLE_LOSSLESS_CODECS), ( + "release gate: STABLE_LOSSLESS_CODECS drifted from " + "SUPPORTED_FEATURES; the gate and the runtime tier table must " + "agree on which codecs are stable. " + f"constant: {set(STABLE_LOSSLESS_CODECS)!r}; " + f"SUPPORTED_FEATURES: {stable_from_constant!r}" + ) + + +# =========================================================================== # +# Section: COG (writer.cog / reader.local_cog, stable) # +# =========================================================================== # +# +# The release contract: ``to_geotiff(cog=True)`` writes a file that +# ``open_geotiff`` reads back bit-exact, with CRS, transform, and (when +# declared) nodata preserved across every stable ``compression`` codec. + +_COG_W = 32 +_COG_H = 32 + + +def _cog_make_data_array(*, nodata: float | None = None) -> xr.DataArray: + pixels = np.arange(_COG_H * _COG_W, dtype=np.float32).reshape(_COG_H, _COG_W) + # Pixel-center coords, 30 m pixels, top-left at (500000, 4000000). 
+ y = np.array( + [4000000.0 - 15.0 - 30.0 * i for i in range(_COG_H)], + dtype=np.float64, + ) + x = np.array( + [500000.0 + 15.0 + 30.0 * i for i in range(_COG_W)], + dtype=np.float64, + ) + attrs: dict = {"crs": 32610} + if nodata is not None: + attrs["nodata"] = nodata + return xr.DataArray( + pixels, + dims=("y", "x"), + coords={"y": y, "x": x}, + attrs=attrs, + ) + + +@pytest.mark.release_gate +@pytest.mark.parametrize("codec", STABLE_LOSSLESS_CODECS) +def test_release_gate_cog_round_trips_pixels(tmp_path, codec) -> None: + """COG write -> read returns the same pixels under every stable codec.""" + da = _cog_make_data_array() + path = str(tmp_path / f"release_gate_cog_{codec}_pixels.tif") + to_geotiff( + da, + path, + compression=codec, + cog=True, + tiled=True, + tile_size=16, + ) + + out = open_geotiff(path) + assert out.dtype == np.float32, ( + f"release gate: COG with codec {codec!r} promoted dtype to " + f"{out.dtype!r}" + ) + np.testing.assert_array_equal( + np.asarray(out.values), + np.asarray(da.values), + err_msg=( + f"release gate: COG with codec {codec!r} did not round-trip " + "pixels byte-for-byte" + ), + ) + + +@pytest.mark.release_gate +@pytest.mark.parametrize("codec", STABLE_LOSSLESS_CODECS) +def test_release_gate_cog_preserves_crs_transform(tmp_path, codec) -> None: + """CRS and transform survive the COG write -> read for every stable codec.""" + da = _cog_make_data_array() + path = str(tmp_path / f"release_gate_cog_{codec}_attrs.tif") + to_geotiff( + da, + path, + compression=codec, + cog=True, + tiled=True, + tile_size=16, + ) + + out = open_geotiff(path) + crs = out.attrs.get("crs") + assert crs is not None and int(crs) == 32610, ( + f"release gate: COG with codec {codec!r} dropped or drifted " + f"``attrs['crs']``: got {crs!r}" + ) + transform = out.attrs.get("transform") + assert transform is not None and len(transform) == 6, ( + f"release gate: COG with codec {codec!r} dropped or reshaped " + f"``attrs['transform']``: got 
{transform!r}" + ) + assert transform[0] == pytest.approx(30.0, abs=1e-9), ( + f"release gate: COG pixel_width drifted under {codec!r}: " + f"{transform!r}" + ) + assert transform[4] == pytest.approx(-30.0, abs=1e-9), ( + f"release gate: COG pixel_height drifted under {codec!r}: " + f"{transform!r}" + ) + + +@pytest.mark.release_gate +@pytest.mark.parametrize("codec", STABLE_LOSSLESS_CODECS) +def test_release_gate_cog_preserves_nodata(tmp_path, codec) -> None: + """A declared nodata sentinel survives COG write -> read under every codec.""" + sentinel = -9999.0 + da = _cog_make_data_array(nodata=sentinel) + path = str(tmp_path / f"release_gate_cog_{codec}_nodata.tif") + to_geotiff( + da, + path, + compression=codec, + nodata=sentinel, + cog=True, + tiled=True, + tile_size=16, + ) + + out = open_geotiff(path) + nodata = out.attrs.get("nodata") + assert nodata is not None, ( + f"release gate: COG with codec {codec!r} dropped declared nodata" + ) + assert float(nodata) == pytest.approx(sentinel, abs=0.0), ( + f"release gate: COG with codec {codec!r} drifted nodata from " + f"{sentinel} to {nodata!r}" + ) + + +# =========================================================================== # +# Section: Windowed read (reader.windowed, stable) # +# =========================================================================== # +# +# A ``(row_start, col_start, row_stop, col_stop)`` window returns the +# exact subset of source pixels, preserves CRS, and shifts the +# transform origin to the window's top-left pixel corner. 
+ +_WINDOWED_H = 10 +_WINDOWED_W = 10 +_WINDOWED_PIXELS = ( + np.arange(_WINDOWED_H, dtype=np.int32).reshape(-1, 1) * 100 + + np.arange(_WINDOWED_W, dtype=np.int32).reshape(1, -1) +).astype(np.int32) +_WINDOWED_ORIGIN_X = 500000.0 +_WINDOWED_ORIGIN_Y = 4000000.0 +_WINDOWED_PIXEL_W = 30.0 +_WINDOWED_PIXEL_H = -30.0 + + +def _windowed_write_known_good(path: str) -> None: + gt = GeoTransform( + origin_x=_WINDOWED_ORIGIN_X, + origin_y=_WINDOWED_ORIGIN_Y, + pixel_width=_WINDOWED_PIXEL_W, + pixel_height=_WINDOWED_PIXEL_H, + ) + write( + _WINDOWED_PIXELS, + path, + geo_transform=gt, + crs_epsg=32610, + compression="none", + tiled=False, + ) + + +@pytest.mark.release_gate +def test_release_gate_windowed_read_returns_subset(tmp_path) -> None: + """A windowed read returns exactly the requested subset.""" + path = str(tmp_path / "release_gate_windowed_read_subset.tif") + _windowed_write_known_good(path) + + row_start, col_start = 2, 3 + row_stop, col_stop = 6, 8 + out = open_geotiff(path, window=(row_start, col_start, row_stop, col_stop)) + + expected = _WINDOWED_PIXELS[row_start:row_stop, col_start:col_stop] + assert out.shape == expected.shape, ( + f"release gate: windowed read shape {out.shape} does not match " + f"the requested window shape {expected.shape}" + ) + np.testing.assert_array_equal( + np.asarray(out.values), + expected, + err_msg=( + "release gate: windowed read returned different pixels than " + "the same rows / cols of the source array; this would silently " + "break every downstream caller that relies on window= for " + "subsetting" + ), + ) + + +@pytest.mark.release_gate +def test_release_gate_windowed_read_preserves_crs(tmp_path) -> None: + """A windowed read carries ``attrs['crs']`` over from the source.""" + path = str(tmp_path / "release_gate_windowed_read_crs.tif") + _windowed_write_known_good(path) + + out = open_geotiff(path, window=(1, 1, 5, 5)) + crs = out.attrs.get("crs") + assert crs is not None and int(crs) == 32610, ( + f"release gate: 
windowed read dropped or drifted " + f"``attrs['crs']``: got {crs!r}" + ) + + +@pytest.mark.release_gate +def test_release_gate_windowed_read_shifts_transform_origin(tmp_path) -> None: + """The transform origin shifts to the window's top-left pixel. + + For a window starting at ``(row, col) = (2, 3)`` on a grid with pixel + width ``+30`` and pixel height ``-30``, the new origin is + ``(origin_x + 3 * 30, origin_y + 2 * -30)``. + """ + path = str(tmp_path / "release_gate_windowed_read_transform.tif") + _windowed_write_known_good(path) + + row_start, col_start = 2, 3 + out = open_geotiff(path, window=(row_start, col_start, 6, 8)) + transform = out.attrs.get("transform") + assert transform is not None and len(transform) == 6, ( + f"release gate: windowed read dropped or reshaped transform: " + f"{transform!r}" + ) + assert transform[0] == pytest.approx(_WINDOWED_PIXEL_W, abs=1e-9), ( + f"release gate: windowed read changed pixel_width: {transform!r}" + ) + assert transform[4] == pytest.approx(_WINDOWED_PIXEL_H, abs=1e-9), ( + f"release gate: windowed read changed pixel_height: {transform!r}" + ) + expected_origin_x = _WINDOWED_ORIGIN_X + col_start * _WINDOWED_PIXEL_W + expected_origin_y = _WINDOWED_ORIGIN_Y + row_start * _WINDOWED_PIXEL_H + assert transform[2] == pytest.approx(expected_origin_x, abs=1e-9), ( + f"release gate: windowed read origin_x did not shift to the " + f"window's left edge: got {transform[2]!r} expected " + f"{expected_origin_x!r}" + ) + assert transform[5] == pytest.approx(expected_origin_y, abs=1e-9), ( + f"release gate: windowed read origin_y did not shift to the " + f"window's top edge: got {transform[5]!r} expected " + f"{expected_origin_y!r}" + ) + + +@pytest.mark.release_gate +def test_release_gate_windowed_read_full_extent_matches_unwindowed( + tmp_path, +) -> None: + """``window=(0, 0, H, W)`` returns the same pixels as no window.""" + path = str(tmp_path / "release_gate_windowed_read_full.tif") + _windowed_write_known_good(path) + + full 
= open_geotiff(path) + windowed = open_geotiff(path, window=(0, 0, _WINDOWED_H, _WINDOWED_W)) + assert windowed.shape == full.shape, ( + f"release gate: full-extent window shape drift: " + f"{windowed.shape} vs {full.shape}" + ) + np.testing.assert_array_equal( + np.asarray(windowed.values), + np.asarray(full.values), + err_msg=( + "release gate: full-extent window returned different pixels " + "than the unwindowed read" + ), + ) + + +# =========================================================================== # +# Section: Dask parity (reader.dask, stable) # +# =========================================================================== # +# +# Dask reads of a local GeoTIFF must return the same pixels and canonical +# attrs as the eager (numpy) read. The small one-shot gate below is the +# release-engineer-facing test; the wider parity matrix lives elsewhere. + +pytest.importorskip("dask") + + +def _dask_parity_write_known_good(path: str) -> np.ndarray: + """Write a small tiled GeoTIFF and return the source array.""" + arr = np.arange(256, dtype=np.float32).reshape(16, 16) + gt = GeoTransform( + origin_x=500000.0, + origin_y=4000000.0, + pixel_width=30.0, + pixel_height=-30.0, + ) + write( + arr, + path, + geo_transform=gt, + crs_epsg=32610, + compression="deflate", + tiled=True, + tile_size=16, + ) + return arr + + +@pytest.mark.release_gate +def test_release_gate_dask_read_matches_eager_pixels(tmp_path) -> None: + """The dask backend returns the same pixels as the eager backend.""" + path = str(tmp_path / "release_gate_dask_parity_pixels.tif") + _dask_parity_write_known_good(path) + + eager = open_geotiff(path) + lazy = open_geotiff(path, chunks=8) + + lazy_values = np.asarray(lazy.values) + eager_values = np.asarray(eager.values) + np.testing.assert_array_equal( + lazy_values, + eager_values, + err_msg=( + "release gate: dask backend returned different pixels than " + "the eager backend; the release contract promises dask read " + "parity for the local-file 
stable path" + ), + ) + assert lazy.dtype == eager.dtype, ( + f"release gate: dask backend changed dtype from {eager.dtype!r} " + f"to {lazy.dtype!r}" + ) + assert lazy.shape == eager.shape, ( + f"release gate: dask backend changed shape from {eager.shape!r} " + f"to {lazy.shape!r}" + ) + + +@pytest.mark.release_gate +def test_release_gate_dask_read_matches_eager_attrs(tmp_path) -> None: + """The dask backend produces the same canonical attrs as eager.""" + path = str(tmp_path / "release_gate_dask_parity_attrs.tif") + _dask_parity_write_known_good(path) + + eager = open_geotiff(path) + lazy = open_geotiff(path, chunks=8) + + canonical = ("crs", "transform", "georef_status") + for key in canonical: + assert key in eager.attrs, ( + f"release gate: eager read is missing canonical attr " + f"{key!r}; cannot compare backends" + ) + assert key in lazy.attrs, ( + f"release gate: dask read is missing canonical attr " + f"{key!r}; the release contract requires backend parity on " + "canonical attrs" + ) + eager_v = eager.attrs[key] + lazy_v = lazy.attrs[key] + if key == "transform": + assert len(eager_v) == len(lazy_v) == 6 + for a, b in zip(eager_v, lazy_v): + assert a == pytest.approx(b, abs=1e-12, rel=1e-12), ( + f"release gate: transform drifted across backends: " + f"eager={eager_v!r} lazy={lazy_v!r}" + ) + else: + assert eager_v == lazy_v, ( + f"release gate: ``attrs[{key!r}]`` drifted across " + f"backends: eager={eager_v!r} lazy={lazy_v!r}" + ) + + +@pytest.mark.release_gate +def test_release_gate_dask_read_is_lazy(tmp_path) -> None: + """A ``chunks=`` read produces a dask-backed DataArray. + + Without this assertion, a regression that silently materialised the + dask path into numpy could pass the pixel-parity test above without + anyone noticing. The dask backend's defining property is laziness; + pin it. 
+ """ + import dask.array as da_mod + + path = str(tmp_path / "release_gate_dask_parity_lazy.tif") + _dask_parity_write_known_good(path) + + lazy = open_geotiff(path, chunks=8) + assert isinstance(lazy.data, da_mod.Array), ( + f"release gate: chunks= read returned a non-dask array of type " + f"{type(lazy.data).__name__}; the release contract promises a " + "dask-backed DataArray when chunks= is set" + ) + + +# =========================================================================== # +# Section: Eager / dask full parity (epic #2341) # +# =========================================================================== # +# +# Pixels matching while ``attrs``, ``coords``, or ``dims`` silently +# disagree between the eager and dask paths is the highest release risk +# for the GeoTIFF surface. This block reads each fixture in a small +# representative corpus once through ``open_geotiff`` and once through +# ``read_geotiff_dask``, then asserts full raster equivalence. + +# Corpus fixtures live under ``golden_corpus/fixtures``. +_EAGER_DASK_FIXTURES_DIR = ( + Path(__file__).resolve().parents[1] / "golden_corpus" / "fixtures" +) +_EAGER_DASK_CHUNK_SIZE = 32 + +# The seven release-attr keys the parity contract pins. +_EAGER_DASK_RELEASE_ATTR_KEYS: tuple[str, ...] 
= ( + "transform", + "crs", + "crs_wkt", + "nodata", + "masked_nodata", + "georef_status", + "raster_type", +) + +_EAGER_DASK_CORPUS = [ + pytest.param( + "nodata_int_sentinel_uint16", {}, id="int-dtype-nodata", + ), + pytest.param( + "nodata_nan_float32", {}, id="float-dtype-nan-nodata", + ), + pytest.param( + "nodata_miniswhite_uint8", {}, id="miniswhite", + ), + pytest.param( + "nodata_int_sentinel_uint16", + {"mask_nodata": False}, + id="masked-nodata-lifecycle", + ), +] + + +def _eager_dask_materialise(da: xr.DataArray) -> np.ndarray: + return np.asarray(da.values) + + +def _eager_dask_assert_values_equal( + eager: xr.DataArray, lazy: xr.DataArray, +) -> None: + assert eager.dtype == lazy.dtype, ( + f"pixel dtype differs: eager={eager.dtype} lazy={lazy.dtype}" + ) + eager_px = _eager_dask_materialise(eager) + lazy_px = _eager_dask_materialise(lazy) + assert eager_px.shape == lazy_px.shape, ( + f"pixel shape differs: eager={eager_px.shape} lazy={lazy_px.shape}" + ) + equal_nan = eager_px.dtype.kind == "f" + if not np.array_equal(eager_px, lazy_px, equal_nan=equal_nan): + raise AssertionError( + "pixel values differ between eager and dask reads " + f"(dtype={eager_px.dtype}, equal_nan={equal_nan})" + ) + + +def _eager_dask_assert_dims_equal( + eager: xr.DataArray, lazy: xr.DataArray, +) -> None: + assert eager.dims == lazy.dims, ( + f"dims differ: eager={eager.dims!r} lazy={lazy.dims!r}" + ) + + +def _eager_dask_assert_coords_equal( + eager: xr.DataArray, lazy: xr.DataArray, +) -> None: + eager_coord_names = set(eager.coords) + lazy_coord_names = set(lazy.coords) + assert eager_coord_names == lazy_coord_names, ( + f"coord name set differs: " + f"only-in-eager={sorted(eager_coord_names - lazy_coord_names)} " + f"only-in-lazy={sorted(lazy_coord_names - eager_coord_names)}" + ) + for axis in eager_coord_names: + eager_c = np.asarray(eager.coords[axis].values) + lazy_c = np.asarray(lazy.coords[axis].values) + assert eager_c.dtype == lazy_c.dtype, ( + f"coord 
{axis!r} dtype differs: " + f"eager={eager_c.dtype} lazy={lazy_c.dtype}" + ) + assert eager_c.shape == lazy_c.shape, ( + f"coord {axis!r} shape differs: " + f"eager={eager_c.shape} lazy={lazy_c.shape}" + ) + assert eager_c.tobytes() == lazy_c.tobytes(), ( + f"coord {axis!r} bytes differ between eager and dask reads" + ) + + +def _eager_dask_is_nan_sentinel(value: Any) -> bool: + if value is None: + return False + try: + return bool(np.isnan(float(value))) + except (TypeError, ValueError): + return False + + +def _eager_dask_attr_equal(a: Any, b: Any) -> bool: + if _eager_dask_is_nan_sentinel(a) and _eager_dask_is_nan_sentinel(b): + return True + if isinstance(a, np.ndarray) or isinstance(b, np.ndarray): + return ( + isinstance(a, np.ndarray) + and isinstance(b, np.ndarray) + and np.array_equal(a, b) + ) + if isinstance(a, (tuple, list)) and isinstance(b, (tuple, list)): + if len(a) != len(b): + return False + return all(_eager_dask_attr_equal(x, y) for x, y in zip(a, b)) + return a == b + + +def _eager_dask_assert_release_attrs_equal( + eager: xr.DataArray, lazy: xr.DataArray, +) -> None: + for key in _EAGER_DASK_RELEASE_ATTR_KEYS: + in_eager = key in eager.attrs + in_lazy = key in lazy.attrs + assert in_eager == in_lazy, ( + f"release attr {key!r} presence differs: " + f"eager={in_eager} lazy={in_lazy}" + ) + if not in_eager: + continue + eager_v = eager.attrs[key] + lazy_v = lazy.attrs[key] + assert _eager_dask_attr_equal(eager_v, lazy_v), ( + f"release attr {key!r} value differs: " + f"eager={eager_v!r} lazy={lazy_v!r}" + ) + + +@pytest.mark.release_gate +@pytest.mark.parametrize("fixture_id, open_kwargs", _EAGER_DASK_CORPUS) +def test_release_gate_eager_dask_full_parity( + fixture_id: str, open_kwargs: dict, +) -> None: + """Eager and dask reads of the same file agree on the full contract.""" + path = _EAGER_DASK_FIXTURES_DIR / f"{fixture_id}.tif" + if not path.exists(): + pytest.skip( + f"fixture {fixture_id!r} has no .tif on disk; run " + f"`python -m 
xrspatial.geotiff.tests.golden_corpus.generate`" + ) + + eager = open_geotiff(str(path), **open_kwargs) + lazy = read_geotiff_dask( + str(path), chunks=_EAGER_DASK_CHUNK_SIZE, **open_kwargs, + ) + + _eager_dask_assert_values_equal(eager, lazy) + _eager_dask_assert_dims_equal(eager, lazy) + _eager_dask_assert_coords_equal(eager, lazy) + _eager_dask_assert_release_attrs_equal(eager, lazy) + + +@pytest.mark.release_gate +def test_release_gate_corpus_is_non_empty() -> None: + """The corpus list must not silently shrink to zero rows.""" + assert len(_EAGER_DASK_CORPUS) == 4, ( + f"corpus row count drifted: expected 4 scenarios " + f"(int-nodata, float-nan-nodata, miniswhite, " + f"masked-nodata-lifecycle), got {len(_EAGER_DASK_CORPUS)}" + ) + + +# =========================================================================== # +# Section: Attrs contract (canonical attrs, stable) # +# =========================================================================== # +# +# Every georeferenced read produces a DataArray whose ``attrs`` carry, at +# minimum, ``crs``, ``crs_wkt``, ``transform``, ``georef_status``, the +# contract version stamp, and (when declared) ``nodata``. These attrs +# survive a write -> read round trip. 
+ +_ATTRS_CANONICAL_KEYS = ( + "_xrspatial_geotiff_contract", + "crs", + "crs_wkt", + "transform", + "georef_status", +) + + +def _attrs_write_known_good( + path: str, *, nodata: float | None = None, +) -> None: + arr = np.arange(16, dtype=np.float32).reshape(4, 4) + gt = GeoTransform( + origin_x=500000.0, + origin_y=4000000.0, + pixel_width=30.0, + pixel_height=-30.0, + ) + write( + arr, + path, + geo_transform=gt, + crs_epsg=32610, + nodata=nodata, + compression="none", + tiled=False, + ) + + +@pytest.mark.release_gate +def test_release_gate_attrs_canonical_keys_present(tmp_path) -> None: + """A georeferenced read carries every canonical attrs key.""" + path = str(tmp_path / "release_gate_attrs_canonical.tif") + _attrs_write_known_good(path) + + da = open_geotiff(path) + missing = [k for k in _ATTRS_CANONICAL_KEYS if k not in da.attrs] + assert not missing, ( + "release gate: canonical attrs keys missing from a georeferenced " + f"read: {missing}; release notes promise every key in " + f"{list(_ATTRS_CANONICAL_KEYS)}" + ) + + +@pytest.mark.release_gate +def test_release_gate_attrs_georef_status_full(tmp_path) -> None: + """A fully-georeferenced read reports ``georef_status='full'``.""" + path = str(tmp_path / "release_gate_attrs_georef_status.tif") + _attrs_write_known_good(path) + + da = open_geotiff(path) + status = da.attrs.get("georef_status") + assert status == "full", ( + f"release gate: a CRS+transform read should report " + f"``georef_status='full'``; got {status!r}. The five canonical " + "georef_status values are the contract downstream code branches on" + ) + + +@pytest.mark.release_gate +def test_release_gate_attrs_contract_version_is_int(tmp_path) -> None: + """``attrs['_xrspatial_geotiff_contract']`` is an int. + + The contract version is the downstream signal for which attrs shape + the array carries. A drift from int to string (or to a Python object) + would silently break callers that compare versions. 
+ """ + path = str(tmp_path / "release_gate_attrs_contract_version.tif") + _attrs_write_known_good(path) + + da = open_geotiff(path) + version = da.attrs.get("_xrspatial_geotiff_contract") + assert isinstance(version, int), ( + f"release gate: contract version stamp is not int: type=" + f"{type(version).__name__}, value={version!r}" + ) + assert version >= 1, ( + f"release gate: contract version stamp is non-positive: {version!r}" + ) + + +@pytest.mark.release_gate +def test_release_gate_attrs_round_trip_preserves_crs_transform_nodata( + tmp_path, +) -> None: + """Canonical attrs survive ``write -> read -> write -> read``.""" + src = str(tmp_path / "release_gate_attrs_rt_src.tif") + _attrs_write_known_good(src, nodata=-9999.0) + + first = open_geotiff(src) + crs_first = int(first.attrs["crs"]) + transform_first = tuple(first.attrs["transform"]) + nodata_first = float(first.attrs["nodata"]) + + rewrite = str(tmp_path / "release_gate_attrs_rt_rewrite.tif") + to_geotiff(first, rewrite, compression="none", tiled=False) + + second = open_geotiff(rewrite) + assert int(second.attrs["crs"]) == crs_first, ( + f"release gate: CRS drifted across round-trip: {crs_first} -> " + f"{second.attrs['crs']!r}" + ) + transform_second = tuple(second.attrs["transform"]) + assert len(transform_second) == 6, ( + f"release gate: transform reshaped across round-trip: " + f"{transform_second!r}" + ) + for got, want in zip(transform_second, transform_first): + assert got == pytest.approx(want, abs=1e-12, rel=1e-12), ( + f"release gate: transform drifted across round-trip: " + f"{transform_first!r} -> {transform_second!r}" + ) + assert float(second.attrs["nodata"]) == pytest.approx( + nodata_first, abs=0.0, + ), ( + f"release gate: nodata drifted across round-trip: " + f"{nodata_first} -> {second.attrs['nodata']!r}" + ) + + +# =========================================================================== # +# Section: Codec round-trip (cartesian stable codec x dtype, epic #2341) # +# 
=========================================================================== # +# +# Cartesian product of every stable codec with every promised dtype, +# asserting both pixel equality AND release-attr equality through a full +# read/write/read cycle. + +_ROUND_TRIP_DTYPES = ("int16", "int32", "float32", "float64") + +_ROUND_TRIP_INT_NODATA = { + "int16": np.int16(-32768), + "int32": np.int32(-2147483648), +} + +_ROUND_TRIP_RELEASE_ATTR_KEYS = ( + "transform", + "crs", + "crs_wkt", + "nodata", + "masked_nodata", + "georef_status", + "raster_type", +) + + +def _round_trip_make_input(dtype_name: str) -> xr.DataArray: + """Build a 128x128 DataArray of the given dtype.""" + dtype = np.dtype(dtype_name) + height, width = 128, 128 + n = height * width + if np.issubdtype(dtype, np.floating): + arr = np.linspace(-100.0, 100.0, n, dtype=dtype).reshape(height, width) + arr[0, 0] = np.nan + nodata: float | int = float("nan") + else: + # Small positive ramp; the dtype min sentinel never collides with + # a real pixel for the supported int dtypes. 
+ arr = np.arange(n, dtype=dtype).reshape(height, width) + sentinel = _ROUND_TRIP_INT_NODATA[dtype_name] + arr[0, 0] = sentinel + nodata = sentinel + + y = 4000000.0 - 30.0 * (np.arange(height) + 0.5) + x = 500000.0 + 30.0 * (np.arange(width) + 0.5) + attrs: dict = {"crs": 32610, "nodata": nodata} + return xr.DataArray( + arr, + dims=("y", "x"), + coords={"y": y, "x": x}, + attrs=attrs, + ) + + +def _round_trip_canonical_attrs(da: xr.DataArray) -> dict: + """Project ``attrs`` onto the release-attr key set.""" + out = {} + for key in _ROUND_TRIP_RELEASE_ATTR_KEYS: + if key == "raster_type": + out[key] = da.attrs.get("raster_type", "area") + else: + out[key] = da.attrs.get(key) + return out + + +def _round_trip_read_tiff_compression_tag(path: str) -> int: + """Read the on-disk TIFF Compression tag from the first IFD.""" + with open(path, "rb") as fh: + data = fh.read() + header = parse_header(data) + ifd = parse_ifd(data, header.first_ifd_offset, header) + return ifd.compression + + +def _round_trip_assert_pixels_equal( + actual: np.ndarray, expected: np.ndarray, *, + codec: str, dtype_name: str, +) -> None: + """NaN-aware byte-exact pixel comparison.""" + assert actual.shape == expected.shape, ( + f"release gate: codec {codec!r} dtype {dtype_name!r} reshaped the " + f"array across the round-trip: {expected.shape} -> {actual.shape}" + ) + assert actual.dtype == expected.dtype, ( + f"release gate: codec {codec!r} promoted dtype {dtype_name!r} to " + f"{actual.dtype!r} across the round-trip" + ) + if np.issubdtype(expected.dtype, np.floating): + equal = np.array_equal(actual, expected, equal_nan=True) + else: + equal = np.array_equal(actual, expected) + if not equal: + if np.issubdtype(expected.dtype, np.floating): + mismatch_mask = ~( + (actual == expected) | (np.isnan(actual) & np.isnan(expected)) + ) + else: + mismatch_mask = actual != expected + first = np.argwhere(mismatch_mask) + first_idx = tuple(int(v) for v in first[0]) if first.size else None + first_actual = 
( + actual[first_idx] if first_idx is not None else None + ) + first_expected = ( + expected[first_idx] if first_idx is not None else None + ) + raise AssertionError( + f"release gate: codec {codec!r} did not preserve " + f"{dtype_name!r} pixels byte-for-byte; the release contract " + f"names this codec as lossless for this dtype. First " + f"divergence at index {first_idx!r}: actual=" + f"{first_actual!r}, expected={first_expected!r}" + ) + + +@pytest.mark.release_gate +@pytest.mark.parametrize("dtype_name", _ROUND_TRIP_DTYPES) +@pytest.mark.parametrize("codec", STABLE_LOSSLESS_CODECS) +def test_release_gate_codec_round_trip( + tmp_path, codec, dtype_name, +) -> None: + """Stable codec * dtype: pixels and release attrs survive read/write/read.""" + nonce = uuid.uuid4().hex[:8] + write_first = str( + tmp_path / f"release_gate_rt_{codec}_{dtype_name}_first_{nonce}.tif" + ) + write_second = str( + tmp_path / f"release_gate_rt_{codec}_{dtype_name}_second_{nonce}.tif" + ) + + source = _round_trip_make_input(dtype_name) + is_float = np.issubdtype(np.dtype(dtype_name), np.floating) + + mask_kwargs: dict = {} if is_float else {"mask_nodata": False} + + pass_one_kwargs: dict = ( + {} if is_float else {"nodata": source.attrs["nodata"]} + ) + to_geotiff( + source, + write_first, + compression=codec, + tiled=False, + **pass_one_kwargs, + ) + + baseline = open_geotiff(write_first, **mask_kwargs) + baseline_pixels = np.asarray(baseline.values) + baseline_attrs = _round_trip_canonical_attrs(baseline) + + tag_first = _round_trip_read_tiff_compression_tag(write_first) + assert tag_first == _CODEC_TO_TIFF_TAG[codec], ( + f"release gate: codec {codec!r} encoded as TIFF tag {tag_first} on " + f"first write; expected {_CODEC_TO_TIFF_TAG[codec]} per the codec " + f"-> tag map" + ) + + pass_two_kwargs: dict = ( + {} if is_float else {"nodata": baseline.attrs.get("nodata")} + ) + to_geotiff( + baseline, + write_second, + compression=codec, + tiled=False, + **pass_two_kwargs, + ) + + 
second = open_geotiff(write_second, **mask_kwargs) + second_pixels = np.asarray(second.values) + second_attrs = _round_trip_canonical_attrs(second) + + tag_second = _round_trip_read_tiff_compression_tag(write_second) + assert tag_second == _CODEC_TO_TIFF_TAG[codec], ( + f"release gate: codec {codec!r} encoded as TIFF tag {tag_second} on " + f"the second write; expected {_CODEC_TO_TIFF_TAG[codec]} per the " + f"codec -> tag map" + ) + + _round_trip_assert_pixels_equal( + second_pixels, baseline_pixels, codec=codec, dtype_name=dtype_name, + ) + + for key in _ROUND_TRIP_RELEASE_ATTR_KEYS: + want = baseline_attrs[key] + got = second_attrs[key] + if key == "nodata" and isinstance(want, float) and np.isnan(want): + assert isinstance(got, float) and np.isnan(got), ( + f"release gate: codec {codec!r} dtype {dtype_name!r} " + f"dropped NaN nodata across the round-trip: got {got!r}" + ) + continue + if key == "transform": + assert want is not None and got is not None, ( + f"release gate: codec {codec!r} dtype {dtype_name!r} " + f"dropped ``attrs['transform']``: {want!r} -> {got!r}" + ) + assert tuple(got) == tuple(want), ( + f"release gate: codec {codec!r} dtype {dtype_name!r} " + f"drifted ``attrs['transform']``: {want!r} -> {got!r}" + ) + continue + assert got == want, ( + f"release gate: codec {codec!r} dtype {dtype_name!r} drifted " + f"``attrs[{key!r}]`` across the round-trip: {want!r} -> {got!r}" + ) + + +@pytest.mark.release_gate +def test_release_gate_codec_round_trip_stable_set_matches_supported_features() -> None: + """The codec list used by the round-trip gate matches ``SUPPORTED_FEATURES``.""" + stable_from_constant = { + key.split(".", 1)[1] + for key, tier in SUPPORTED_FEATURES.items() + if key.startswith("codec.") and tier == "stable" + } + assert stable_from_constant == set(STABLE_LOSSLESS_CODECS), ( + "release gate: STABLE_LOSSLESS_CODECS drifted from " + "SUPPORTED_FEATURES; the gate and the runtime tier table must " + "agree on which codecs are stable. 
" + f"constant: {set(STABLE_LOSSLESS_CODECS)!r}; " + f"SUPPORTED_FEATURES: {stable_from_constant!r}" + ) + + +# =========================================================================== # +# Section: Overview / sidecar metadata survival (epic #2341) # +# =========================================================================== # +# +# For an internal-overview COG and for a file whose overviews live in an +# external ``.ovr`` sidecar, per-level ``attrs`` agree on the canonical +# metadata set, and ``transform`` scales pixel size by the level factor +# while keeping the origin fixed. + +rasterio = pytest.importorskip("rasterio") +pytest.importorskip("dask.array") + +from rasterio.enums import Resampling # noqa: E402 + +_OVERVIEW_BASE_SIZE = 64 +_OVERVIEW_FACTORS = (2, 4) +_OVERVIEW_BASE_TRANSFORM = (1.0, 0.0, -120.0, 0.0, -1.0, 45.0) +_OVERVIEW_BASE_CRS = 4326 +_OVERVIEW_NODATA = -9999.0 + +_OVERVIEW_EQUAL_KEYS = ( + "crs", + "crs_wkt", + "georef_status", + "raster_type", + "nodata", + "masked_nodata", +) + + +def _overview_make_raster() -> xr.DataArray: + arr = np.arange( + _OVERVIEW_BASE_SIZE * _OVERVIEW_BASE_SIZE, + dtype=np.float32, + ).reshape(_OVERVIEW_BASE_SIZE, _OVERVIEW_BASE_SIZE) + arr[0, 0] = np.nan + return xr.DataArray( + arr, + dims=("y", "x"), + attrs={ + "transform": _OVERVIEW_BASE_TRANSFORM, + "crs": _OVERVIEW_BASE_CRS, + }, + ) + + +def _overview_unique_tmp_path(tmp_path, label: str) -> str: + return str(tmp_path / f"release_gate_overview_{label}_{uuid.uuid4().hex}.tif") + + +def _overview_write_internal_cog(path: str) -> None: + """Write a COG with base + internal overviews at factors 2 and 4.""" + da = _overview_make_raster() + to_geotiff( + da, path, + nodata=_OVERVIEW_NODATA, + cog=True, + compression="deflate", + tiled=True, + tile_size=16, + overview_levels=list(_OVERVIEW_FACTORS), + overview_resampling="nearest", + ) + with rasterio.open(path) as ds: + assert ds.overviews(1) == list(_OVERVIEW_FACTORS), ( + f"writer did not emit the 
requested overview IFDs: " + f"got {ds.overviews(1)}, expected {list(_OVERVIEW_FACTORS)}" + ) + + +def _overview_write_external_sidecar(path: str) -> None: + """Write a tiled TIFF + ``.ovr`` sidecar at factors 2 and 4.""" + da = _overview_make_raster() + to_geotiff( + da, path, + nodata=_OVERVIEW_NODATA, + tiled=True, + tile_size=16, + ) + with rasterio.open(path) as ds: + assert ds.overviews(1) == [], ( + "base file must have no internal overviews before sidecar build" + ) + + with rasterio.Env(TIFF_USE_OVR="YES", COMPRESS_OVERVIEW="DEFLATE"): + with rasterio.open(path, "r+") as ds: + ds.build_overviews(list(_OVERVIEW_FACTORS), Resampling.nearest) + assert os.path.exists(path + ".ovr"), ( + "TIFF_USE_OVR=YES must produce a .ovr sidecar next to the base file" + ) + + +def _overview_assert_metadata_equal_across_levels( + attrs_by_level: dict, +) -> None: + base = attrs_by_level[0] + for key in _OVERVIEW_EQUAL_KEYS: + base_present = key in base + base_val = base.get(key) + for lvl, attrs in attrs_by_level.items(): + if lvl == 0: + continue + other_present = key in attrs + other_val = attrs.get(key) + assert other_present == base_present, ( + f"attrs[{key!r}] presence drifts: base={base_present}, " + f"level={lvl}: {other_present}" + ) + if base_present: + assert other_val == base_val, ( + f"attrs[{key!r}] differs across levels: " + f"base={base_val!r} level={lvl}: {other_val!r}" + ) + + +def _overview_assert_transform_scales( + attrs_by_level: dict, factors: dict, +) -> None: + base = attrs_by_level[0]["transform"] + base_a, base_b, base_c, base_d, base_e, base_f = base + for lvl, attrs in attrs_by_level.items(): + factor = factors[lvl] + t = attrs["transform"] + a, b, c, d, e, f = t + assert a == pytest.approx(base_a * factor), ( + f"level {lvl}: pixel width did not scale by {factor}: " + f"got {a}, expected {base_a * factor}" + ) + assert e == pytest.approx(base_e * factor), ( + f"level {lvl}: pixel height did not scale by {factor}: " + f"got {e}, expected {base_e 
* factor}" + ) + assert c == pytest.approx(base_c), ( + f"level {lvl}: origin x drifted: got {c}, expected {base_c}" + ) + assert f == pytest.approx(base_f), ( + f"level {lvl}: origin y drifted: got {f}, expected {base_f}" + ) + assert b == pytest.approx(0.0) and d == pytest.approx(0.0), ( + f"level {lvl}: axis-aligned transform must not gain rotation " + f"terms, got b={b}, d={d}" + ) + + +def _overview_read_levels_eager(path: str) -> dict: + out = {0: open_geotiff(path)} + for i, _ in enumerate(_OVERVIEW_FACTORS, start=1): + out[i] = open_geotiff(path, overview_level=i) + return out + + +def _overview_read_levels_dask(path: str) -> dict: + out = {0: read_geotiff_dask(path, chunks=8)} + for i, _ in enumerate(_OVERVIEW_FACTORS, start=1): + out[i] = read_geotiff_dask(path, chunks=8, overview_level=i) + return out + + +def _overview_factors_by_level() -> dict: + factors = {0: 1} + for i, f in enumerate(_OVERVIEW_FACTORS, start=1): + factors[i] = f + return factors + + +@pytest.mark.release_gate +@pytest.mark.parametrize("reader", ["eager", "dask"]) +def test_release_gate_cog_internal_overview_metadata_survives( + tmp_path, reader, +) -> None: + """COG with internal overviews preserves the metadata contract.""" + path = _overview_unique_tmp_path(tmp_path, f"cog_meta_{reader}") + _overview_write_internal_cog(path) + + if reader == "eager": + levels = _overview_read_levels_eager(path) + else: + levels = _overview_read_levels_dask(path) + + attrs_by_level = {lvl: da.attrs for lvl, da in levels.items()} + _overview_assert_metadata_equal_across_levels(attrs_by_level) + + base = attrs_by_level[0] + assert base.get("crs") == _OVERVIEW_BASE_CRS + assert base.get("crs_wkt"), "crs_wkt must be set on a CRS-carrying COG" + assert base.get("nodata") == _OVERVIEW_NODATA + assert base.get("masked_nodata") is True + assert base.get("georef_status") == "full" + + +@pytest.mark.release_gate +@pytest.mark.parametrize("reader", ["eager", "dask"]) +def 
test_release_gate_cog_internal_overview_transform_scales( + tmp_path, reader, +) -> None: + """COG with internal overviews preserves origin and scales pixel size.""" + path = _overview_unique_tmp_path(tmp_path, f"cog_xform_{reader}") + _overview_write_internal_cog(path) + + if reader == "eager": + levels = _overview_read_levels_eager(path) + else: + levels = _overview_read_levels_dask(path) + + attrs_by_level = {lvl: da.attrs for lvl, da in levels.items()} + _overview_assert_transform_scales( + attrs_by_level, _overview_factors_by_level(), + ) + + +@pytest.mark.release_gate +def test_release_gate_cog_internal_overview_shape_matches_factors( + tmp_path, +) -> None: + """Shapes follow the decimation factors so the test exercises real overview IFDs.""" + path = _overview_unique_tmp_path(tmp_path, "cog_shape") + _overview_write_internal_cog(path) + + base = open_geotiff(path) + assert base.shape == (_OVERVIEW_BASE_SIZE, _OVERVIEW_BASE_SIZE) + for i, factor in enumerate(_OVERVIEW_FACTORS, start=1): + da = open_geotiff(path, overview_level=i) + expected = _OVERVIEW_BASE_SIZE // factor + assert da.shape == (expected, expected), ( + f"overview_level={i} returned shape {da.shape}, " + f"expected ({expected}, {expected})" + ) + + +@pytest.mark.release_gate +@pytest.mark.parametrize("reader", ["eager", "dask"]) +def test_release_gate_sidecar_overview_metadata_survives( + tmp_path, reader, +) -> None: + """External ``.ovr`` sidecar preserves the metadata contract.""" + path = _overview_unique_tmp_path(tmp_path, f"sidecar_meta_{reader}") + _overview_write_external_sidecar(path) + + if reader == "eager": + levels = _overview_read_levels_eager(path) + else: + levels = _overview_read_levels_dask(path) + + attrs_by_level = {lvl: da.attrs for lvl, da in levels.items()} + _overview_assert_metadata_equal_across_levels(attrs_by_level) + + base = attrs_by_level[0] + assert base.get("crs") == _OVERVIEW_BASE_CRS + assert base.get("crs_wkt"), ( + "crs_wkt must be set when the base file 
carries an EPSG code" + ) + assert base.get("nodata") == _OVERVIEW_NODATA + assert base.get("masked_nodata") is True + assert base.get("georef_status") == "full" + + +@pytest.mark.release_gate +@pytest.mark.parametrize("reader", ["eager", "dask"]) +def test_release_gate_sidecar_overview_transform_scales( + tmp_path, reader, +) -> None: + """External ``.ovr`` sidecar scales pixel size by 2 per level, origin held.""" + path = _overview_unique_tmp_path(tmp_path, f"sidecar_xform_{reader}") + _overview_write_external_sidecar(path) + + if reader == "eager": + levels = _overview_read_levels_eager(path) + else: + levels = _overview_read_levels_dask(path) + + attrs_by_level = {lvl: da.attrs for lvl, da in levels.items()} + _overview_assert_transform_scales( + attrs_by_level, _overview_factors_by_level(), + ) + + +@pytest.mark.release_gate +def test_release_gate_sidecar_overview_shape_matches_factors( + tmp_path, +) -> None: + """Sidecar reads return the right shape per level.""" + path = _overview_unique_tmp_path(tmp_path, "sidecar_shape") + _overview_write_external_sidecar(path) + + base = open_geotiff(path) + assert base.shape == (_OVERVIEW_BASE_SIZE, _OVERVIEW_BASE_SIZE) + for i, factor in enumerate(_OVERVIEW_FACTORS, start=1): + da = open_geotiff(path, overview_level=i) + expected = _OVERVIEW_BASE_SIZE // factor + assert da.shape == (expected, expected), ( + f"sidecar overview_level={i} returned shape {da.shape}, " + f"expected ({expected}, {expected})" + ) + + +@pytest.mark.release_gate +@pytest.mark.parametrize("reader", ["eager", "dask"]) +def test_release_gate_internal_vs_sidecar_metadata_agree( + tmp_path, reader, +) -> None: + """Internal COG and external sidecar agree on the metadata contract.""" + cog_path = _overview_unique_tmp_path(tmp_path, f"parity_cog_{reader}") + sidecar_path = _overview_unique_tmp_path( + tmp_path, f"parity_sidecar_{reader}", + ) + _overview_write_internal_cog(cog_path) + _overview_write_external_sidecar(sidecar_path) + + if reader == 
"eager": + cog_levels = _overview_read_levels_eager(cog_path) + sidecar_levels = _overview_read_levels_eager(sidecar_path) + else: + cog_levels = _overview_read_levels_dask(cog_path) + sidecar_levels = _overview_read_levels_dask(sidecar_path) + + assert set(cog_levels) == set(sidecar_levels) + for lvl in cog_levels: + for key in _OVERVIEW_EQUAL_KEYS: + cog_attrs = cog_levels[lvl].attrs + sidecar_attrs = sidecar_levels[lvl].attrs + assert (key in cog_attrs) == (key in sidecar_attrs), ( + f"level {lvl}: attrs[{key!r}] presence differs between " + f"internal-COG and sidecar reads " + f"(cog={key in cog_attrs}, sidecar={key in sidecar_attrs})" + ) + if key in cog_attrs: + assert cog_attrs[key] == sidecar_attrs[key], ( + f"level {lvl}: attrs[{key!r}] differs between " + f"internal-COG and sidecar reads: " + f"cog={cog_attrs[key]!r}, " + f"sidecar={sidecar_attrs[key]!r}" + ) + assert cog_levels[lvl].attrs["transform"] == pytest.approx( + sidecar_levels[lvl].attrs["transform"] + ), ( + f"level {lvl}: transform differs between internal-COG and " + f"sidecar reads" + ) + + +# =========================================================================== # +# Section: Windowed reads -- shifted-transform parity (epic #2341) # +# =========================================================================== # +# +# Windowed reads must return shapes matching the requested window, +# coords that are a bit-exact slice of the unwindowed read, an +# ``attrs['transform']`` shifted by ``Affine.translation(col_off, +# row_off)`` exactly, and the canonical non-transform release attrs +# unchanged. 
+ +_WSP_HAS_TIFFFILE = importlib.util.find_spec("tifffile") is not None +_wsp_skip_no_tifffile = pytest.mark.skipif( + not _WSP_HAS_TIFFFILE, + reason="tifffile required for MinIsWhite fixture", +) + +_WSP_FULL_H = 256 +_WSP_FULL_W = 256 + +_WSP_WINDOWS = ( + pytest.param( + (32, 64, 96, 192), id="aligned-row32-col64-h64-w128", + ), + pytest.param( + (33, 65, 95, 191), id="chunk-misaligned-row33-col65-h62-w126", + ), +) + +_WSP_PIXEL_WIDTH = 30.0 +_WSP_PIXEL_HEIGHT = -25.0 +_WSP_ORIGIN_X = 500123.5 +_WSP_ORIGIN_Y = 4001987.25 + +_WSP_NON_TRANSFORM_ATTRS_VALUE_EQUAL = ( + "crs", + "crs_wkt", + "nodata", + "georef_status", + "raster_type", +) +_WSP_NON_TRANSFORM_ATTRS_STRUCTURAL = ("masked_nodata",) + + +def _wsp_build_da(arr: np.ndarray, nodata=None) -> xr.DataArray: + h, w = arr.shape + x_centers = ( + _WSP_ORIGIN_X + _WSP_PIXEL_WIDTH * 0.5 + + _WSP_PIXEL_WIDTH * np.arange(w) + ) + y_centers = ( + _WSP_ORIGIN_Y + _WSP_PIXEL_HEIGHT * 0.5 + + _WSP_PIXEL_HEIGHT * np.arange(h) + ) + attrs = {"crs": 32610} + if nodata is not None: + attrs["nodata"] = nodata + return xr.DataArray( + arr, + dims=("y", "x"), + coords={"y": y_centers.astype(np.float64), + "x": x_centers.astype(np.float64)}, + attrs=attrs, + ) + + +def _wsp_write_int16_with_nodata(path: Path) -> None: + rng = np.random.default_rng(2341) + arr = rng.integers( + -1000, 1000, size=(_WSP_FULL_H, _WSP_FULL_W), dtype=np.int16, + ) + arr[10, 10] = -9999 + arr[200, 5] = -9999 + da = _wsp_build_da(arr, nodata=-9999) + to_geotiff(da, str(path), compression="deflate", tiled=False) + + +def _wsp_write_float32_with_nan_nodata(path: Path) -> None: + rng = np.random.default_rng(2342) + arr = (rng.standard_normal((_WSP_FULL_H, _WSP_FULL_W)) * 100).astype( + np.float32, + ) + arr[40, 80] = np.nan + da = _wsp_build_da(arr, nodata=np.float32("nan")) + to_geotiff( + da, str(path), compression="deflate", tiled=True, tile_size=64, + ) + + +def _wsp_write_float32_no_nodata(path: Path) -> None: + rng = np.random.default_rng(2343) + 
arr = (rng.standard_normal((_WSP_FULL_H, _WSP_FULL_W)) * 50).astype( + np.float32, + ) + da = _wsp_build_da(arr, nodata=None) + to_geotiff(da, str(path), compression="none", tiled=False) + + +def _wsp_write_uint8_miniswhite(path: Path) -> None: + """Write a MinIsWhite (photometric=0) uint8 stripped TIFF via tifffile.""" + import tifffile # local import: gated by _wsp_skip_no_tifffile + rng = np.random.default_rng(2344) + arr = rng.integers(0, 256, size=(_WSP_FULL_H, _WSP_FULL_W), dtype=np.uint8) + tifffile.imwrite( + str(path), arr, photometric="miniswhite", + compression="none", metadata=None, + ) + + +_WSP_CORPUS = ( + pytest.param( + _wsp_write_int16_with_nodata, id="int16-deflate-stripped-nodata", + ), + pytest.param( + _wsp_write_float32_with_nan_nodata, + id="float32-deflate-tiled-nan-nodata", + ), + pytest.param( + _wsp_write_float32_no_nodata, id="float32-none-stripped-no-nodata", + ), + pytest.param( + _wsp_write_uint8_miniswhite, + id="uint8-miniswhite-stripped", + marks=_wsp_skip_no_tifffile, + ), +) + + +@pytest.fixture +def _wsp_corpus_file(tmp_path, request): + """Write a single fixture file and return its on-disk path.""" + builder = request.param + tag = uuid.uuid4().hex[:8] + path = tmp_path / f"release_gate_wsp_{builder.__name__[1:]}_{tag}.tif" + builder(path) + return path + + +def _wsp_expected_window_transform(t_full, col_off, row_off): + a, b, c, d, e, f = (float(x) for x in t_full) + return ( + a, + b, + c + a * col_off + b * row_off, + d, + e, + f + d * col_off + e * row_off, + ) + + +def _wsp_open_eager(path, *, window=None): + return open_geotiff(str(path), window=window) + + +def _wsp_open_dask(path, *, window=None): + return read_geotiff_dask(str(path), window=window, chunks=32) + + +_WSP_READERS = ( + pytest.param(_wsp_open_eager, id="eager"), + pytest.param(_wsp_open_dask, id="dask"), +) + + +def _wsp_assert_shape(out, *, expected_h, expected_w): + assert out.shape == (expected_h, expected_w), ( + f"release gate: windowed read shape 
{out.shape} does not equal " + f"the requested window shape {(expected_h, expected_w)}; a window " + f"that returns the wrong shape is silently wrong, not noisily wrong" + ) + + +def _wsp_assert_coords_slice( + windowed, full, *, row_off, col_off, height, width, +): + full_y = np.asarray(full.coords["y"].values) + full_x = np.asarray(full.coords["x"].values) + win_y = np.asarray(windowed.coords["y"].values) + win_x = np.asarray(windowed.coords["x"].values) + np.testing.assert_array_equal( + win_y, + full_y[row_off:row_off + height], + err_msg=( + "release gate: windowed read y-coords are not a slice of the " + "unwindowed read's y-coords; downstream callers that join on " + "y will silently mismatch" + ), + ) + np.testing.assert_array_equal( + win_x, + full_x[col_off:col_off + width], + err_msg=( + "release gate: windowed read x-coords are not a slice of the " + "unwindowed read's x-coords" + ), + ) + + +def _wsp_assert_transform_shifted(windowed, full, *, col_off, row_off): + if "transform" not in full.attrs: + assert "transform" not in windowed.attrs, ( + f"release gate: source has no georef and the unwindowed read " + f"emits no ``transform``, but the windowed read invented one: " + f"{windowed.attrs.get('transform')!r}" + ) + return + t_full = tuple(full.attrs["transform"]) + assert "transform" in windowed.attrs, ( + f"release gate: unwindowed read carries ``transform`` " + f"({t_full!r}) but the windowed read dropped it" + ) + t_win = tuple(windowed.attrs["transform"]) + assert len(t_full) == 6, ( + f"release gate: full-read transform is not a 6-tuple: {t_full!r}" + ) + assert len(t_win) == 6, ( + f"release gate: windowed-read transform is not a 6-tuple: {t_win!r}" + ) + expected = _wsp_expected_window_transform(t_full, col_off, row_off) + assert t_win == expected, ( + f"release gate: windowed transform does not equal " + f"T_full * Affine.translation(col_off={col_off}, " + f"row_off={row_off})\n" + f" T_full = {t_full!r}\n" + f" T_window = {t_win!r}\n" + f" 
expected = {expected!r}" + ) + + +def _wsp_assert_canonical_attrs_unchanged(windowed, full): + for key in _WSP_NON_TRANSFORM_ATTRS_VALUE_EQUAL: + if key not in full.attrs: + assert key not in windowed.attrs, ( + f"release gate: windowed read introduced attrs[{key!r}] " + f"that the unwindowed read does not have" + ) + continue + assert key in windowed.attrs, ( + f"release gate: windowed read dropped attrs[{key!r}] that the " + f"unwindowed read carries" + ) + full_val = full.attrs[key] + win_val = windowed.attrs[key] + try: + full_is_nan = bool(np.isnan(full_val)) + except (TypeError, ValueError): + full_is_nan = False + if full_is_nan: + try: + win_is_nan = bool(np.isnan(win_val)) + except (TypeError, ValueError): + win_is_nan = False + assert win_is_nan, ( + f"release gate: NaN-valued attrs[{key!r}] did not survive " + f"the windowed read: full={full_val!r} window={win_val!r}" + ) + else: + assert win_val == full_val, ( + f"release gate: windowed read changed attrs[{key!r}]: " + f"full={full_val!r} window={win_val!r}" + ) + for key in _WSP_NON_TRANSFORM_ATTRS_STRUCTURAL: + full_present = key in full.attrs + win_present = key in windowed.attrs + assert full_present == win_present, ( + f"release gate: windowed read changed presence of " + f"attrs[{key!r}]: full_has={full_present} " + f"window_has={win_present}" + ) + if full_present: + assert isinstance(windowed.attrs[key], bool), ( + f"release gate: attrs[{key!r}] is not a bool on the " + f"windowed read: {windowed.attrs[key]!r}" + ) + + +@pytest.mark.release_gate +@pytest.mark.parametrize("window", _WSP_WINDOWS) +@pytest.mark.parametrize("_wsp_corpus_file", _WSP_CORPUS, indirect=True) +@pytest.mark.parametrize("reader", _WSP_READERS) +def test_release_gate_windowed_read_shape(_wsp_corpus_file, reader, window): + """The returned shape equals the window's ``(height, width)``.""" + row_off, col_off, row_stop, col_stop = window + out = reader(_wsp_corpus_file, window=window) + _wsp_assert_shape( + out, + 
expected_h=row_stop - row_off, + expected_w=col_stop - col_off, + ) + + +@pytest.mark.release_gate +@pytest.mark.parametrize("window", _WSP_WINDOWS) +@pytest.mark.parametrize("_wsp_corpus_file", _WSP_CORPUS, indirect=True) +@pytest.mark.parametrize("reader", _WSP_READERS) +def test_release_gate_windowed_read_coords_slice( + _wsp_corpus_file, reader, window, +): + """``coords['y'/'x']`` equals the matching slice of the full coords.""" + row_off, col_off, row_stop, col_stop = window + full = reader(_wsp_corpus_file) + out = reader(_wsp_corpus_file, window=window) + _wsp_assert_coords_slice( + out, full, + row_off=row_off, col_off=col_off, + height=row_stop - row_off, width=col_stop - col_off, + ) + + +@pytest.mark.release_gate +@pytest.mark.parametrize("window", _WSP_WINDOWS) +@pytest.mark.parametrize("_wsp_corpus_file", _WSP_CORPUS, indirect=True) +@pytest.mark.parametrize("reader", _WSP_READERS) +def test_release_gate_windowed_read_transform_shifted( + _wsp_corpus_file, reader, window, +): + """``attrs['transform']`` equals ``T_full * translation(col, row)``.""" + row_off, col_off, _row_stop, _col_stop = window + full = reader(_wsp_corpus_file) + out = reader(_wsp_corpus_file, window=window) + _wsp_assert_transform_shifted( + out, full, col_off=col_off, row_off=row_off, + ) + + +@pytest.mark.release_gate +@pytest.mark.parametrize("window", _WSP_WINDOWS) +@pytest.mark.parametrize("_wsp_corpus_file", _WSP_CORPUS, indirect=True) +@pytest.mark.parametrize("reader", _WSP_READERS) +def test_release_gate_windowed_read_canonical_attrs_unchanged( + _wsp_corpus_file, reader, window, +): + """The non-transform canonical attrs match the unwindowed read.""" + full = reader(_wsp_corpus_file) + out = reader(_wsp_corpus_file, window=window) + _wsp_assert_canonical_attrs_unchanged(out, full) + + +# =========================================================================== # +# Section: Negative cases -- ambiguous metadata fails closed (epic #2341) # +# 
=========================================================================== # +# +# When metadata is ambiguous and the caller did NOT opt in via the +# documented flag, every promised read entry point raises a typed error +# whose message names the unlocking flag and points at the release-contract +# docs. + +_NEG_RELEASE_CONTRACT_HINTS = ( + "release_gate_geotiff", + "geotiff_release_contract", + "#2341", + "#1987", + "#2342", + "release contract", +) + + +def _neg_msg_cites_release_contract(msg: str) -> bool: + return any(hint in msg for hint in _NEG_RELEASE_CONTRACT_HINTS) + + +def _neg_tmp(tmp_path, label: str, *, suffix: str = ".tif") -> str: + return str( + tmp_path / f"release_gate_neg_{label}_{uuid.uuid4().hex}{suffix}" + ) + + +# Hand-rolled rotated TIFF builder (case 3). The writer refuses rotated +# transforms at the boundary, so a round-trip through xrspatial cannot +# reproduce one. The 30-degree rotation matches +# ``test_allow_rotated_geotiff_2115.py`` so the gate rejects the same +# input shape that test pins behaviourally. 
+ +_NEG_TAG_MODEL_TRANSFORMATION = 34264 +_NEG_COS30 = 0.8660254037844387 +_NEG_SIN30 = 0.5 +_NEG_ROTATED_M = ( + 10.0 * _NEG_COS30, -10.0 * _NEG_SIN30, 0.0, 100.0, + 10.0 * _NEG_SIN30, 10.0 * _NEG_COS30, 0.0, 200.0, + 0.0, 0.0, 1.0, 0.0, + 0.0, 0.0, 0.0, 1.0, +) + + +def _neg_write_rotated_tiff(path: str, arr: np.ndarray) -> None: + """Emit a minimal little-endian TIFF with a rotated ModelTransformationTag.""" + h, w = arr.shape + arr = np.ascontiguousarray(arr.astype(' None: + """Emit a 2x2 uint16 TIFF whose GDAL_NODATA tag holds ``nodata_str``.""" + bo = '<' + width, height = 2, 2 + pixels = np.array([[10, 20], [30, 40]], dtype=np.uint16) + + nodata_bytes = nodata_str.encode('ascii') + b'\x00' + + tag_list: list[tuple[int, int, int, bytes]] = [] + + def add_short(tag: int, val: int) -> None: + tag_list.append((tag, 3, 1, struct.pack(f'{bo}H', val))) + + def add_long(tag: int, val: int) -> None: + tag_list.append((tag, 4, 1, struct.pack(f'{bo}I', val))) + + def add_ascii(tag: int, data: bytes) -> None: + tag_list.append((tag, 2, len(data), data)) + + add_short(256, width) + add_short(257, height) + add_short(258, 16) + add_short(259, 1) + add_short(262, 1) + add_short(277, 1) + add_short(278, height) + add_short(339, 1) + add_long(273, 0) + add_long(279, len(pixels.tobytes())) + add_ascii(42113, nodata_bytes) # GDAL_NODATA + + tag_list.sort(key=lambda t: t[0]) + + header_size = 8 + num_entries = len(tag_list) + ifd_size = 2 + 12 * num_entries + 4 + ifd_off = header_size + + overflow = bytearray() + overflow_start = header_size + ifd_size + + overflow_offsets: dict[int, int | None] = {} + for tag, _typ, _count, raw in tag_list: + if len(raw) > 4: + overflow_offsets[tag] = len(overflow) + overflow.extend(raw) + if len(overflow) % 2: + overflow.append(0) + else: + overflow_offsets[tag] = None + + pixel_start = overflow_start + len(overflow) + + patched: list[tuple[int, int, int, bytes]] = [] + for tag, typ, count, raw in tag_list: + if tag == 273: + patched.append( 
+ (tag, typ, count, struct.pack(f'{bo}I', pixel_start)) + ) + else: + patched.append((tag, typ, count, raw)) + tag_list = patched + + out = bytearray() + out.extend(b'II') + out.extend(struct.pack(f'{bo}H', 42)) + out.extend(struct.pack(f'{bo}I', ifd_off)) + out.extend(struct.pack(f'{bo}H', num_entries)) + for tag, typ, count, raw in tag_list: + out.extend(struct.pack(f'{bo}HHI', tag, typ, count)) + if len(raw) <= 4: + out.extend(raw.ljust(4, b'\x00')) + else: + ptr = overflow_start + overflow_offsets[tag] + out.extend(struct.pack(f'{bo}I', ptr)) + out.extend(struct.pack(f'{bo}I', 0)) + out.extend(overflow) + out.extend(pixels.tobytes()) + + with open(path, 'wb') as f: + f.write(out) + + +@pytest.mark.release_gate +@pytest.mark.xfail( + reason=( + "xrspatial.geotiff does not yet read ``.aux.xml`` PAM sidecars " + "(no entry in SUPPORTED_FEATURES). When that support lands, the " + "reader must fail closed on a CRS conflict between the header and " + "the sidecar; this xfail flips to a pass at that point. Tracked " + "alongside the PAM sidecar epic." 
+ ), + strict=False, +) +def test_release_gate_negative_conflicting_aux_xml_crs(tmp_path) -> None: + """The reader must not silently choose between header CRS and sidecar CRS.""" + path = _neg_tmp(tmp_path, "case1_aux_xml_crs") + pixels = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32) + write( + pixels, + path, + geo_transform=GeoTransform( + origin_x=0.0, origin_y=0.0, + pixel_width=1.0, pixel_height=-1.0, + ), + crs_epsg=4326, + compression="none", + tiled=False, + ) + sidecar = Path(path + ".aux.xml") + sidecar.write_text( + '\n' + '\n' + ' EPSG:3857\n' + '\n', + encoding="utf-8", + ) + with pytest.raises(GeoTIFFAmbiguousMetadataError) as excinfo: + open_geotiff(path) + msg = str(excinfo.value) + assert "aux.xml" in msg or "sidecar" in msg or "PAM" in msg, ( + f"expected the error message to name the .aux.xml / PAM sidecar; " + f"got: {msg!r}" + ) + assert _neg_msg_cites_release_contract(msg), ( + f"expected the error message to cite the release-contract docs " + f"or the tracking issue; got: {msg!r}" + ) + + +@pytest.mark.release_gate +@pytest.mark.xfail( + reason=( + "Issue #1774 currently treats a non-finite or fractional integer " + "nodata sentinel as a silent no-op rather than a hard error. The " + "release promise is to upgrade the no-op to a typed rejection so " + "the caller sees the silent-coercion risk; this xfail flips to a " + "pass when the upgrade lands." 
+ ), + strict=False, +) +def test_release_gate_negative_integer_nodata_float_promoted( + tmp_path, +) -> None: + """The reader must not silently coerce a non-finite int-file nodata sentinel.""" + path = _neg_tmp(tmp_path, "case2_int_nodata_float_promoted") + _neg_build_uint16_tiff_with_nodata("nan", path) + with pytest.raises(GeoTIFFAmbiguousMetadataError) as excinfo: + open_geotiff(path) + msg = str(excinfo.value) + assert "nodata" in msg.lower(), ( + f"expected the error message to name nodata; got: {msg!r}" + ) + assert _neg_msg_cites_release_contract(msg), ( + f"expected the error message to cite the release-contract docs " + f"or the tracking issue; got: {msg!r}" + ) + + +_NEG_ROTATED_PIXELS = np.arange(20, dtype=' str: + """A throwaway rotated GeoTIFF that the rotated case's sub-tests share.""" + path = _neg_tmp(tmp_path, "case3_rotated") + _neg_write_rotated_tiff(path, _NEG_ROTATED_PIXELS) + return path + + +def _neg_assert_rotated_message(msg: str) -> None: + """Shared assertions on the rotated error message.""" + assert "allow_rotated" in msg, ( + f"expected the error message to name the ``allow_rotated`` " + f"opt-in; got: {msg!r}" + ) + assert any( + tier in msg for tier in ("advanced", "experimental", "stable") + ), ( + f"expected the error message to name the feature tier; " + f"got: {msg!r}" + ) + assert _neg_msg_cites_release_contract(msg), ( + f"expected the error message to cite the release-contract docs " + f"or the tracking issue; got: {msg!r}" + ) + + +@pytest.mark.release_gate +def test_release_gate_negative_rotated_eager( + _neg_rotated_geotiff_path, +) -> None: + """Eager numpy path raises ``RotatedTransformError`` without the opt-in.""" + with pytest.raises(RotatedTransformError) as excinfo: + open_geotiff(_neg_rotated_geotiff_path) + _neg_assert_rotated_message(str(excinfo.value)) + + +@pytest.mark.release_gate +def test_release_gate_negative_rotated_dask( + _neg_rotated_geotiff_path, +) -> None: + """Dask path raises the same typed error, 
uniformly with the eager path.""" + with pytest.raises(RotatedTransformError) as excinfo: + read_geotiff_dask(_neg_rotated_geotiff_path, chunks=2) + _neg_assert_rotated_message(str(excinfo.value)) + + +@pytest.mark.release_gate +def test_release_gate_negative_rotated_windowed( + _neg_rotated_geotiff_path, +) -> None: + """Windowed read raises the same typed error before pixel decode.""" + with pytest.raises(RotatedTransformError) as excinfo: + open_geotiff(_neg_rotated_geotiff_path, window=(0, 0, 2, 2)) + _neg_assert_rotated_message(str(excinfo.value)) + + +@pytest.mark.release_gate +@requires_gpu +def test_release_gate_negative_rotated_gpu( + _neg_rotated_geotiff_path, +) -> None: + """GPU read raises the same typed error as the CPU paths. + + The rotated-transform refusal is upstream of the GPU decode path -- + the validator fires on the header read, before any pixel buffer + reaches the GPU -- so the same typed error surfaces here regardless + of the GPU tier. + """ + with pytest.raises(RotatedTransformError) as excinfo: + open_geotiff(_neg_rotated_geotiff_path, gpu=True) + _neg_assert_rotated_message(str(excinfo.value)) + + +@pytest.mark.release_gate +@pytest.mark.xfail( + reason=( + "The VRT stable-only knob is owned by epic #2342 and has not " + "landed yet. The release promise: when the caller asks for " + "stable-only sources and a VRT child uses an experimental codec, " + "the reader names the offending child and the opt-in flag. This " + "xfail flips to a pass when #2342 ships the knob." + ), + strict=False, +) +def test_release_gate_negative_mixed_tier_vrt_children(tmp_path) -> None: + """The reader must refuse mixed-tier VRT children when stable-only is asked. + + XFAIL-to-PASS transition note + ----------------------------- + Today this test fails with ``TypeError: unexpected keyword argument + 'stable_only'`` because epic #2342 has not landed the kwarg yet. The + strict=False xfail swallows that TypeError. 
When #2342 lands, the + test will start raising :class:`GeoTIFFAmbiguousMetadataError` (or + fail to raise) and the xfail will report XPASS. Before removing the + xfail marker, confirm the new code path satisfies both inline + assertions: the error message must mention either ``stable_only`` or + ``allow_experimental_codecs``, and it must cite the release contract + docs. If either assertion would not pass, fix the production message + in the same PR that removes the xfail. + """ + path = _neg_tmp(tmp_path, "case4_mixed_tier_vrt", suffix=".vrt") + Path(path).write_text( + '\n', + encoding="utf-8", + ) + with pytest.raises(GeoTIFFAmbiguousMetadataError) as excinfo: + open_geotiff(path, stable_only=True) # type: ignore[call-arg] + msg = str(excinfo.value) + assert "stable_only" in msg or "allow_experimental_codecs" in msg, ( + f"expected the error message to name the unlocking opt-in; " + f"got: {msg!r}" + ) + assert _neg_msg_cites_release_contract(msg), ( + f"expected the error message to cite the release-contract docs " + f"or the tracking issue; got: {msg!r}" + ) + + +# =========================================================================== # +# Section: Cross-cutting meta-gates (#2321) # +# =========================================================================== # +# +# Checklist-parity gates: every file cited in the release-gate checklist +# exists on disk, every promised SUPPORTED_FEATURES key is mentioned in +# the prose, the HTTP SSRF presence gate stays loud, and every cited VRT +# test file actually carries test functions. + +_META_HERE = Path(__file__).resolve() +_META_REPO_ROOT = _META_HERE.parents[4] +_META_CHECKLIST = ( + _META_REPO_ROOT + / "docs" / "source" / "reference" / "release_gate_geotiff.rst" +) + +# Match xrspatial/geotiff/tests/.py inside the checklist body. The +# leaf may include slashes (e.g. ``release_gates/test_stable_features.py``). 
+_META_TEST_PATH_RE = re.compile( + r"`{0,2}(xrspatial/geotiff/tests/[\w/]+\.py)`{0,2}", +) + +# The new home of this file. The cited-test parser drops the self-reference +# so the gate cannot succeed only because it cites itself. +_META_SELF_REF = ( + "xrspatial/geotiff/tests/release_gates/test_stable_features.py" +) + + +def _meta_checklist_text() -> str: + assert _META_CHECKLIST.is_file(), ( + f"release gate checklist missing: {_META_CHECKLIST}; the " + "checklist must ship with the geotiff docs so release notes can " + "cite it" + ) + return _META_CHECKLIST.read_text(encoding="utf-8") + + +def _meta_cited_test_files() -> set[str]: + text = _meta_checklist_text() + matches = set(_META_TEST_PATH_RE.findall(text)) + matches.discard(_META_SELF_REF) + return matches + + +@pytest.mark.release_gate +def test_release_gate_cites_only_existing_test_files() -> None: + """Every test file cited in the release gate checklist exists on disk.""" + cited = _meta_cited_test_files() + assert cited, ( + "release gate checklist cites zero test files; either the regex " + "is wrong or the checklist is empty" + ) + missing = sorted(p for p in cited if not (_META_REPO_ROOT / p).is_file()) + assert not missing, ( + "release gate checklist cites tests that do not exist on disk; " + "rename the checklist row to a real file or restore the test: " + f"{missing}" + ) + # Tighten: every cited path must point at a ``test_*.py`` file, not at + # ``conftest.py`` or a helper module. 
+ non_test = sorted( + p for p in cited if not Path(p).name.startswith("test_") + ) + assert not non_test, ( + "release gate checklist cites paths that do not start with " + "``test_``; the checklist should point at regression tests, not " + f"conftest or helper modules: {non_test}" + ) + + +_META_PROMISED_TIERS = {"stable", "advanced"} + + +def _meta_checklist_mentions(text: str, key: str) -> bool: + """``key`` is something like ``reader.local_file``.""" + if key in text: + return True + return f"SUPPORTED_FEATURES['{key}']" in text + + +@pytest.mark.release_gate +def test_release_gate_lists_every_promised_supported_feature() -> None: + """Every promised SUPPORTED_FEATURES entry appears in the checklist.""" + text = _meta_checklist_text() + missing = [] + for key, tier in SUPPORTED_FEATURES.items(): + if tier not in _META_PROMISED_TIERS: + continue + if key.startswith("codec."): + continue + if not _meta_checklist_mentions(text, key): + missing.append((key, tier)) + assert not missing, ( + "promised SUPPORTED_FEATURES entries are missing from the release " + "gate checklist; add a row (or update SUPPORTED_FEATURES) so the " + "release notes can quote the tier: " + f"{missing}" + ) + + +@pytest.mark.release_gate +def test_release_gate_http_ssrf_rejects_loopback() -> None: + """HTTP URLs targeting loopback hosts raise :class:`UnsafeURLError`.""" + with pytest.raises(UnsafeURLError): + # No network call -- the SSRF check rejects before the socket + # opens, so this test is offline-safe. + open_geotiff("http://127.0.0.1/does-not-matter.tif") + + +@pytest.mark.release_gate +@pytest.mark.xfail( + reason=( + "Locks in once sub-PR 5 of #2321 (PR #2326) lands. Until then, " + "uppercase HTTP slips past the SSRF check and falls through to " + "fsspec, which raises a generic ValueError. Once #2326 is merged, " + "remove this xfail marker so the release gate enforces the " + "promise." 
+ ), + strict=False, + raises=(ValueError, UnsafeURLError), +) +def test_release_gate_http_ssrf_rejects_loopback_uppercase_scheme() -> None: + """Uppercase HTTP scheme must take the same SSRF path.""" + with pytest.raises(UnsafeURLError): + open_geotiff("HTTP://127.0.0.1/does-not-matter.tif") + + +def _meta_vrt_section_test_files() -> set[str]: + """Return the test files cited inside the VRT supported subset section.""" + text = _meta_checklist_text() + start = text.find("VRT supported subset") + assert start != -1, "checklist is missing the VRT supported subset section" + end = text.find("Sidecar and overview interactions", start) + if end == -1: + end = len(text) + section = text[start:end] + return set(_META_TEST_PATH_RE.findall(section)) + + +@pytest.mark.release_gate +def test_release_gate_vrt_rows_point_at_real_test_functions() -> None: + """Every VRT row cites a file that contains at least one ``def test_``.""" + files = _meta_vrt_section_test_files() + assert files, "no VRT test files cited in the checklist" + empty = [] + for rel in sorted(files): + if rel == _META_SELF_REF: + continue + path = _META_REPO_ROOT / rel + if not path.is_file(): + # Caught by the cited-test-files gate; skip here. + continue + body = path.read_text(encoding="utf-8") + if "def test_" not in body: + empty.append(rel) + assert not empty, ( + "VRT checklist rows cite files with no test functions; either the " + "file was emptied or the row should be removed: " + f"{empty}" + ) diff --git a/xrspatial/geotiff/tests/test_release_contract_parity_2389.py b/xrspatial/geotiff/tests/test_release_contract_parity_2389.py index daef10bd..be79826c 100644 --- a/xrspatial/geotiff/tests/test_release_contract_parity_2389.py +++ b/xrspatial/geotiff/tests/test_release_contract_parity_2389.py @@ -8,11 +8,12 @@ The tier strings here match the strings in ``xrspatial.geotiff.SUPPORTED_FEATURES`` at runtime. -Before this test, nothing in CI checked that claim. 
The sibling -``test_release_gate_2321.py`` parses ``release_gate_geotiff.rst``, not -this ``.md`` contract page, so the contract could (and did) silently -drift the next time a key was re-tiered in ``_attrs.py`` -- -twice in two releases (#2381 and #2389). +Before this test, nothing in CI checked that claim. The sibling release +gate registry (``release_gates/test_stable_features.py``, +``Cross-cutting meta-gates`` section) parses +``release_gate_geotiff.rst``, not this ``.md`` contract page, so the +contract could (and did) silently drift the next time a key was +re-tiered in ``_attrs.py`` -- twice in two releases (#2381 and #2389). What this test pins ------------------- @@ -29,7 +30,7 @@ those are human-readable labels, not runtime tier strings. * Locking the contract page against ``release_gate_geotiff.rst`` -- the gate page only enumerates ``stable`` and ``advanced`` tiers - (``test_release_gate_2321.py`` already covers that side). + (``release_gates/test_stable_features.py`` already covers that side). """ from __future__ import annotations diff --git a/xrspatial/geotiff/tests/test_release_gate_2321.py b/xrspatial/geotiff/tests/test_release_gate_2321.py deleted file mode 100644 index cf231b16..00000000 --- a/xrspatial/geotiff/tests/test_release_gate_2321.py +++ /dev/null @@ -1,221 +0,0 @@ -"""Release gate / audit checklist parity tests (issue #2321 sub-task 6). - -Background ----------- -``docs/source/reference/release_gate_geotiff.rst`` enumerates every -feature the GeoTIFF module promises in release notes, along with the -regression test that locks each behaviour. Two things can break the -checklist's audit value: - -1. A cited regression test file is renamed or removed and the checklist - silently points at nothing. -2. A new tier key shows up in ``SUPPORTED_FEATURES`` and the checklist - forgets to add a row for it. - -These tests pin both. 
They are intentionally light -- they parse the -``.rst`` source and the ``SUPPORTED_FEATURES`` dict, then cross-check. - -What this test pins -------------------- -* Every test file cited in the release gate checklist exists on disk. -* Every key in :data:`xrspatial.geotiff.SUPPORTED_FEATURES` whose tier - is ``stable`` or ``advanced`` is named at least once in the checklist - prose, so a new public tier cannot land without a checklist row. -* The HTTP SSRF presence gate (the checklist's cross-cutting row that - has no other home today) is locked here: an HTTP URL pointing at a - loopback host raises :class:`UnsafeURLError` from ``open_geotiff``. -* The VRT presence gate: every test file cited in the "VRT supported - subset" section of the checklist contains at least one ``def test_`` - function, so the row is not pointing at an empty file. -""" -from __future__ import annotations - -import re -from pathlib import Path - -import pytest - -from xrspatial.geotiff import SUPPORTED_FEATURES, UnsafeURLError, open_geotiff - -# --------------------------------------------------------------------------- # -# Locate the checklist. # -# --------------------------------------------------------------------------- # - -_HERE = Path(__file__).resolve() -_REPO_ROOT = _HERE.parents[3] -_CHECKLIST = ( - _REPO_ROOT / "docs" / "source" / "reference" / "release_gate_geotiff.rst" -) - -# Match xrspatial/geotiff/tests/.py inside the checklist body. -_TEST_PATH_RE = re.compile( - r"`{0,2}(xrspatial/geotiff/tests/[\w/]+\.py)`{0,2}", -) - - -def _checklist_text() -> str: - assert _CHECKLIST.is_file(), ( - f"release gate checklist missing: {_CHECKLIST}; the checklist must " - "ship with the geotiff docs so release notes can cite it" - ) - return _CHECKLIST.read_text(encoding="utf-8") - - -# --------------------------------------------------------------------------- # -# Gate 1: cited test files exist. 
# -# --------------------------------------------------------------------------- # - - -def _cited_test_files() -> set[str]: - text = _checklist_text() - # Drop the self-reference so the gate cannot succeed only because it - # cites itself. - self_ref = "xrspatial/geotiff/tests/test_release_gate_2321.py" - matches = set(_TEST_PATH_RE.findall(text)) - matches.discard(self_ref) - return matches - - -def test_release_gate_cites_only_existing_test_files() -> None: - cited = _cited_test_files() - assert cited, ( - "release gate checklist cites zero test files; either the regex " - "is wrong or the checklist is empty" - ) - missing = sorted(p for p in cited if not (_REPO_ROOT / p).is_file()) - assert not missing, ( - "release gate checklist cites tests that do not exist on disk; " - "rename the checklist row to a real file or restore the test: " - f"{missing}" - ) - # Tighten: every cited path must point at a `test_*.py` file, not at - # ``conftest.py`` or a helper module. The leaf-prefix check catches - # typos like ``conftests.py`` and accidental citations of non-test - # support files even though they happen to exist on disk. - non_test = sorted(p for p in cited if not Path(p).name.startswith("test_")) - assert not non_test, ( - "release gate checklist cites paths that do not start with " - "``test_``; the checklist should point at regression tests, not " - f"conftest or helper modules: {non_test}" - ) - - -# --------------------------------------------------------------------------- # -# Gate 2: every public tier key appears in the checklist. # -# --------------------------------------------------------------------------- # - -# Tiers that release notes are allowed to make promises about. ``stable`` -# and ``advanced`` features must show up in the checklist so a reader can -# tell what the release covers. 
``experimental`` and ``internal_only`` -# are deliberately excluded -- the checklist's prose tags them as -# not-promised, so a missing row for those tiers is not a release gate -# failure. Codec keys are handled together as a group in the -# local-read/write section, so the gate excludes them from the -# per-key enumeration. -_PROMISED_TIERS = {"stable", "advanced"} - - -def _checklist_mentions(text: str, key: str) -> bool: - """``key`` is something like ``reader.local_file``. Match either the - bare key or the key as a ``SUPPORTED_FEATURES['key']`` lookup.""" - if key in text: - return True - return f"SUPPORTED_FEATURES['{key}']" in text - - -def test_release_gate_lists_every_promised_supported_feature() -> None: - text = _checklist_text() - missing = [] - for key, tier in SUPPORTED_FEATURES.items(): - if tier not in _PROMISED_TIERS: - continue - if key.startswith("codec."): - # Codecs are grouped, not enumerated per-row. - continue - if not _checklist_mentions(text, key): - missing.append((key, tier)) - assert not missing, ( - "promised SUPPORTED_FEATURES entries are missing from the release " - "gate checklist; add a row (or update SUPPORTED_FEATURES) so the " - "release notes can quote the tier: " - f"{missing}" - ) - - -# --------------------------------------------------------------------------- # -# Gate 3: HTTP SSRF presence gate. # -# --------------------------------------------------------------------------- # - - -def test_release_gate_http_ssrf_rejects_loopback() -> None: - """The checklist promises that HTTP URLs targeting loopback hosts - raise :class:`UnsafeURLError`. Lock that promise here so the row in - the checklist always points at a passing test rather than at prose. - """ - with pytest.raises(UnsafeURLError): - # No network call -- the SSRF check rejects before the socket - # opens, so this test is offline-safe. 
- open_geotiff("http://127.0.0.1/does-not-matter.tif") - - -@pytest.mark.xfail( - reason=( - "Locks in once sub-PR 5 of #2321 (PR #2326) lands. Until then, " - "uppercase HTTP slips past the SSRF check and falls through to " - "fsspec, which raises a generic ValueError. Once #2326 is merged, " - "remove this xfail marker so the release gate enforces the promise." - ), - strict=False, - # Narrow to the two known shapes today (fsspec ValueError) and the - # post-#2326 shape (UnsafeURLError). A future regression that raises - # anything else (RuntimeError, OSError from a real socket dial, etc.) - # should NOT silently xfail -- it should fail loudly. - raises=(ValueError, UnsafeURLError), -) -def test_release_gate_http_ssrf_rejects_loopback_uppercase_scheme() -> None: - """Uppercase scheme (sub-PR 5 of #2321) must take the same SSRF - path. If this test ever skips silently or routes through fsspec, - the checklist's HTTP row is lying. - """ - with pytest.raises(UnsafeURLError): - open_geotiff("HTTP://127.0.0.1/does-not-matter.tif") - - -# --------------------------------------------------------------------------- # -# Gate 4: VRT rows point at non-empty test files. 
# -# --------------------------------------------------------------------------- # - - -def _vrt_section_test_files() -> set[str]: - """Return the test files cited inside the "VRT supported subset" - section of the checklist.""" - text = _checklist_text() - start = text.find("VRT supported subset") - assert start != -1, "checklist is missing the VRT supported subset section" - end = text.find("Sidecar and overview interactions", start) - if end == -1: - end = len(text) - section = text[start:end] - return set(_TEST_PATH_RE.findall(section)) - - -def test_release_gate_vrt_rows_point_at_real_test_functions() -> None: - files = _vrt_section_test_files() - assert files, "no VRT test files cited in the checklist" - self_ref = "xrspatial/geotiff/tests/test_release_gate_2321.py" - empty = [] - for rel in sorted(files): - if rel == self_ref: - continue - path = _REPO_ROOT / rel - if not path.is_file(): - # Caught by gate 1; skip here. - continue - body = path.read_text(encoding="utf-8") - if "def test_" not in body: - empty.append(rel) - assert not empty, ( - "VRT checklist rows cite files with no test functions; either the " - "file was emptied or the row should be removed: " - f"{empty}" - ) diff --git a/xrspatial/geotiff/tests/test_release_gate_attrs_contract.py b/xrspatial/geotiff/tests/test_release_gate_attrs_contract.py deleted file mode 100644 index c997ff9a..00000000 --- a/xrspatial/geotiff/tests/test_release_gate_attrs_contract.py +++ /dev/null @@ -1,154 +0,0 @@ -"""Release gate: CRS / transform / nodata attrs contract (epic #2340). - -The canonical attrs after a GeoTIFF read are tagged ``stable`` in the -release gate checklist. The contract: every georeferenced read produces -a DataArray whose ``attrs`` carry, at minimum, ``crs``, ``crs_wkt``, -``transform``, ``georef_status``, the contract version stamp, and (when -declared) ``nodata``. These attrs survive a write -> read round trip. - -This file is the single-shot release gate. 
Deep canonicalisation, -alias handling, contract version bumps, and pass-through semantics are -each covered by their own ``test_attrs_contract_*_1984.py`` files; here -we lock the user-facing names and round-trip stability so the release -notes can quote the canonical attrs without caveats. - -Out of scope: -* Alias handling (``test_attrs_contract_aliases_1984.py``). -* Attrs pass-through for user-supplied keys - (``test_attrs_contract_passthrough_1984.py``). -* Contract version stamp bump policy - (``test_attrs_contract_version_1984.py``). -""" -from __future__ import annotations - -import numpy as np -import pytest - -from xrspatial.geotiff import open_geotiff, to_geotiff -from xrspatial.geotiff._geotags import GeoTransform -from xrspatial.geotiff._writer import write - - -# Keys that release notes are allowed to promise on every georeferenced -# read. Adding a new key to the canonical set is a contract-version -# bump (see issue #1984); removing one is a breaking change. Anything -# else in the attrs (``masked_nodata``, ``nodata_pixels_present``, -# ``raster_type``, etc.) is additive and not pinned here. 
-CANONICAL_KEYS = ( - "_xrspatial_geotiff_contract", - "crs", - "crs_wkt", - "transform", - "georef_status", -) - - -def _write_known_good(path: str, *, nodata: float | None = None) -> None: - arr = np.arange(16, dtype=np.float32).reshape(4, 4) - gt = GeoTransform( - origin_x=500000.0, - origin_y=4000000.0, - pixel_width=30.0, - pixel_height=-30.0, - ) - write( - arr, - path, - geo_transform=gt, - crs_epsg=32610, - nodata=nodata, - compression="none", - tiled=False, - ) - - -@pytest.mark.release_gate -def test_release_gate_attrs_canonical_keys_present(tmp_path) -> None: - """A georeferenced read carries every canonical attrs key.""" - path = str(tmp_path / "release_gate_attrs_canonical_2340.tif") - _write_known_good(path) - - da = open_geotiff(path) - missing = [k for k in CANONICAL_KEYS if k not in da.attrs] - assert not missing, ( - "release gate: canonical attrs keys missing from a georeferenced " - f"read: {missing}; release notes promise every key in " - f"{list(CANONICAL_KEYS)}" - ) - - -@pytest.mark.release_gate -def test_release_gate_attrs_georef_status_full(tmp_path) -> None: - """A fully-georeferenced read reports ``georef_status='full'``.""" - path = str(tmp_path / "release_gate_attrs_georef_status_2340.tif") - _write_known_good(path) - - da = open_geotiff(path) - status = da.attrs.get("georef_status") - assert status == "full", ( - f"release gate: a CRS+transform read should report " - f"``georef_status='full'``; got {status!r}. The five canonical " - "georef_status values are the contract downstream code branches on" - ) - - -@pytest.mark.release_gate -def test_release_gate_attrs_contract_version_is_int(tmp_path) -> None: - """``attrs['_xrspatial_geotiff_contract']`` is an int. - - The contract version is the downstream signal for which attrs - shape the array carries. A drift from int to string (or to a - Python object) would silently break callers that compare versions. 
- """ - path = str(tmp_path / "release_gate_attrs_contract_version_2340.tif") - _write_known_good(path) - - da = open_geotiff(path) - version = da.attrs.get("_xrspatial_geotiff_contract") - assert isinstance(version, int), ( - f"release gate: contract version stamp is not int: type=" - f"{type(version).__name__}, value={version!r}" - ) - assert version >= 1, ( - f"release gate: contract version stamp is non-positive: {version!r}" - ) - - -@pytest.mark.release_gate -def test_release_gate_attrs_round_trip_preserves_crs_transform_nodata( - tmp_path, -) -> None: - """Canonical attrs survive a full ``write -> read -> write -> read`` cycle.""" - src = str(tmp_path / "release_gate_attrs_rt_src_2340.tif") - _write_known_good(src, nodata=-9999.0) - - first = open_geotiff(src) - crs_first = int(first.attrs["crs"]) - transform_first = tuple(first.attrs["transform"]) - nodata_first = float(first.attrs["nodata"]) - - # Round-trip through the public writer. - rewrite = str(tmp_path / "release_gate_attrs_rt_rewrite_2340.tif") - to_geotiff(first, rewrite, compression="none", tiled=False) - - second = open_geotiff(rewrite) - assert int(second.attrs["crs"]) == crs_first, ( - f"release gate: CRS drifted across round-trip: {crs_first} -> " - f"{second.attrs['crs']!r}" - ) - transform_second = tuple(second.attrs["transform"]) - assert len(transform_second) == 6, ( - f"release gate: transform reshaped across round-trip: " - f"{transform_second!r}" - ) - for got, want in zip(transform_second, transform_first): - assert got == pytest.approx(want, abs=1e-12, rel=1e-12), ( - f"release gate: transform drifted across round-trip: " - f"{transform_first!r} -> {transform_second!r}" - ) - assert float(second.attrs["nodata"]) == pytest.approx( - nodata_first, abs=0.0 - ), ( - f"release gate: nodata drifted across round-trip: " - f"{nodata_first} -> {second.attrs['nodata']!r}" - ) diff --git a/xrspatial/geotiff/tests/test_release_gate_codec_round_trip_2341.py 
b/xrspatial/geotiff/tests/test_release_gate_codec_round_trip_2341.py deleted file mode 100644 index d0e08c79..00000000 --- a/xrspatial/geotiff/tests/test_release_gate_codec_round_trip_2341.py +++ /dev/null @@ -1,375 +0,0 @@ -"""Release gate: stable-codec read/write/read round-trip (epic #2341). - -PR 4 of 5 of epic #2341. The release contract names a specific set of -codecs as ``stable`` in :data:`xrspatial.geotiff.SUPPORTED_FEATURES`: -``none``, ``deflate``, ``lzw``, ``zstd``, ``packbits``. The release -notes promise that on any of these codecs, a round-trip preserves both -bit-exact pixels AND every canonical release attr key, on every dtype -the library promises to round-trip. - -Existing tests split the contract: - -* ``test_compression.py`` covers codec internals (LZW dictionary edge - cases, PackBits boundary cases, deflate stream framing). -* ``test_supported_features_tiers_2137.py`` pins the - ``SUPPORTED_FEATURES`` tier table. -* ``test_release_gate_codecs.py`` pins lossless pixel round-trip for - two dtypes (``uint16``, ``float32``). - -This file is the joint gate: the cartesian product of every stable -codec with every promised dtype, asserting both pixel equality AND -release-attr equality through a full read/write/read cycle. - -Out of scope: - -* Experimental codecs (``lerc``, ``jpeg2000``, ``j2k``, ``lz4``) -- - release tier is ``experimental``; covered by - ``test_supported_features_tiers_2137.py``. -* Internal-only ``jpeg`` -- not part of the public surface. -* COG layout (``test_release_gate_cog.py``). -* Backend parity (``test_backend_parity_matrix.py``). 
-""" -from __future__ import annotations - -import uuid - -import numpy as np -import pytest -import xarray as xr - -from xrspatial.geotiff import SUPPORTED_FEATURES, open_geotiff, to_geotiff -from xrspatial.geotiff._compression import (COMPRESSION_DEFLATE, COMPRESSION_LZW, - COMPRESSION_NONE, COMPRESSION_PACKBITS, - COMPRESSION_ZSTD) -from xrspatial.geotiff._header import parse_header, parse_ifd - -# The stable lossless codec set. Kept in lockstep with the ``codec.*`` -# entries tiered ``stable`` in -# :data:`xrspatial.geotiff.SUPPORTED_FEATURES`. The drift guard at the -# bottom of this file fails the build if the two sets disagree. -STABLE_CODECS = ("none", "deflate", "lzw", "zstd", "packbits") - -# The dtype set the release contract promises to round-trip through -# every stable codec. ``int16`` and ``int32`` exercise the signed -# integer path; ``float32`` and ``float64`` exercise the IEEE float -# path with NaN as the nodata sentinel. -DTYPES = ("int16", "int32", "float32", "float64") - -# TIFF tag value the on-disk file should carry for each stable codec -# name. The reader IFD parser exposes ``ifd.compression`` so we can -# assert the on-disk tag without depending on a high-level -# ``attrs['compression']`` key (none exists; see issue #2341). -_CODEC_TO_TIFF_TAG = { - "none": COMPRESSION_NONE, - "deflate": COMPRESSION_DEFLATE, - "lzw": COMPRESSION_LZW, - "zstd": COMPRESSION_ZSTD, - "packbits": COMPRESSION_PACKBITS, -} - -# Per-dtype integer nodata sentinel. Float dtypes use NaN. The -# integer sentinels are well outside the natural value range of the -# fixture below (small ascending integers) so the sentinel never -# collides with a real pixel. -_INT_NODATA = { - "int16": np.int16(-32768), - "int32": np.int32(-2147483648), -} - -# Release-attr keys the cartesian-product gate asserts on. These come -# from the issue body (#2341) and from the canonical attrs the reader -# emits (see ``test_release_gate_attrs_contract.py``). 
``raster_type`` -# is included even though it is only emitted when the source was -# ``RasterPixelIsPoint``; we use a small fixture that defaults to -# ``'area'`` so it is normalized below in ``_canonical_attrs``. -_RELEASE_ATTR_KEYS = ( - "transform", - "crs", - "crs_wkt", - "nodata", - "masked_nodata", - "georef_status", - "raster_type", -) - - -def _make_input(dtype_name: str) -> xr.DataArray: - """Build a 128x128 DataArray of the given dtype. - - Float arrays seed a NaN sentinel at (0, 0); integer arrays seed - the per-dtype sentinel at (0, 0). The remaining pixels are a - deterministic, non-trivial pattern so a per-axis flip or stride - bug surfaces as a pixel mismatch. - """ - dtype = np.dtype(dtype_name) - height, width = 128, 128 - n = height * width - if np.issubdtype(dtype, np.floating): - arr = np.linspace(-100.0, 100.0, n, dtype=dtype).reshape(height, width) - arr[0, 0] = np.nan - nodata: float | int = float("nan") - else: - # Small positive ramp so the dtype min sentinel never collides - # with a real pixel. The ramp climbs to ``n - 1 == 16383`` with - # the 128*128 fixture, which fits in ``int16`` (max 32767). If - # a future dtype with a smaller positive range is added (e.g. - # ``int8``) the ramp would wrap and collide with the sentinel; - # cap the ramp or shrink the fixture in that case. - arr = np.arange(n, dtype=dtype).reshape(height, width) - sentinel = _INT_NODATA[dtype_name] - arr[0, 0] = sentinel - nodata = sentinel - - # 30 m pixels with a descending y axis (top-left at the highest y - # coord). The writer turns these into a GeoTransform of - # ``(30, 0, origin_x, 0, -30, origin_y)``. 
- y = 4000000.0 - 30.0 * (np.arange(height) + 0.5) - x = 500000.0 + 30.0 * (np.arange(width) + 0.5) - attrs: dict = {"crs": 32610, "nodata": nodata} - return xr.DataArray( - arr, - dims=("y", "x"), - coords={"y": y, "x": x}, - attrs=attrs, - ) - - -def _canonical_attrs(da: xr.DataArray) -> dict: - """Project a DataArray's ``attrs`` onto the release-attr key set. - - ``raster_type`` is missing from ``attrs`` for the default ``area`` - raster (the writer only stamps ``'point'`` explicitly); normalize - here so the cross-read comparison can treat the missing key as - equivalent to ``'area'``. - """ - out = {} - for key in _RELEASE_ATTR_KEYS: - if key == "raster_type": - out[key] = da.attrs.get("raster_type", "area") - else: - out[key] = da.attrs.get(key) - return out - - -def _read_tiff_compression_tag(path: str) -> int: - """Read the on-disk TIFF Compression tag from the first IFD. - - The reader's high-level API does not surface ``attrs['compression']`` - (issue #2341 question). Inspect the IFD directly so the test pins - the actual on-disk codec choice rather than relying on the - DataArray attrs the reader emits. - """ - with open(path, "rb") as fh: - data = fh.read() - header = parse_header(data) - ifd = parse_ifd(data, header.first_ifd_offset, header) - return ifd.compression - - -def _assert_pixels_equal(actual: np.ndarray, expected: np.ndarray, - *, codec: str, dtype_name: str) -> None: - """NaN-aware byte-exact pixel comparison. - - The float path uses ``equal_nan=True`` so the NaN sentinel - matches NaN-to-NaN. The integer path uses strict - ``array_equal`` -- the sentinel is just another integer value - and must round-trip bit-exact. 
- """ - assert actual.shape == expected.shape, ( - f"release gate (#2341): codec {codec!r} dtype {dtype_name!r} " - f"reshaped the array across the round-trip: " - f"{expected.shape} -> {actual.shape}" - ) - assert actual.dtype == expected.dtype, ( - f"release gate (#2341): codec {codec!r} promoted dtype " - f"{dtype_name!r} to {actual.dtype!r} across the round-trip" - ) - if np.issubdtype(expected.dtype, np.floating): - equal = np.array_equal(actual, expected, equal_nan=True) - else: - equal = np.array_equal(actual, expected) - if not equal: - # Surface the first divergent pixel so a debug session can - # jump straight to the offending tile / row. - if np.issubdtype(expected.dtype, np.floating): - mismatch_mask = ~( - (actual == expected) | (np.isnan(actual) & np.isnan(expected)) - ) - else: - mismatch_mask = actual != expected - first = np.argwhere(mismatch_mask) - first_idx = tuple(int(v) for v in first[0]) if first.size else None - first_actual = ( - actual[first_idx] if first_idx is not None else None - ) - first_expected = ( - expected[first_idx] if first_idx is not None else None - ) - raise AssertionError( - f"release gate (#2341): codec {codec!r} did not preserve " - f"{dtype_name!r} pixels byte-for-byte; the release contract " - f"names this codec as lossless for this dtype. First " - f"divergence at index {first_idx!r}: actual=" - f"{first_actual!r}, expected={first_expected!r}" - ) - - -@pytest.mark.release_gate -@pytest.mark.parametrize("dtype_name", DTYPES) -@pytest.mark.parametrize("codec", STABLE_CODECS) -def test_release_gate_codec_round_trip(tmp_path, codec, dtype_name) -> None: - """Stable codec * dtype: pixels and release attrs survive a full - read/write/read cycle. - - Steps: - - 1. Build an in-memory DataArray with a known transform, CRS, and - nodata sentinel (NaN for float; per-dtype int min for int). - 2. Write via ``to_geotiff(path, compression=codec)``. - 3. Read back via ``open_geotiff(path)`` -- this is the canonical - baseline. 
The reader fills in ``crs_wkt``, - ``georef_status``, ``masked_nodata``, etc. - 4. Write the baseline DataArray to a second path under the same - codec. - 5. Read the second path back; assert byte-exact pixels and every - release-attr key matches the baseline. - - The two-pass shape is what makes this a *round-trip* gate - rather than a single-pass write-and-read gate: the canonical - attrs themselves have to survive the second cycle, not just the - first. - """ - # Unique tag per parametrized case so parallel pytest workers and - # parallel rockout worktrees never collide on the same tmp file. - nonce = uuid.uuid4().hex[:8] - write_first = str( - tmp_path - / f"release_gate_2341_{codec}_{dtype_name}_first_{nonce}.tif" - ) - write_second = str( - tmp_path - / f"release_gate_2341_{codec}_{dtype_name}_second_{nonce}.tif" - ) - - source = _make_input(dtype_name) - is_float = np.issubdtype(np.dtype(dtype_name), np.floating) - - # The masking behaviour differs by dtype: integer reads default to - # masking the sentinel into NaN (which would change dtype and break - # the byte-exact comparison), so we read integers with - # ``mask_nodata=False`` to keep the sentinel as a real pixel. - # Float reads round-trip NaN as NaN regardless of mask_nodata. - mask_kwargs: dict = {} if is_float else {"mask_nodata": False} - - # Pass 1: write the in-memory source. The writer infers NaN as the - # implicit float sentinel without a ``nodata=`` kwarg, so only the - # integer branch passes one explicitly. This keeps the test from - # locking the writer into accepting ``nodata=NaN`` if that ever - # becomes a no-op or a rejected redundancy. 
- pass_one_kwargs: dict = ( - {} if is_float else {"nodata": source.attrs["nodata"]} - ) - to_geotiff( - source, - write_first, - compression=codec, - tiled=False, - **pass_one_kwargs, - ) - - baseline = open_geotiff(write_first, **mask_kwargs) - baseline_pixels = np.asarray(baseline.values) - baseline_attrs = _canonical_attrs(baseline) - - # The on-disk TIFF Compression tag must reflect the requested codec. - tag_first = _read_tiff_compression_tag(write_first) - assert tag_first == _CODEC_TO_TIFF_TAG[codec], ( - f"release gate (#2341): codec {codec!r} encoded as TIFF tag " - f"{tag_first} on first write; expected " - f"{_CODEC_TO_TIFF_TAG[codec]} per the codec -> tag map" - ) - - # Pass 2: rewrite the baseline DataArray under the same codec. - # The baseline DataArray already carries ``attrs['nodata']`` from - # the first read; the writer picks the sentinel up from the attrs - # on the float path. For the integer branch we pass the sentinel - # explicitly so the writer does not need to fall back to a default. - pass_two_kwargs: dict = ( - {} if is_float else {"nodata": baseline.attrs.get("nodata")} - ) - to_geotiff( - baseline, - write_second, - compression=codec, - tiled=False, - **pass_two_kwargs, - ) - - second = open_geotiff(write_second, **mask_kwargs) - second_pixels = np.asarray(second.values) - second_attrs = _canonical_attrs(second) - - tag_second = _read_tiff_compression_tag(write_second) - assert tag_second == _CODEC_TO_TIFF_TAG[codec], ( - f"release gate (#2341): codec {codec!r} encoded as TIFF tag " - f"{tag_second} on the second write; expected " - f"{_CODEC_TO_TIFF_TAG[codec]} per the codec -> tag map" - ) - - _assert_pixels_equal( - second_pixels, baseline_pixels, codec=codec, dtype_name=dtype_name, - ) - - # Per-attribute comparison so a single failing key reports which - # attr drifted instead of a wholesale dict-equality failure. 
- for key in _RELEASE_ATTR_KEYS: - want = baseline_attrs[key] - got = second_attrs[key] - if key == "nodata" and isinstance(want, float) and np.isnan(want): - assert isinstance(got, float) and np.isnan(got), ( - f"release gate (#2341): codec {codec!r} dtype " - f"{dtype_name!r} dropped NaN nodata across the " - f"round-trip: got {got!r}" - ) - continue - if key == "transform": - assert want is not None and got is not None, ( - f"release gate (#2341): codec {codec!r} dtype " - f"{dtype_name!r} dropped ``attrs['transform']``: " - f"{want!r} -> {got!r}" - ) - assert tuple(got) == tuple(want), ( - f"release gate (#2341): codec {codec!r} dtype " - f"{dtype_name!r} drifted ``attrs['transform']``: " - f"{want!r} -> {got!r}" - ) - continue - assert got == want, ( - f"release gate (#2341): codec {codec!r} dtype {dtype_name!r} " - f"drifted ``attrs[{key!r}]`` across the round-trip: " - f"{want!r} -> {got!r}" - ) - - -@pytest.mark.release_gate -def test_release_gate_codec_round_trip_stable_set_matches_supported_features() -> None: - """The codec list in this file matches ``SUPPORTED_FEATURES``. - - If a codec is promoted into ``stable`` (or demoted out) in - :data:`xrspatial.geotiff.SUPPORTED_FEATURES` without updating - this file, the cartesian-product gate is silently out of sync - with the runtime tier table. Fail loudly here so the PR that - changes the tier also updates the gate. - """ - stable_from_constant = { - key.split(".", 1)[1] - for key, tier in SUPPORTED_FEATURES.items() - if key.startswith("codec.") and tier == "stable" - } - assert stable_from_constant == set(STABLE_CODECS), ( - "release gate (#2341): STABLE_CODECS drifted from " - "SUPPORTED_FEATURES; the gate and the runtime tier table " - "must agree on which codecs are stable. 
" - f"constant: {set(STABLE_CODECS)!r}; " - f"SUPPORTED_FEATURES: {stable_from_constant!r}" - ) diff --git a/xrspatial/geotiff/tests/test_release_gate_codecs.py b/xrspatial/geotiff/tests/test_release_gate_codecs.py deleted file mode 100644 index b7e96ee0..00000000 --- a/xrspatial/geotiff/tests/test_release_gate_codecs.py +++ /dev/null @@ -1,132 +0,0 @@ -"""Release gate: stable lossless codec round-trip (epic #2340). - -The release contract for the GeoTIFF module names a specific set of -lossless codecs as ``stable``: ``none``, ``deflate``, ``lzw``, -``packbits``, ``zstd``. Every one of them must round-trip pixels -byte-for-byte through ``to_geotiff`` -> ``open_geotiff`` on both -integer and float dtypes. - -This file is the per-codec gate: one parametrized test per dtype that -walks every stable codec. The fine-grained codec internals (LZW -dictionary edge cases, PackBits boundary cases, deflate stream framing, -etc.) live in their dedicated test files; here we only assert the -end-to-end public-API promise. - -Out of scope: experimental codecs (``lerc``, ``jpeg2000``, ``j2k``, -``lz4``), the internal-only ``jpeg`` codec, and the COG layout gate -(see ``test_release_gate_cog.py``). -""" -from __future__ import annotations - -import numpy as np -import pytest - -from xrspatial.geotiff import SUPPORTED_FEATURES, open_geotiff -from xrspatial.geotiff._geotags import GeoTransform -from xrspatial.geotiff._writer import write - - -# The stable lossless codec set. Keep this list in lockstep with the -# ``codec.*`` entries tiered ``stable`` in -# :data:`xrspatial.geotiff.SUPPORTED_FEATURES`. If a codec is promoted -# into or out of stable, add or remove it here -- the gate is meant -# to lock the public-facing list. 
-STABLE_LOSSLESS_CODECS = ("none", "deflate", "lzw", "packbits", "zstd") - - -def _gt() -> GeoTransform: - return GeoTransform( - origin_x=500000.0, - origin_y=4000000.0, - pixel_width=30.0, - pixel_height=-30.0, - ) - - -@pytest.mark.release_gate -@pytest.mark.parametrize("codec", STABLE_LOSSLESS_CODECS) -def test_release_gate_codec_round_trip_uint16(tmp_path, codec) -> None: - """Integer pixel bytes survive every stable lossless codec.""" - arr = np.arange(64, dtype=np.uint16).reshape(8, 8) - path = str(tmp_path / f"release_gate_codec_{codec}_uint16_2340.tif") - write( - arr, - path, - geo_transform=_gt(), - crs_epsg=32610, - compression=codec, - tiled=False, - ) - - out = open_geotiff(path) - assert out.dtype == np.uint16, ( - f"release gate: codec {codec!r} promoted uint16 to {out.dtype!r}; " - "the lossless contract is that integer dtypes survive every " - "stable codec" - ) - np.testing.assert_array_equal( - np.asarray(out.values), - arr, - err_msg=( - f"release gate: codec {codec!r} did not round-trip uint16 " - "pixels byte-for-byte; the release contract names this codec " - "as lossless" - ), - ) - - -@pytest.mark.release_gate -@pytest.mark.parametrize("codec", STABLE_LOSSLESS_CODECS) -def test_release_gate_codec_round_trip_float32(tmp_path, codec) -> None: - """Float pixel bytes survive every stable lossless codec.""" - # Use a deterministic but non-trivial pattern so a per-axis flip - # or per-row stride bug still fails. 
- arr = np.linspace(-100.0, 100.0, 64, dtype=np.float32).reshape(8, 8) - path = str(tmp_path / f"release_gate_codec_{codec}_float32_2340.tif") - write( - arr, - path, - geo_transform=_gt(), - crs_epsg=32610, - compression=codec, - tiled=False, - ) - - out = open_geotiff(path) - assert out.dtype == np.float32, ( - f"release gate: codec {codec!r} promoted float32 to " - f"{out.dtype!r}" - ) - np.testing.assert_array_equal( - np.asarray(out.values), - arr, - err_msg=( - f"release gate: codec {codec!r} did not round-trip float32 " - "pixels byte-for-byte; the release contract names this codec " - "as lossless" - ), - ) - - -@pytest.mark.release_gate -def test_release_gate_codec_stable_set_matches_supported_features() -> None: - """The stable codec list in this file matches ``SUPPORTED_FEATURES``. - - If a codec is promoted into ``stable`` (or demoted out) in - :data:`xrspatial.geotiff.SUPPORTED_FEATURES` without updating this - file, the release gate is out of sync with the runtime contract. - Fail loudly here so the PR that changes the tier also updates the - gate. - """ - stable_from_constant = { - key.split(".", 1)[1] - for key, tier in SUPPORTED_FEATURES.items() - if key.startswith("codec.") and tier == "stable" - } - assert stable_from_constant == set(STABLE_LOSSLESS_CODECS), ( - "release gate: STABLE_LOSSLESS_CODECS drifted from " - "SUPPORTED_FEATURES; the gate and the runtime tier table must " - "agree on which codecs are stable. " - f"constant: {set(STABLE_LOSSLESS_CODECS)!r}; " - f"SUPPORTED_FEATURES: {stable_from_constant!r}" - ) diff --git a/xrspatial/geotiff/tests/test_release_gate_cog.py b/xrspatial/geotiff/tests/test_release_gate_cog.py deleted file mode 100644 index e58c878e..00000000 --- a/xrspatial/geotiff/tests/test_release_gate_cog.py +++ /dev/null @@ -1,160 +0,0 @@ -"""Release gate: COG write and read for stable lossless codecs (epic #2340). 
- -The release contract tags ``writer.cog`` and ``reader.local_cog`` as -``stable`` in :data:`xrspatial.geotiff.SUPPORTED_FEATURES`. The promise -is: ``to_geotiff(cog=True, compression=)`` writes a -file that ``open_geotiff`` reads back bit-exact, with CRS, transform, -and (when declared) nodata preserved across every stable codec. - -This gate parametrizes the codec axis so a single regression in any -stable codec on the COG path fails noisily. The COG layout itself -(IFD-first, tiled, internal overviews) is exhaustively pinned by -``test_cog_writer_compliance.py`` and ``test_cog_parity_2286.py``; the -release-gate gate is the small end-to-end shape every release needs. - -Out of scope here: -* COG spec compliance details -- see ``test_cog_writer_compliance.py``. -* HTTP COG range reads -- ``reader.http_cog`` is ``advanced`` (not - stable), so it is not part of this gate. -* BigTIFF COG -- ``writer.bigtiff_cog`` is ``advanced``. -""" -from __future__ import annotations - -import numpy as np -import pytest -import xarray as xr - -from xrspatial.geotiff import open_geotiff, to_geotiff - -# Import the stable lossless set from the sibling release-gate file -# rather than redefining it. The cross-check against -# ``SUPPORTED_FEATURES`` lives in that file; reusing the same tuple -# here means a tier change in ``_attrs.py`` cannot leave the COG gate -# parametrized on a stale list. -from xrspatial.geotiff.tests.test_release_gate_codecs import ( # noqa: E402 - STABLE_LOSSLESS_CODECS, -) - -# COG requires a tiled internal layout and benefits from a slightly -# larger raster than the plain-file gate so the writer can emit a real -# tile grid rather than a single 1-tile file. Sticking to 32x32 keeps -# the test fast (well under 1 ms for the codec loop) while still -# exercising multiple tiles. 
-_W = 32 -_H = 32 - - -def _make_data_array(*, nodata: float | None = None) -> xr.DataArray: - pixels = np.arange(_H * _W, dtype=np.float32).reshape(_H, _W) - # Pixel-center coords, 30 m pixels, top-left at (500000, 4000000). - y = np.array( - [4000000.0 - 15.0 - 30.0 * i for i in range(_H)], - dtype=np.float64, - ) - x = np.array( - [500000.0 + 15.0 + 30.0 * i for i in range(_W)], - dtype=np.float64, - ) - attrs: dict = {"crs": 32610} - if nodata is not None: - attrs["nodata"] = nodata - return xr.DataArray( - pixels, - dims=("y", "x"), - coords={"y": y, "x": x}, - attrs=attrs, - ) - - -@pytest.mark.release_gate -@pytest.mark.parametrize("codec", STABLE_LOSSLESS_CODECS) -def test_release_gate_cog_round_trips_pixels(tmp_path, codec) -> None: - """COG write -> read returns the same pixels under every stable codec.""" - da = _make_data_array() - path = str(tmp_path / f"release_gate_cog_{codec}_pixels_2340.tif") - to_geotiff( - da, - path, - compression=codec, - cog=True, - tiled=True, - tile_size=16, - ) - - out = open_geotiff(path) - assert out.dtype == np.float32, ( - f"release gate: COG with codec {codec!r} promoted dtype to " - f"{out.dtype!r}" - ) - np.testing.assert_array_equal( - np.asarray(out.values), - np.asarray(da.values), - err_msg=( - f"release gate: COG with codec {codec!r} did not round-trip " - "pixels byte-for-byte" - ), - ) - - -@pytest.mark.release_gate -@pytest.mark.parametrize("codec", STABLE_LOSSLESS_CODECS) -def test_release_gate_cog_preserves_crs_transform(tmp_path, codec) -> None: - """CRS and transform survive the COG write -> read for every stable codec.""" - da = _make_data_array() - path = str(tmp_path / f"release_gate_cog_{codec}_attrs_2340.tif") - to_geotiff( - da, - path, - compression=codec, - cog=True, - tiled=True, - tile_size=16, - ) - - out = open_geotiff(path) - crs = out.attrs.get("crs") - assert crs is not None and int(crs) == 32610, ( - f"release gate: COG with codec {codec!r} dropped or drifted " - f"``attrs['crs']``: got 
{crs!r}" - ) - transform = out.attrs.get("transform") - assert transform is not None and len(transform) == 6, ( - f"release gate: COG with codec {codec!r} dropped or reshaped " - f"``attrs['transform']``: got {transform!r}" - ) - assert transform[0] == pytest.approx(30.0, abs=1e-9), ( - f"release gate: COG pixel_width drifted under {codec!r}: " - f"{transform!r}" - ) - assert transform[4] == pytest.approx(-30.0, abs=1e-9), ( - f"release gate: COG pixel_height drifted under {codec!r}: " - f"{transform!r}" - ) - - -@pytest.mark.release_gate -@pytest.mark.parametrize("codec", STABLE_LOSSLESS_CODECS) -def test_release_gate_cog_preserves_nodata(tmp_path, codec) -> None: - """A declared nodata sentinel survives COG write -> read under every codec.""" - sentinel = -9999.0 - da = _make_data_array(nodata=sentinel) - path = str(tmp_path / f"release_gate_cog_{codec}_nodata_2340.tif") - to_geotiff( - da, - path, - compression=codec, - nodata=sentinel, - cog=True, - tiled=True, - tile_size=16, - ) - - out = open_geotiff(path) - nodata = out.attrs.get("nodata") - assert nodata is not None, ( - f"release gate: COG with codec {codec!r} dropped declared nodata" - ) - assert float(nodata) == pytest.approx(sentinel, abs=0.0), ( - f"release gate: COG with codec {codec!r} drifted nodata from " - f"{sentinel} to {nodata!r}" - ) diff --git a/xrspatial/geotiff/tests/test_release_gate_dask_parity.py b/xrspatial/geotiff/tests/test_release_gate_dask_parity.py deleted file mode 100644 index 6b557206..00000000 --- a/xrspatial/geotiff/tests/test_release_gate_dask_parity.py +++ /dev/null @@ -1,150 +0,0 @@ -"""Release gate: dask read parity vs eager (epic #2340). - -Dask reads of a local GeoTIFF must return the same pixels and the same -canonical attrs as the eager (numpy) read. This is the -``reader.local_file`` stable promise extended to the dask backend. 
- -The release gate locks the small, deterministic case a release engineer -can run before tagging: write a known-good file, read it both eagerly -and through the dask backend, and assert the pixel-level and attrs -parity. The wide backend matrix -(``test_backend_pixel_parity_matrix_1813.py``, -``test_backend_parity_matrix.py``) exercises every codec / chunk-size / -dtype combination -- those stay the canonical parity suite. The -release-gate test is the one-shot the release notes can quote without -caveats. - -Out of scope: -* GPU / cupy parity (``reader.gpu`` is ``experimental``, not stable). -* VRT lazy reads (``reader.vrt`` is ``advanced``). -* COG dask reads (covered by ``test_release_gate_cog.py`` via the - eager reader; the dask parity for COG is part of the canonical - parity matrix). -""" -from __future__ import annotations - -import numpy as np -import pytest - -# Every test in this file exercises the ``chunks=`` dask backend. Skip -# the whole file if dask is not installed -- the parity claim is -# vacuous without the backend it compares against. 
-pytest.importorskip("dask") - -from xrspatial.geotiff import open_geotiff # noqa: E402 -from xrspatial.geotiff._geotags import GeoTransform # noqa: E402 -from xrspatial.geotiff._writer import write # noqa: E402 - - -def _write_known_good(path: str) -> np.ndarray: - """Write a small tiled GeoTIFF and return the source array.""" - arr = np.arange(256, dtype=np.float32).reshape(16, 16) - gt = GeoTransform( - origin_x=500000.0, - origin_y=4000000.0, - pixel_width=30.0, - pixel_height=-30.0, - ) - write( - arr, - path, - geo_transform=gt, - crs_epsg=32610, - compression="deflate", - tiled=True, - tile_size=16, - ) - return arr - - -@pytest.mark.release_gate -def test_release_gate_dask_read_matches_eager_pixels(tmp_path) -> None: - """The dask backend returns the same pixels as the eager backend.""" - path = str(tmp_path / "release_gate_dask_parity_pixels_2340.tif") - _write_known_good(path) - - eager = open_geotiff(path) - lazy = open_geotiff(path, chunks=8) - - # The dask backend returns a lazy DataArray; materialise it once - # so the equality check is comparing concrete numpy arrays. 
- lazy_values = np.asarray(lazy.values) - eager_values = np.asarray(eager.values) - np.testing.assert_array_equal( - lazy_values, - eager_values, - err_msg=( - "release gate: dask backend returned different pixels than " - "the eager backend; the release contract promises dask read " - "parity for the local-file stable path" - ), - ) - assert lazy.dtype == eager.dtype, ( - f"release gate: dask backend changed dtype from {eager.dtype!r} " - f"to {lazy.dtype!r}" - ) - assert lazy.shape == eager.shape, ( - f"release gate: dask backend changed shape from {eager.shape!r} " - f"to {lazy.shape!r}" - ) - - -@pytest.mark.release_gate -def test_release_gate_dask_read_matches_eager_attrs(tmp_path) -> None: - """The dask backend produces the same canonical attrs as eager.""" - path = str(tmp_path / "release_gate_dask_parity_attrs_2340.tif") - _write_known_good(path) - - eager = open_geotiff(path) - lazy = open_geotiff(path, chunks=8) - - # The canonical attrs the release contract pins; backend-specific - # additive attrs (chunk shape, source URI, etc.) are allowed to - # differ between backends and are not part of this gate. 
- canonical = ("crs", "transform", "georef_status") - for key in canonical: - assert key in eager.attrs, ( - f"release gate: eager read is missing canonical attr " - f"{key!r}; cannot compare backends" - ) - assert key in lazy.attrs, ( - f"release gate: dask read is missing canonical attr " - f"{key!r}; the release contract requires backend parity on " - "canonical attrs" - ) - eager_v = eager.attrs[key] - lazy_v = lazy.attrs[key] - if key == "transform": - assert len(eager_v) == len(lazy_v) == 6 - for a, b in zip(eager_v, lazy_v): - assert a == pytest.approx(b, abs=1e-12, rel=1e-12), ( - f"release gate: transform drifted across backends: " - f"eager={eager_v!r} lazy={lazy_v!r}" - ) - else: - assert eager_v == lazy_v, ( - f"release gate: ``attrs[{key!r}]`` drifted across " - f"backends: eager={eager_v!r} lazy={lazy_v!r}" - ) - - -@pytest.mark.release_gate -def test_release_gate_dask_read_is_lazy(tmp_path) -> None: - """A ``chunks=`` read produces a dask-backed DataArray. - - Without this assertion, a regression that silently materialised - the dask path into numpy could pass the pixel-parity test above - without anyone noticing. The dask backend's defining property is - laziness; pin it. - """ - import dask.array as da_mod - - path = str(tmp_path / "release_gate_dask_parity_lazy_2340.tif") - _write_known_good(path) - - lazy = open_geotiff(path, chunks=8) - assert isinstance(lazy.data, da_mod.Array), ( - f"release gate: chunks= read returned a non-dask array of type " - f"{type(lazy.data).__name__}; the release contract promises a " - "dask-backed DataArray when chunks= is set" - ) diff --git a/xrspatial/geotiff/tests/test_release_gate_eager_dask_parity_2341.py b/xrspatial/geotiff/tests/test_release_gate_eager_dask_parity_2341.py deleted file mode 100644 index 326c0651..00000000 --- a/xrspatial/geotiff/tests/test_release_gate_eager_dask_parity_2341.py +++ /dev/null @@ -1,317 +0,0 @@ -"""Release gate: eager-vs-dask raster equivalence (PR 1 of 5 of epic #2341). 
- -Epic #2341 calls out the highest release risk for the GeoTIFF surface: -pixels matching while ``attrs``, ``coords``, or ``dims`` silently disagree -between the eager (``open_geotiff``) and lazy (``read_geotiff_dask``) -entry points. Today both paths are documented as ``stable``, but no -single regression test asserts full raster equivalence -- pixels + dims + -coords + the seven release-attr keys -- across the two paths on the same -files. - -This module reads each fixture in a representative corpus list once -through ``open_geotiff`` (eager) and once through ``read_geotiff_dask`` -(materialised via ``.compute()``), then asserts: - -* ``.values`` bit-exact (NaN-aware via ``np.array_equal(..., equal_nan=True)``) -* ``.dims`` equal -* ``.coords`` element-wise equal (dtype + bytes match per axis) -* seven release-attr keys equal: - ``transform``, ``crs``, ``crs_wkt``, ``nodata``, ``masked_nodata``, - ``georef_status``, ``raster_type`` - -The assertions are inlined as small helpers in this module. The four -sibling PRs of epic #2341 (windowed-shifted-transform, overview / sidecar -metadata, stable-codec round-trip, ambiguous-metadata negatives) ship -independently; consolidating the helpers into a shared module is a -follow-up once all five have landed and the common shape has settled. - -The corpus covers the four scenarios called out in the issue: - -* integer dtype with explicit integer nodata sentinel -* float dtype with NaN nodata -* MinIsWhite photometric (no explicit nodata tag) -* masked-nodata lifecycle: the same integer-sentinel fixture read with - ``mask_nodata=False`` so the raw uint sentinel branch is pinned in - parity against the default ``mask_nodata=True`` branch (which the - integer-nodata row above already covers) - -Out of scope (sibling PRs of epic #2341): - -* Windowed-read shifted-transform parity (PR 2 of 5). -* Overview / sidecar metadata survival (PR 3 of 5). -* Stable-codec round-trip (PR 4 of 5). 
-* Negative tests for ambiguous metadata (PR 5 of 5). -""" -from __future__ import annotations - -import pathlib -from typing import Any - -import numpy as np -import pytest -import xarray as xr - -pytest.importorskip("dask") - -from xrspatial.geotiff import open_geotiff, read_geotiff_dask # noqa: E402 - -# Corpus fixtures live under ``golden_corpus/fixtures``; the same -# directory the wider parity matrix and the per-backend golden tests -# already use. -_FIXTURES_DIR = ( - pathlib.Path(__file__).resolve().parent / "golden_corpus" / "fixtures" -) - -# Chunk size for the dask reads. The corpus fixtures used here are -# 64x64 or smaller, so a chunk of 32 produces either a 2x2 chunk grid -# or a single chunk depending on the fixture. Either way the dask -# plumbing fires. -_CHUNK_SIZE = 32 - -# The seven release-attr keys the parity contract pins. Drift on any -# of these between the eager and dask paths is a release blocker; see -# the module docstring for the rationale. -_RELEASE_ATTR_KEYS: tuple[str, ...] = ( - "transform", - "crs", - "crs_wkt", - "nodata", - "masked_nodata", - "georef_status", - "raster_type", -) - - -# --------------------------------------------------------------------------- -# Corpus selection -# --------------------------------------------------------------------------- - -# One ``pytest.param`` per fixture scenario. ``open_kwargs`` carries -# any extra kwargs (e.g. ``mask_nodata=False``) applied to both the -# eager and dask reads so the masked-nodata-lifecycle row exercises -# the same masking semantics on both paths. 
-_CORPUS = [ - pytest.param( - "nodata_int_sentinel_uint16", - {}, - id="int-dtype-nodata", - ), - pytest.param( - "nodata_nan_float32", - {}, - id="float-dtype-nan-nodata", - ), - pytest.param( - "nodata_miniswhite_uint8", - {}, - id="miniswhite", - ), - # ``mask_nodata=False`` is the contrast cell to the first row's - # default ``mask_nodata=True``: the raw uint16 sentinel is preserved - # and ``masked_nodata`` flips to ``False``. Together the two cells - # pin both sides of the nodata lifecycle on the same fixture, which - # is the silent-disagreement case the issue calls out. - pytest.param( - "nodata_int_sentinel_uint16", - {"mask_nodata": False}, - id="masked-nodata-lifecycle", - ), -] - - -# --------------------------------------------------------------------------- -# Inlined helpers (per issue: no new shared helper module in this PR) -# --------------------------------------------------------------------------- - -def _materialise(da: xr.DataArray) -> np.ndarray: - """Return a host-side numpy view of ``da.values``. - - For an eager numpy-backed DataArray this is a straight ``np.asarray``; - for a dask-backed DataArray ``.values`` triggers ``.compute()`` so - the result is the materialised numpy array. The eager / lazy split - is hidden here so the assertion call sites stay symmetric. Kept as - a named helper (rather than inlined) so the sibling PRs of epic - #2341 can copy the same shape when they land their own gates. - """ - return np.asarray(da.values) - - -def _assert_values_equal(eager: xr.DataArray, lazy: xr.DataArray) -> None: - """Bit-exact NaN-aware comparison of pixel values. - - Integer dtypes go through ``np.array_equal`` directly; float dtypes - use ``equal_nan=True`` so a NaN-marked nodata cell compares equal to - itself across paths. A dtype mismatch fails first with an explicit - message because the float / int divergence is the single most - informative diff when ``mask_nodata=True`` flips a row. 
- """ - assert eager.dtype == lazy.dtype, ( - f"pixel dtype differs: eager={eager.dtype} lazy={lazy.dtype}" - ) - eager_px = _materialise(eager) - lazy_px = _materialise(lazy) - assert eager_px.shape == lazy_px.shape, ( - f"pixel shape differs: eager={eager_px.shape} lazy={lazy_px.shape}" - ) - equal_nan = eager_px.dtype.kind == "f" - if not np.array_equal(eager_px, lazy_px, equal_nan=equal_nan): - raise AssertionError( - "pixel values differ between eager and dask reads " - f"(dtype={eager_px.dtype}, equal_nan={equal_nan})" - ) - - -def _assert_dims_equal(eager: xr.DataArray, lazy: xr.DataArray) -> None: - """Dims tuple matches exactly between the two paths.""" - assert eager.dims == lazy.dims, ( - f"dims differ: eager={eager.dims!r} lazy={lazy.dims!r}" - ) - - -def _assert_coords_equal(eager: xr.DataArray, lazy: xr.DataArray) -> None: - """Per-axis coord dtype + byte-level equality. - - Coords drive transform reconstruction downstream, so a sub-ULP - divergence still means a different transform. The bytewise compare - catches a dtype-preserving rounding regression that ``allclose`` - would let through. 
- """ - eager_coord_names = set(eager.coords) - lazy_coord_names = set(lazy.coords) - assert eager_coord_names == lazy_coord_names, ( - f"coord name set differs: " - f"only-in-eager={sorted(eager_coord_names - lazy_coord_names)} " - f"only-in-lazy={sorted(lazy_coord_names - eager_coord_names)}" - ) - for axis in eager_coord_names: - eager_c = np.asarray(eager.coords[axis].values) - lazy_c = np.asarray(lazy.coords[axis].values) - assert eager_c.dtype == lazy_c.dtype, ( - f"coord {axis!r} dtype differs: " - f"eager={eager_c.dtype} lazy={lazy_c.dtype}" - ) - assert eager_c.shape == lazy_c.shape, ( - f"coord {axis!r} shape differs: " - f"eager={eager_c.shape} lazy={lazy_c.shape}" - ) - assert eager_c.tobytes() == lazy_c.tobytes(), ( - f"coord {axis!r} bytes differ between eager and dask reads" - ) - - -def _is_nan_sentinel(value: Any) -> bool: - """True when ``value`` is a NaN, regardless of scalar type. - - ``float('nan') != float('nan')`` by IEEE-754, so the nodata - comparison needs an explicit NaN-aware branch. Accepts python - floats, numpy scalars, and anything castable to ``float``; returns - ``False`` for non-numeric values (including ``None``) so the - caller falls through to the strict ``==`` branch. - """ - if value is None: - return False - try: - return bool(np.isnan(float(value))) - except (TypeError, ValueError): - return False - - -def _attr_equal(a: Any, b: Any) -> bool: - """Compare two attr values, treating NaN as equal to NaN. - - Notable divergence from ``test_backend_full_parity_2211.py``: the - transform 6-tuple of floats is compared bit-exact here (via the - tuple-recursion branch below), where the sibling gate allows a - 1e-9 ULP tolerance. Bit-exact is the contract the issue calls for - on the same-file eager-vs-dask axis; the wider gate has to absorb - a hypothetical future cross-backend float-rounding op (e.g. a GPU - decode path) that does not exist on either of the two paths here. 
- """ - if _is_nan_sentinel(a) and _is_nan_sentinel(b): - return True - if isinstance(a, np.ndarray) or isinstance(b, np.ndarray): - return ( - isinstance(a, np.ndarray) - and isinstance(b, np.ndarray) - and np.array_equal(a, b) - ) - if isinstance(a, (tuple, list)) and isinstance(b, (tuple, list)): - if len(a) != len(b): - return False - return all(_attr_equal(x, y) for x, y in zip(a, b)) - return a == b - - -def _assert_release_attrs_equal( - eager: xr.DataArray, lazy: xr.DataArray, -) -> None: - """Each of the seven release-attr keys agrees on presence + value. - - An attr absent on the eager read must also be absent on the dask - read, and vice versa. This catches the silent-disagreement case the - issue calls out: pixels and dims line up while one path stamps an - attr the other omits. - """ - for key in _RELEASE_ATTR_KEYS: - in_eager = key in eager.attrs - in_lazy = key in lazy.attrs - assert in_eager == in_lazy, ( - f"release attr {key!r} presence differs: " - f"eager={in_eager} lazy={in_lazy}" - ) - if not in_eager: - continue - eager_v = eager.attrs[key] - lazy_v = lazy.attrs[key] - assert _attr_equal(eager_v, lazy_v), ( - f"release attr {key!r} value differs: " - f"eager={eager_v!r} lazy={lazy_v!r}" - ) - - -# --------------------------------------------------------------------------- -# The parity gate -# --------------------------------------------------------------------------- - -@pytest.mark.release_gate -@pytest.mark.parametrize("fixture_id, open_kwargs", _CORPUS) -def test_release_gate_eager_dask_full_parity( - fixture_id: str, open_kwargs: dict, -) -> None: - """Eager and dask reads of the same file agree on the full contract. - - Reads ``fixture_id`` once via ``open_geotiff`` and once via - ``read_geotiff_dask``, then asserts pixel values, dims, coords, and - the seven release-attr keys all match. The dask result is - materialised via ``.values`` so the comparison is between concrete - arrays, not between graph-vs-array. 
- """ - path = _FIXTURES_DIR / f"{fixture_id}.tif" - if not path.exists(): - pytest.skip( - f"fixture {fixture_id!r} has no .tif on disk; run " - f"`python -m xrspatial.geotiff.tests.golden_corpus.generate`" - ) - - eager = open_geotiff(str(path), **open_kwargs) - lazy = read_geotiff_dask(str(path), chunks=_CHUNK_SIZE, **open_kwargs) - - _assert_values_equal(eager, lazy) - _assert_dims_equal(eager, lazy) - _assert_coords_equal(eager, lazy) - _assert_release_attrs_equal(eager, lazy) - - -def test_release_gate_corpus_is_non_empty() -> None: - """The corpus list must not silently shrink to zero rows. - - A parametrize argument list that empties out (e.g. a bad refactor - that filters every entry) would cause pytest to collect zero cells - and the matrix would pass vacuously. Pin the row count so a stale - refactor surfaces here instead. - """ - assert len(_CORPUS) == 4, ( - f"corpus row count drifted: expected 4 scenarios " - f"(int-nodata, float-nan-nodata, miniswhite, masked-nodata-lifecycle), " - f"got {len(_CORPUS)}" - ) diff --git a/xrspatial/geotiff/tests/test_release_gate_local_read.py b/xrspatial/geotiff/tests/test_release_gate_local_read.py deleted file mode 100644 index 3a5b3522..00000000 --- a/xrspatial/geotiff/tests/test_release_gate_local_read.py +++ /dev/null @@ -1,177 +0,0 @@ -"""Release gate: local GeoTIFF read (epic #2340). - -This test pins the most basic promise the GeoTIFF module makes to a user: -``open_geotiff`` reads a local GeoTIFF and the result carries the pixels, -the CRS, the transform, and the nodata sentinel from the file. - -Why a dedicated release gate ----------------------------- -``reader.local_file`` is tagged ``stable`` in -:data:`xrspatial.geotiff.SUPPORTED_FEATURES`. Per epic #2340 every stable -feature needs a release-gate test that fails loudly if the contract -breaks, so a release engineer can run ``pytest -m release_gate`` and -know the next release does not silently regress a stable promise. 
- -This file is intentionally small. The surrounding test suite already -covers dtype variants, compression codecs, planar layouts, COG layouts, -fuzz cases, and golden-corpus parity. The release gate locks the single -contract a release note can quote without caveats: - -* Pixel bytes survive the read. -* ``attrs['crs']`` round-trips as the source EPSG. -* ``attrs['transform']`` is the 6-tuple GeoTransform the file carried. -* ``attrs['nodata']`` reflects the on-disk sentinel. - -Out of scope: alternative codecs (see ``test_release_gate_codecs.py``), -COG layouts (see ``test_release_gate_cog.py``), windowed reads (see -``test_release_gate_windowed_read.py``), and dask parity (see -``test_release_gate_dask_parity.py``). -""" -from __future__ import annotations - -import numpy as np -import pytest - -from xrspatial.geotiff import open_geotiff -from xrspatial.geotiff._geotags import GeoTransform -from xrspatial.geotiff._writer import write - - -# A tiny axis-aligned grid is enough to lock the contract. Using a -# distinctive pixel pattern (not a constant) means a single-axis drift -# in the writer or reader still fails the equality check. -_PIXELS = np.array( - [ - [10.0, 20.0, 30.0, 40.0], - [11.0, 21.0, 31.0, 41.0], - [12.0, 22.0, 32.0, 42.0], - [13.0, 23.0, 33.0, 43.0], - ], - dtype=np.float32, -) - -# Web Mercator (EPSG:3857) is a common real-world CRS. The transform -# uses positive pixel width and negative pixel height so the y axis -# decreases with row index, which is the convention every reader in -# this project assumes for axis-aligned grids. -_EPSG = 3857 -_ORIGIN_X = 500000.0 -_ORIGIN_Y = 4000000.0 -_PIXEL_W = 30.0 -_PIXEL_H = -30.0 -_EXPECTED_TRANSFORM = (_PIXEL_W, 0.0, _ORIGIN_X, 0.0, _PIXEL_H, _ORIGIN_Y) - - -def _write_known_good(path: str, *, nodata: float | None = None) -> None: - """Write a known-good GeoTIFF with an explicit GeoTransform. 
- - Uses the lower-level :func:`xrspatial.geotiff._writer.write` so the - transform is emitted from the explicit ``geo_transform`` argument - rather than derived from xarray coords. The release gate locks the - read side; the writer-side coord-to-transform derivation is covered - elsewhere. - """ - gt = GeoTransform( - origin_x=_ORIGIN_X, - origin_y=_ORIGIN_Y, - pixel_width=_PIXEL_W, - pixel_height=_PIXEL_H, - ) - write( - _PIXELS, - path, - geo_transform=gt, - crs_epsg=_EPSG, - nodata=nodata, - compression="none", - tiled=False, - ) - - -@pytest.mark.release_gate -def test_release_gate_local_read_pixels(tmp_path) -> None: - """Pixel bytes survive the read.""" - path = str(tmp_path / "release_gate_local_read_2340.tif") - _write_known_good(path) - - da = open_geotiff(path) - - assert da.dtype == np.float32, ( - f"release gate: local read promoted dtype to {da.dtype!r}; the " - "release contract is that float32 stays float32 unless a " - "nodata sentinel forces promotion" - ) - np.testing.assert_array_equal( - np.asarray(da.values), - _PIXELS, - err_msg=( - "release gate: local read returned different pixels than the " - "writer emitted; the byte-for-byte round trip is the most " - "basic promise the release notes make" - ), - ) - - -@pytest.mark.release_gate -def test_release_gate_local_read_crs(tmp_path) -> None: - """``attrs['crs']`` round-trips as the source EPSG.""" - path = str(tmp_path / "release_gate_local_read_crs_2340.tif") - _write_known_good(path) - - da = open_geotiff(path) - crs = da.attrs.get("crs") - assert crs is not None, ( - "release gate: local read dropped ``attrs['crs']``; the release " - "contract promises that an EPSG-coded source surfaces its CRS" - ) - assert int(crs) == _EPSG, ( - f"release gate: ``attrs['crs']`` drifted from {_EPSG} to " - f"{crs!r}; this changes the release notes contract for " - "``reader.local_file``" - ) - - -@pytest.mark.release_gate -def test_release_gate_local_read_transform(tmp_path) -> None: - 
"""``attrs['transform']`` is the 6-tuple GeoTransform the file carried.""" - path = str(tmp_path / "release_gate_local_read_transform_2340.tif") - _write_known_good(path) - - da = open_geotiff(path) - transform = da.attrs.get("transform") - assert transform is not None, ( - "release gate: local read dropped ``attrs['transform']``; the " - "release contract promises a 6-tuple GeoTransform on every " - "georeferenced read" - ) - assert len(transform) == 6, ( - f"release gate: transform tuple is no longer length 6: " - f"{transform!r}; release notes promise the rasterio-style 6-tuple" - ) - for got, want in zip(transform, _EXPECTED_TRANSFORM): - # Floats compared to float precision because the writer encodes - # the transform as doubles in the GeoTIFF tags. - assert got == pytest.approx(want, abs=1e-12, rel=1e-12), ( - f"release gate: transform tuple drifted: got {transform!r} " - f"want {_EXPECTED_TRANSFORM!r}" - ) - - -@pytest.mark.release_gate -def test_release_gate_local_read_nodata(tmp_path) -> None: - """``attrs['nodata']`` reflects the on-disk sentinel.""" - path = str(tmp_path / "release_gate_local_read_nodata_2340.tif") - sentinel = -9999.0 - _write_known_good(path, nodata=sentinel) - - da = open_geotiff(path) - nodata = da.attrs.get("nodata") - assert nodata is not None, ( - "release gate: declared nodata sentinel was dropped on read; " - "the release contract promises that a declared sentinel " - "surfaces in ``attrs['nodata']``" - ) - assert float(nodata) == pytest.approx(sentinel, abs=0.0), ( - f"release gate: ``attrs['nodata']`` drifted from {sentinel} to " - f"{nodata!r}" - ) diff --git a/xrspatial/geotiff/tests/test_release_gate_local_write.py b/xrspatial/geotiff/tests/test_release_gate_local_write.py deleted file mode 100644 index 95d00e0a..00000000 --- a/xrspatial/geotiff/tests/test_release_gate_local_write.py +++ /dev/null @@ -1,156 +0,0 @@ -"""Release gate: local GeoTIFF write (epic #2340). 
- -``writer.local_file`` is tagged ``stable`` in -:data:`xrspatial.geotiff.SUPPORTED_FEATURES`. The release contract is: -``to_geotiff`` writes a file that ``open_geotiff`` reads back bit-exact, -with the CRS, transform, and nodata sentinel preserved. - -This gate is small on purpose. The byte-equivalent pixel contract, -attrs canonicalisation, and dtype handling each have their own deep -test files (``test_round_trip_invariants.py``, -``test_attrs_contract_canonical_1984.py``, the matrix tests). The -release-gate test is the one-shot a release engineer can run to know -the most common public-API write -> read flow still works end-to-end. - -Out of scope here: -* Compression codec coverage -- see ``test_release_gate_codecs.py``. -* COG layout -- see ``test_release_gate_cog.py``. -* Detailed attrs canonicalisation -- see - ``test_release_gate_attrs_contract.py``. -""" -from __future__ import annotations - -import numpy as np -import pytest -import xarray as xr - -from xrspatial.geotiff import open_geotiff, to_geotiff - - -def _make_data_array(*, nodata: float | None = None) -> xr.DataArray: - """Build a small DataArray with explicit y/x coords. - - The release contract for ``to_geotiff`` is the public-API path: a - user passes a DataArray with coords, gets back a file whose - GeoTransform reproduces those coords. We keep the grid small (4x4) - so the gate is fast even when run alongside the full release-gate - suite. - """ - pixels = np.array( - [ - [1.0, 2.0, 3.0, 4.0], - [5.0, 6.0, 7.0, 8.0], - [9.0, 10.0, 11.0, 12.0], - [13.0, 14.0, 15.0, 16.0], - ], - dtype=np.float32, - ) - # Pixel-center y/x with width 30 m, origin (500000, 4000000), - # descending y. The writer turns these into a GeoTransform with - # origin at the top-left pixel corner. 
- y = np.array([3999985.0, 3999955.0, 3999925.0, 3999895.0]) - x = np.array([500015.0, 500045.0, 500075.0, 500105.0]) - attrs: dict = {"crs": 32610} - if nodata is not None: - attrs["nodata"] = nodata - return xr.DataArray( - pixels, - dims=("y", "x"), - coords={"y": y, "x": x}, - attrs=attrs, - ) - - -@pytest.mark.release_gate -def test_release_gate_local_write_round_trips_pixels(tmp_path) -> None: - """``to_geotiff`` writes a file that reads back bit-exact.""" - da = _make_data_array() - path = str(tmp_path / "release_gate_local_write_pixels_2340.tif") - to_geotiff(da, path, compression="none", tiled=False) - - out = open_geotiff(path) - assert out.dtype == np.float32, ( - f"release gate: write -> read flipped dtype to {out.dtype!r}; " - "the release contract promises float32 stays float32 absent a " - "nodata sentinel" - ) - np.testing.assert_array_equal( - np.asarray(out.values), - np.asarray(da.values), - err_msg=( - "release gate: write -> read changed pixel values; " - "to_geotiff is promised to be lossless for the default " - "'none' codec" - ), - ) - - -@pytest.mark.release_gate -def test_release_gate_local_write_preserves_crs(tmp_path) -> None: - """The CRS survives the write -> read round trip.""" - da = _make_data_array() - path = str(tmp_path / "release_gate_local_write_crs_2340.tif") - to_geotiff(da, path, compression="none", tiled=False) - - out = open_geotiff(path) - crs = out.attrs.get("crs") - assert crs is not None, ( - "release gate: write -> read dropped ``attrs['crs']``; the " - "release contract requires the CRS to survive" - ) - assert int(crs) == 32610, ( - f"release gate: ``attrs['crs']`` drifted from 32610 to {crs!r}" - ) - - -@pytest.mark.release_gate -def test_release_gate_local_write_preserves_transform(tmp_path) -> None: - """The GeoTransform survives the write -> read round trip.""" - da = _make_data_array() - path = str(tmp_path / "release_gate_local_write_transform_2340.tif") - to_geotiff(da, path, compression="none", tiled=False) 
- - out = open_geotiff(path) - transform = out.attrs.get("transform") - assert transform is not None, ( - "release gate: write -> read dropped ``attrs['transform']``; " - "the release contract requires the GeoTransform to survive" - ) - assert len(transform) == 6, ( - f"release gate: transform tuple is no longer length 6: " - f"{transform!r}" - ) - # Pixel width and pixel height must round-trip exactly; the origin - # is the top-left corner derived from pixel-center coords plus a - # half-pixel offset, so it is also a tight equality. - assert transform[0] == pytest.approx(30.0, abs=1e-9), ( - f"release gate: pixel_width drifted: {transform!r}" - ) - assert transform[4] == pytest.approx(-30.0, abs=1e-9), ( - f"release gate: pixel_height sign or magnitude drifted: " - f"{transform!r}" - ) - assert transform[1] == 0.0 and transform[3] == 0.0, ( - f"release gate: shear terms appeared in axis-aligned write: " - f"{transform!r}" - ) - - -@pytest.mark.release_gate -def test_release_gate_local_write_preserves_nodata(tmp_path) -> None: - """A declared nodata sentinel survives the write -> read round trip.""" - sentinel = -9999.0 - da = _make_data_array(nodata=sentinel) - path = str(tmp_path / "release_gate_local_write_nodata_2340.tif") - to_geotiff(da, path, compression="none", tiled=False, nodata=sentinel) - - out = open_geotiff(path) - nodata = out.attrs.get("nodata") - assert nodata is not None, ( - "release gate: declared nodata was dropped on write -> read; " - "the release contract promises the sentinel survives" - ) - assert float(nodata) == pytest.approx(sentinel, abs=0.0), ( - f"release gate: ``attrs['nodata']`` drifted from {sentinel} to " - f"{nodata!r}" - ) diff --git a/xrspatial/geotiff/tests/test_release_gate_negative_2341.py b/xrspatial/geotiff/tests/test_release_gate_negative_2341.py deleted file mode 100644 index 3a36c4e3..00000000 --- a/xrspatial/geotiff/tests/test_release_gate_negative_2341.py +++ /dev/null @@ -1,500 +0,0 @@ -"""Release-gate negative 
cases: ambiguous metadata fails closed (issue #2341 PR 5). - -Epic #2341's acceptance criteria include the promise that "unsupported -or ambiguous metadata fails loudly instead of flattening or guessing". -The positive release-gate suites in ``test_release_gate_*.py`` lock the -"works" path. This file pins the negative side: when metadata is -ambiguous and the caller did NOT opt in via the documented flag, every -promised read entry point raises a typed error whose message names the -unlocking flag and points at the release-contract docs. - -Four parametrized cases: - -1. Conflicting CRS between the GeoTIFF header and a sibling ``.aux.xml`` - PAM sidecar. The reader must not silently prefer one over the other. -2. Integer nodata sentinel that cannot be honoured on a float-promoted - raster. The reader must not silently mask with the wrong value. -3. Rotated affine transform without ``allow_rotated=True``. Eager, - dask, and windowed entry points must each raise uniformly. -4. Mixed-tier VRT children when the caller asked for stable-only - sources. The reader must name the offending child and the opt-in. - -Assertions inlined per case so a single failing row is locatable -without cross-file helpers. Sibling PRs of epic #2341 are running in -parallel; this file does NOT introduce a shared helper module. - -xfail policy ------------- -Cases 1, 2, and 4 are tagged ``strict=False`` xfail because the -production-side rejection is tracked under a separate epic or a -follow-up issue: - -* Case 1 -- ``.aux.xml`` PAM sidecar read is not a supported feature - today (no entry in :data:`xrspatial.geotiff.SUPPORTED_FEATURES`). - The release promise is that *if* the reader gains PAM sidecar - support, it must fail closed on a CRS conflict; the xfail flips - to a pass the moment that production fix lands. -* Case 2 -- the current behaviour on a non-finite or fractional - integer nodata sentinel is a no-op (issue #1774). 
The release - promise is to upgrade that no-op to a typed error so the caller - sees the silent-coercion risk; the xfail flips when the upgrade - lands. -* Case 4 -- the VRT stable-only knob is owned by epic #2342 and - has not landed. The xfail flips when #2342 ships the knob. - -Case 3 (rotated) actively passes today on every cited entry point. -""" -from __future__ import annotations - -import struct -import uuid -from pathlib import Path - -import numpy as np -import pytest - -from xrspatial.geotiff import open_geotiff, read_geotiff_dask -from xrspatial.geotiff._errors import ( - GeoTIFFAmbiguousMetadataError, - RotatedTransformError, -) -from xrspatial.geotiff.tests.conftest import requires_gpu - -# --------------------------------------------------------------------------- # -# Shared release-contract pointers. # -# --------------------------------------------------------------------------- # -# -# The release-contract doc lives next to the release-gate checklist. The -# negative-case error messages should mention either the contract page, -# the issue number, or the audit checklist so a caller hitting the error -# can find the promise the reader is enforcing. -_RELEASE_CONTRACT_HINTS = ( - "release_gate_geotiff", - "geotiff_release_contract", - "#2341", - "#1987", - "#2342", - "release contract", -) - - -def _msg_cites_release_contract(msg: str) -> bool: - """True iff *msg* names the release-contract docs or the epic. - - The error message contract is loose on purpose: any of the four - pointers above counts. We do not pin one exact docs path because - the docs file may move (``.rst`` vs ``.md``) without that being a - release-gate regression. - """ - return any(hint in msg for hint in _RELEASE_CONTRACT_HINTS) - - -def _tmp(tmp_path, label: str, *, suffix: str = ".tif") -> str: - """Return a unique temp file path scoped to this test file. - - Sibling PRs of epic #2341 run in parallel against the same shared - tmp dirs in CI. 
Including ``2341`` and a per-call UUID keeps file - names from colliding across worktrees and across parametrized - cases inside the same test. ``suffix`` lets callers pick ``.vrt`` - or another extension without doing fragile string replacement on - the returned path. - """ - return str( - tmp_path / f"release_gate_neg_2341_{label}_{uuid.uuid4().hex}{suffix}" - ) - - -# --------------------------------------------------------------------------- # -# Synthetic rotated GeoTIFF (case 3). # -# --------------------------------------------------------------------------- # -# -# Hand-rolled TIFF with a ModelTransformationTag carrying a non-zero -# rotation. Going via ``to_geotiff`` would not exercise the rotated -# branch -- the writer refuses rotated transforms at the boundary, so a -# round-trip through xrspatial cannot reproduce one. The 30-degree -# rotation matches ``test_allow_rotated_geotiff_2115.py`` so the gate -# rejects the same input shape that test pins behaviourally. -# -# Canonical copy of the rotated-matrix constants and the TIFF builder -# lives in ``test_allow_rotated_geotiff_2115.py``. The duplication here -# is intentional -- the four sibling PRs of epic #2341 share no helper -# module to avoid cross-PR symbol collisions. If the canonical copy -# drifts, mirror the change here in a follow-up PR. - -_TAG_MODEL_TRANSFORMATION = 34264 -_COS30 = 0.8660254037844387 -_SIN30 = 0.5 -_ROTATED_M = ( - 10.0 * _COS30, -10.0 * _SIN30, 0.0, 100.0, - 10.0 * _SIN30, 10.0 * _COS30, 0.0, 200.0, - 0.0, 0.0, 1.0, 0.0, - 0.0, 0.0, 0.0, 1.0, -) - - -def _write_rotated_tiff(path: str, arr: np.ndarray) -> None: - """Emit a minimal little-endian TIFF with a rotated ModelTransformationTag. - - Lifted from ``test_allow_rotated_geotiff_2115.py``. Inlined here so - the negative gate does not introduce a shared helper module across - the sibling PRs of epic #2341. 
- """ - h, w = arr.shape - arr = np.ascontiguousarray(arr.astype(' None: - """The reader must not silently choose between header CRS and sidecar CRS. - - The release promise: when a ``.aux.xml`` sidecar declares a CRS that - disagrees with the GeoTIFF header, the reader raises with a message - naming both sources and the opt-in flag that would resolve the - ambiguity. - """ - from xrspatial.geotiff._writer import write - from xrspatial.geotiff._geotags import GeoTransform - - path = _tmp(tmp_path, "case1_aux_xml_crs") - pixels = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32) - write( - pixels, - path, - geo_transform=GeoTransform( - origin_x=0.0, origin_y=0.0, - pixel_width=1.0, pixel_height=-1.0, - ), - crs_epsg=4326, - compression="none", - tiled=False, - ) - # PAM sidecar declaring a *different* CRS than the header (3857 vs 4326). - sidecar = Path(path + ".aux.xml") - sidecar.write_text( - '\n' - '\n' - ' EPSG:3857\n' - '\n', - encoding="utf-8", - ) - with pytest.raises(GeoTIFFAmbiguousMetadataError) as excinfo: - open_geotiff(path) - msg = str(excinfo.value) - assert "aux.xml" in msg or "sidecar" in msg or "PAM" in msg, ( - f"expected the error message to name the .aux.xml / PAM sidecar; " - f"got: {msg!r}" - ) - assert _msg_cites_release_contract(msg), ( - f"expected the error message to cite the release-contract docs " - f"or the tracking issue; got: {msg!r}" - ) - - -# --------------------------------------------------------------------------- # -# Case 2: integer nodata on a float-promoted raster. # -# --------------------------------------------------------------------------- # - - -def _build_uint16_tiff_with_nodata(nodata_str: str, path: str) -> None: - """Emit a 2x2 uint16 TIFF whose GDAL_NODATA tag holds ``nodata_str``. - - Mirrors ``_build_uint16_tiff`` in ``test_nodata_nan_int_1774.py`` so - this gate uses the same byte-level shape that file behaviourally - pins. Inlined to keep the file self-contained. 
- """ - bo = '<' - width, height = 2, 2 - pixels = np.array([[10, 20], [30, 40]], dtype=np.uint16) - - nodata_bytes = nodata_str.encode('ascii') + b'\x00' - - tag_list: list[tuple[int, int, int, bytes]] = [] - - def add_short(tag: int, val: int) -> None: - tag_list.append((tag, 3, 1, struct.pack(f'{bo}H', val))) - - def add_long(tag: int, val: int) -> None: - tag_list.append((tag, 4, 1, struct.pack(f'{bo}I', val))) - - def add_ascii(tag: int, data: bytes) -> None: - tag_list.append((tag, 2, len(data), data)) - - add_short(256, width) - add_short(257, height) - add_short(258, 16) - add_short(259, 1) - add_short(262, 1) - add_short(277, 1) - add_short(278, height) - add_short(339, 1) - add_long(273, 0) - add_long(279, len(pixels.tobytes())) - add_ascii(42113, nodata_bytes) # GDAL_NODATA - - tag_list.sort(key=lambda t: t[0]) - - header_size = 8 - num_entries = len(tag_list) - ifd_size = 2 + 12 * num_entries + 4 - ifd_off = header_size - - overflow = bytearray() - overflow_start = header_size + ifd_size - - overflow_offsets: dict[int, int | None] = {} - for tag, _typ, _count, raw in tag_list: - if len(raw) > 4: - overflow_offsets[tag] = len(overflow) - overflow.extend(raw) - if len(overflow) % 2: - overflow.append(0) - else: - overflow_offsets[tag] = None - - pixel_start = overflow_start + len(overflow) - - patched: list[tuple[int, int, int, bytes]] = [] - for tag, typ, count, raw in tag_list: - if tag == 273: - patched.append( - (tag, typ, count, struct.pack(f'{bo}I', pixel_start)) - ) - else: - patched.append((tag, typ, count, raw)) - tag_list = patched - - out = bytearray() - out.extend(b'II') - out.extend(struct.pack(f'{bo}H', 42)) - out.extend(struct.pack(f'{bo}I', ifd_off)) - out.extend(struct.pack(f'{bo}H', num_entries)) - for tag, typ, count, raw in tag_list: - out.extend(struct.pack(f'{bo}HHI', tag, typ, count)) - if len(raw) <= 4: - out.extend(raw.ljust(4, b'\x00')) - else: - ptr = overflow_start + overflow_offsets[tag] - out.extend(struct.pack(f'{bo}I', 
ptr)) - out.extend(struct.pack(f'{bo}I', 0)) - out.extend(overflow) - out.extend(pixels.tobytes()) - - with open(path, 'wb') as f: - f.write(out) - - -@pytest.mark.xfail( - reason=( - "Issue #1774 currently treats a non-finite or fractional " - "integer nodata sentinel as a silent no-op rather than a hard " - "error. The release promise is to upgrade the no-op to a " - "typed rejection so the caller sees the silent-coercion risk; " - "this xfail flips to a pass when the upgrade lands." - ), - strict=False, -) -def test_release_gate_negative_integer_nodata_float_promoted(tmp_path) -> None: - """The reader must not silently coerce a non-finite int-file nodata sentinel. - - A uint16 file with ``GDAL_NODATA="nan"`` would otherwise be - masked with the wrong sentinel (or silently ignored). The release - promise: raise with a message naming the unlocking opt-in. - """ - path = _tmp(tmp_path, "case2_int_nodata_float_promoted") - _build_uint16_tiff_with_nodata("nan", path) - with pytest.raises(GeoTIFFAmbiguousMetadataError) as excinfo: - open_geotiff(path) - msg = str(excinfo.value) - assert "nodata" in msg.lower(), ( - f"expected the error message to name nodata; got: {msg!r}" - ) - assert _msg_cites_release_contract(msg), ( - f"expected the error message to cite the release-contract docs " - f"or the tracking issue; got: {msg!r}" - ) - - -# --------------------------------------------------------------------------- # -# Case 3: rotated transform without ``allow_rotated=True``. # -# --------------------------------------------------------------------------- # -# -# Three sub-cases parametrized over the eager, dask, and windowed entry -# points -- the release promise is that each path raises the same typed -# error with a message that names ``allow_rotated``. 
- - -_ROTATED_PIXELS = np.arange(20, dtype=' str: - """A throwaway rotated GeoTIFF that case 3's three sub-tests share.""" - path = _tmp(tmp_path, "case3_rotated") - _write_rotated_tiff(path, _ROTATED_PIXELS) - return path - - -def _assert_rotated_message(msg: str) -> None: - """Shared assertions on the rotated error message. - - Inlined rather than promoted to a shared helper module so a - parallel sibling PR cannot accidentally rebind the function. - """ - assert "allow_rotated" in msg, ( - f"expected the error message to name the ``allow_rotated`` " - f"opt-in; got: {msg!r}" - ) - # ``reader.allow_rotated`` is tagged ``experimental`` in - # SUPPORTED_FEATURES (see ``_attrs.py``). The check accepts any of - # the three promised tier strings so a future promotion or - # demotion in the same PR as the message edit does not break the - # gate; if the tier moves, update the message text in the same PR - # that moves the row in ``release_gate_geotiff.rst``. - assert any(tier in msg for tier in ("advanced", "experimental", "stable")), ( - f"expected the error message to name the feature tier; " - f"got: {msg!r}" - ) - assert _msg_cites_release_contract(msg), ( - f"expected the error message to cite the release-contract docs " - f"or the tracking issue; got: {msg!r}" - ) - - -def test_release_gate_negative_rotated_eager(rotated_geotiff_path) -> None: - """Eager numpy path raises ``RotatedTransformError`` without the opt-in.""" - with pytest.raises(RotatedTransformError) as excinfo: - open_geotiff(rotated_geotiff_path) - _assert_rotated_message(str(excinfo.value)) - - -def test_release_gate_negative_rotated_dask(rotated_geotiff_path) -> None: - """Dask path raises the same typed error, uniformly with the eager path.""" - with pytest.raises(RotatedTransformError) as excinfo: - read_geotiff_dask(rotated_geotiff_path, chunks=2) - _assert_rotated_message(str(excinfo.value)) - - -def test_release_gate_negative_rotated_windowed(rotated_geotiff_path) -> None: - """Windowed read 
raises the same typed error before pixel decode.""" - with pytest.raises(RotatedTransformError) as excinfo: - open_geotiff(rotated_geotiff_path, window=(0, 0, 2, 2)) - _assert_rotated_message(str(excinfo.value)) - - -@requires_gpu -def test_release_gate_negative_rotated_gpu(rotated_geotiff_path) -> None: - """GPU read raises the same typed error as the CPU paths. - - ``reader.gpu`` is the ``experimental`` tier in - :data:`xrspatial.geotiff.SUPPORTED_FEATURES`. The release promise - is loose for GPU (behaviour can change without a deprecation - window) but the rotated-transform refusal is upstream of the - GPU decode path -- the validator fires on the header read, before - any pixel buffer reaches the GPU -- so the same typed error - surfaces here regardless of the GPU tier. - """ - with pytest.raises(RotatedTransformError) as excinfo: - open_geotiff(rotated_geotiff_path, gpu=True) - _assert_rotated_message(str(excinfo.value)) - - -# --------------------------------------------------------------------------- # -# Case 4: mixed-tier VRT children when stable-only is requested. # -# --------------------------------------------------------------------------- # - - -@pytest.mark.xfail( - reason=( - "The VRT stable-only knob is owned by epic #2342 and has not " - "landed yet. The release promise: when the caller asks for " - "stable-only sources and a VRT child uses an experimental " - "codec, the reader names the offending child and the opt-in " - "flag. This xfail flips to a pass when #2342 ships the knob." - ), - strict=False, -) -def test_release_gate_negative_mixed_tier_vrt_children(tmp_path) -> None: - """The reader must refuse mixed-tier VRT children when stable-only is asked. - - The release promise: ``open_geotiff(vrt, stable_only=True)`` - rejects a VRT whose child uses an experimental-tier codec, names - the offending child, and names the opt-in - (``allow_experimental_codecs=True``) plus the feature tier. 
- - XFAIL-to-PASS transition note - ----------------------------- - Today this test fails with ``TypeError: unexpected keyword - argument 'stable_only'`` because epic #2342 has not landed the - kwarg yet. The strict=False xfail swallows that TypeError. - When #2342 lands, the test will start raising - :class:`GeoTIFFAmbiguousMetadataError` (or fail to raise) and - the xfail will report XPASS. Before removing the xfail marker, - confirm the new code path satisfies both inline assertions: the - error message must mention either ``stable_only`` or - ``allow_experimental_codecs``, and it must cite the release - contract docs. If either assertion would not pass, fix the - production message in the same PR that removes the xfail. - """ - path = _tmp(tmp_path, "case4_mixed_tier_vrt", suffix=".vrt") - Path(path).write_text( - '\n', - encoding="utf-8", - ) - with pytest.raises(GeoTIFFAmbiguousMetadataError) as excinfo: - open_geotiff(path, stable_only=True) # type: ignore[call-arg] - msg = str(excinfo.value) - assert "stable_only" in msg or "allow_experimental_codecs" in msg, ( - f"expected the error message to name the unlocking opt-in; " - f"got: {msg!r}" - ) - assert _msg_cites_release_contract(msg), ( - f"expected the error message to cite the release-contract docs " - f"or the tracking issue; got: {msg!r}" - ) diff --git a/xrspatial/geotiff/tests/test_release_gate_overview_sidecar_metadata_2341.py b/xrspatial/geotiff/tests/test_release_gate_overview_sidecar_metadata_2341.py deleted file mode 100644 index e0e5398c..00000000 --- a/xrspatial/geotiff/tests/test_release_gate_overview_sidecar_metadata_2341.py +++ /dev/null @@ -1,468 +0,0 @@ -"""Release-gate: overview / sidecar metadata survival (epic #2341, PR 3 of 5). - -Epic #2341 flags "overview reads lose CRS/transform/nodata metadata" as a -priority risk. 
The existing overview tests under -``xrspatial/geotiff/tests/`` (``test_cog_overview_nodata_1613.py``, -``test_dask_overview_level.py``, ``test_cog_cubic_overview_nodata_1623.py``, -etc.) assert pixel correctness or specific nodata behaviour on the overview -output. They do not pin full metadata survival on every overview level, and -they do not pin parity between internal COG overviews and external -``.ovr`` sidecars. - -This module pins that contract on both read paths (eager and dask): - -* For an internal-overview COG with base + overview levels at factors 2 and - 4, the per-level ``attrs`` agree on ``crs``, ``crs_wkt``, - ``georef_status``, ``raster_type``, ``nodata``, and ``masked_nodata``. - The ``transform`` field scales pixel size by the level factor while - keeping the origin fixed. - -* For a file whose overviews live in an external ``.ovr`` sidecar at the - same factors, the same metadata-survival contract holds. - -Both fixtures are constructed in-test so the test exercises the writer -path that produces them. If the writer regresses, this gate breaks too -- -that is the intended coupling for a release gate. - -Out of scope (covered elsewhere): pixel correctness of the resampling -kernels (``test_cog_overview_nodata_1613.py``, -``test_cog_cubic_overview_nodata_1623.py``); VRT mosaics (epic #2342). -""" -from __future__ import annotations - -import os -import uuid - -import numpy as np -import pytest -import xarray as xr - -# ``rasterio`` is the only writer that emits a ``.tif.ovr`` sidecar through -# its ``TIFF_USE_OVR`` env hint. The internal-overview path uses -# ``to_geotiff(cog=True, overview_levels=...)`` so it does not depend on -# rasterio. 
-rasterio = pytest.importorskip("rasterio") -pytest.importorskip("dask.array") - -from rasterio.enums import Resampling # noqa: E402 -from xrspatial.geotiff import ( # noqa: E402 - open_geotiff, - read_geotiff_dask, - to_geotiff, -) - - -# --------------------------------------------------------------------------- -# Constants for the in-test fixtures. -# --------------------------------------------------------------------------- - -# Base raster is 64x64 so factors of 2 and 4 give clean 32x32 and 16x16 -# overviews. -_BASE_SIZE = 64 -_OVERVIEW_FACTORS = (2, 4) -# Pixel size of 1 in projected units; origin at (-120, 45) so the test -# also catches a regression where the origin is silently rewritten as the -# overview is read. -_BASE_TRANSFORM = (1.0, 0.0, -120.0, 0.0, -1.0, 45.0) -_BASE_CRS = 4326 -_NODATA = -9999.0 - -# Metadata keys the release-gate contract requires equal across all -# overview levels. Any drift on these keys means a downstream caller that -# branches on them will see a different file at the overview level than -# at the base. -_EQUAL_KEYS = ( - "crs", - "crs_wkt", - "georef_status", - "raster_type", - "nodata", - "masked_nodata", -) - - -def _make_raster() -> xr.DataArray: - """Return a 64x64 float32 DataArray with one NaN cell. - - The NaN is rewritten to the sentinel on write, exercising the - ``masked_nodata`` lifecycle so the readback restores NaN and stamps - ``masked_nodata=True`` on every level. - """ - arr = np.arange(_BASE_SIZE * _BASE_SIZE, - dtype=np.float32).reshape(_BASE_SIZE, _BASE_SIZE) - arr[0, 0] = np.nan - return xr.DataArray( - arr, - dims=("y", "x"), - attrs={"transform": _BASE_TRANSFORM, "crs": _BASE_CRS}, - ) - - -def _unique_tmp_path(tmp_path, label: str) -> str: - """Return a unique path inside ``tmp_path`` tagged with the issue number. - - Parallel sibling agents share the pytest tmp root in CI worker - layouts; include the issue number plus a uuid so a stray collision - does not cross-pollute between PRs of epic #2341. 
- """ - return str(tmp_path / f"release_gate_2341_{label}_{uuid.uuid4().hex}.tif") - - -def _write_internal_overview_cog(path: str) -> None: - """Write a COG with base + internal overviews at factors 2 and 4. - - Asserts the writer actually emitted the requested overview IFDs. - Without this guard, a regression where ``to_geotiff(cog=True, - overview_levels=[2, 4])`` silently drops the overview chain would - only surface downstream as a shape-mismatch in the reader, far - from the writer call that caused it. - """ - da = _make_raster() - to_geotiff( - da, path, - nodata=_NODATA, - cog=True, - compression="deflate", - tiled=True, - tile_size=16, - overview_levels=list(_OVERVIEW_FACTORS), - overview_resampling="nearest", - ) - with rasterio.open(path) as ds: - assert ds.overviews(1) == list(_OVERVIEW_FACTORS), ( - f"writer did not emit the requested overview IFDs: " - f"got {ds.overviews(1)}, expected {list(_OVERVIEW_FACTORS)}" - ) - - -def _write_external_sidecar(path: str) -> None: - """Write a tiled TIFF + ``.ovr`` sidecar at factors 2 and 4. - - The xrspatial writer does not emit external sidecars; build the - sidecar by reopening the tiled base file with ``TIFF_USE_OVR=YES`` so - GDAL routes overview IFDs into ``.ovr`` instead of appending - them to the base file. This is the same path - ``golden_corpus/generate.py`` uses for the bundled - ``overview_external_ovr_uint16`` fixture. - """ - da = _make_raster() - to_geotiff( - da, path, - nodata=_NODATA, - tiled=True, - tile_size=16, - ) - # Sanity: the base must not already carry internal overviews. 
- with rasterio.open(path) as ds: - assert ds.overviews(1) == [], ( - "base file must have no internal overviews before sidecar build" - ) - - with rasterio.Env(TIFF_USE_OVR="YES", COMPRESS_OVERVIEW="DEFLATE"): - with rasterio.open(path, "r+") as ds: - ds.build_overviews(list(_OVERVIEW_FACTORS), Resampling.nearest) - assert os.path.exists(path + ".ovr"), ( - "TIFF_USE_OVR=YES must produce a .ovr sidecar next to the base file" - ) - - -# --------------------------------------------------------------------------- -# Inlined assertion helpers. The release-gate epic explicitly forbids a -# shared helper module between PRs of #2341 (four sibling agents work in -# parallel); keep these private to this module. -# --------------------------------------------------------------------------- - -def _assert_metadata_equal_across_levels(attrs_by_level: dict) -> None: - """Assert every key in ``_EQUAL_KEYS`` agrees across overview levels. - - Absent-on-every-level is also fine (the contract is "equal", not - "present"). The reader does not stamp ``raster_type`` for the default - RasterPixelIsArea, so this branch covers the common case where the - key is absent on every level. - - Float / int equality on ``nodata`` is intentional: a downstream - caller comparing ``attrs['nodata']`` to a sentinel uses ``==``, which - treats ``-9999.0`` and ``-9999`` as equal. The release contract is - "equality under ``==``", not "identical Python type". 
- """ - base = attrs_by_level[0] - for key in _EQUAL_KEYS: - base_present = key in base - base_val = base.get(key) - for lvl, attrs in attrs_by_level.items(): - if lvl == 0: - continue - other_present = key in attrs - other_val = attrs.get(key) - assert other_present == base_present, ( - f"attrs[{key!r}] presence drifts: base={base_present}, " - f"level={lvl}: {other_present}" - ) - if base_present: - assert other_val == base_val, ( - f"attrs[{key!r}] differs across levels: " - f"base={base_val!r} level={lvl}: {other_val!r}" - ) - - -def _assert_transform_scales(attrs_by_level: dict, factors: dict) -> None: - """Assert ``transform`` scales by ``factors[level]`` with the origin held. - - ``factors`` maps overview-level index -> decimation factor (the base - level passes a factor of 1). Pixel sizes (``a`` and ``e``) multiply - by the factor; the origin (``c`` and ``f``) is held fixed; the - rotation terms (``b`` and ``d``) stay zero for the axis-aligned - fixtures this test builds. - """ - base = attrs_by_level[0]["transform"] - base_a, base_b, base_c, base_d, base_e, base_f = base - for lvl, attrs in attrs_by_level.items(): - factor = factors[lvl] - t = attrs["transform"] - a, b, c, d, e, f = t - assert a == pytest.approx(base_a * factor), ( - f"level {lvl}: pixel width did not scale by {factor}: " - f"got {a}, expected {base_a * factor}" - ) - assert e == pytest.approx(base_e * factor), ( - f"level {lvl}: pixel height did not scale by {factor}: " - f"got {e}, expected {base_e * factor}" - ) - assert c == pytest.approx(base_c), ( - f"level {lvl}: origin x drifted: got {c}, expected {base_c}" - ) - assert f == pytest.approx(base_f), ( - f"level {lvl}: origin y drifted: got {f}, expected {base_f}" - ) - assert b == pytest.approx(0.0) and d == pytest.approx(0.0), ( - f"level {lvl}: axis-aligned transform must not gain rotation " - f"terms, got b={b}, d={d}" - ) - - -def _read_levels_eager(path: str) -> dict: - """Read base + each overview level via ``open_geotiff``. 
- - Returns a dict keyed by level index (0 = base, 1 = first overview, - ...) where the value is the read ``DataArray``. - """ - out = {0: open_geotiff(path)} - for i, _ in enumerate(_OVERVIEW_FACTORS, start=1): - out[i] = open_geotiff(path, overview_level=i) - return out - - -def _read_levels_dask(path: str) -> dict: - """Read base + each overview level via ``read_geotiff_dask``. - - The chunk size is intentionally small (8) so per-chunk reads - cover at least one chunk boundary at every level; a regression where - the dask graph drops attrs on assembly would surface here. - """ - out = {0: read_geotiff_dask(path, chunks=8)} - for i, _ in enumerate(_OVERVIEW_FACTORS, start=1): - out[i] = read_geotiff_dask(path, chunks=8, overview_level=i) - return out - - -def _factors_by_level() -> dict: - """Map level index (0=base, 1, 2, ...) to its decimation factor.""" - factors = {0: 1} - for i, f in enumerate(_OVERVIEW_FACTORS, start=1): - factors[i] = f - return factors - - -# --------------------------------------------------------------------------- -# Internal COG overviews. -# --------------------------------------------------------------------------- - -@pytest.mark.parametrize("reader", ["eager", "dask"]) -def test_cog_internal_overview_metadata_survives(tmp_path, reader): - """COG with internal overviews preserves the metadata contract. - - Asserts that ``crs``, ``crs_wkt``, ``georef_status``, ``raster_type``, - ``nodata``, and ``masked_nodata`` agree across base + every overview - level for both the eager and dask read paths. 
- """ - path = _unique_tmp_path(tmp_path, f"cog_meta_{reader}") - _write_internal_overview_cog(path) - - if reader == "eager": - levels = _read_levels_eager(path) - else: - levels = _read_levels_dask(path) - - attrs_by_level = {lvl: da.attrs for lvl, da in levels.items()} - _assert_metadata_equal_across_levels(attrs_by_level) - - # Sanity: the contract requires these keys to actually be present on - # the base (otherwise "equal across levels" trivially holds with - # everything absent). The constructed fixture carries CRS + nodata, - # so the read must surface them. - base = attrs_by_level[0] - assert base.get("crs") == _BASE_CRS - assert base.get("crs_wkt"), "crs_wkt must be set on a CRS-carrying COG" - assert base.get("nodata") == _NODATA - assert base.get("masked_nodata") is True - assert base.get("georef_status") == "full" - - -@pytest.mark.parametrize("reader", ["eager", "dask"]) -def test_cog_internal_overview_transform_scales(tmp_path, reader): - """COG with internal overviews preserves transform origin and scales pixel size.""" - path = _unique_tmp_path(tmp_path, f"cog_xform_{reader}") - _write_internal_overview_cog(path) - - if reader == "eager": - levels = _read_levels_eager(path) - else: - levels = _read_levels_dask(path) - - attrs_by_level = {lvl: da.attrs for lvl, da in levels.items()} - _assert_transform_scales(attrs_by_level, _factors_by_level()) - - -def test_cog_internal_overview_shape_matches_factors(tmp_path): - """Smoke: shapes follow the decimation factors so the test exercises real overview IFDs. - - Catches a regression where the reader silently returns the base - image when asked for an overview level. Independent of the metadata - contract: shape comes from the IFD, not ``attrs``. 
- """ - path = _unique_tmp_path(tmp_path, "cog_shape") - _write_internal_overview_cog(path) - - base = open_geotiff(path) - assert base.shape == (_BASE_SIZE, _BASE_SIZE) - for i, factor in enumerate(_OVERVIEW_FACTORS, start=1): - da = open_geotiff(path, overview_level=i) - expected = _BASE_SIZE // factor - assert da.shape == (expected, expected), ( - f"overview_level={i} returned shape {da.shape}, " - f"expected ({expected}, {expected})" - ) - - -# --------------------------------------------------------------------------- -# External `.ovr` sidecar. -# --------------------------------------------------------------------------- - -@pytest.mark.parametrize("reader", ["eager", "dask"]) -def test_sidecar_overview_metadata_survives(tmp_path, reader): - """External `.ovr` sidecar preserves the metadata contract. - - Same contract as the internal-overview test: CRS, georef status, - nodata pair, raster_type all agree across base + sidecar overview - levels. - """ - path = _unique_tmp_path(tmp_path, f"sidecar_meta_{reader}") - _write_external_sidecar(path) - - if reader == "eager": - levels = _read_levels_eager(path) - else: - levels = _read_levels_dask(path) - - attrs_by_level = {lvl: da.attrs for lvl, da in levels.items()} - _assert_metadata_equal_across_levels(attrs_by_level) - - # Same presence sanity check as the COG test: the fixture is built - # with CRS + nodata so the gate covers a real read, not an - # everything-absent vacuum. 
- base = attrs_by_level[0] - assert base.get("crs") == _BASE_CRS - assert base.get("crs_wkt"), ( - "crs_wkt must be set when the base file carries an EPSG code" - ) - assert base.get("nodata") == _NODATA - assert base.get("masked_nodata") is True - assert base.get("georef_status") == "full" - - -@pytest.mark.parametrize("reader", ["eager", "dask"]) -def test_sidecar_overview_transform_scales(tmp_path, reader): - """External `.ovr` sidecar scales pixel size by 2 per level, origin held.""" - path = _unique_tmp_path(tmp_path, f"sidecar_xform_{reader}") - _write_external_sidecar(path) - - if reader == "eager": - levels = _read_levels_eager(path) - else: - levels = _read_levels_dask(path) - - attrs_by_level = {lvl: da.attrs for lvl, da in levels.items()} - _assert_transform_scales(attrs_by_level, _factors_by_level()) - - -def test_sidecar_overview_shape_matches_factors(tmp_path): - """Smoke: sidecar reads return the right shape per level. - - Mirrors the COG smoke test so a "reader silently returns base - pixels for any overview level" regression also surfaces here, on - the external-sidecar path. - """ - path = _unique_tmp_path(tmp_path, "sidecar_shape") - _write_external_sidecar(path) - - base = open_geotiff(path) - assert base.shape == (_BASE_SIZE, _BASE_SIZE) - for i, factor in enumerate(_OVERVIEW_FACTORS, start=1): - da = open_geotiff(path, overview_level=i) - expected = _BASE_SIZE // factor - assert da.shape == (expected, expected), ( - f"sidecar overview_level={i} returned shape {da.shape}, " - f"expected ({expected}, {expected})" - ) - - -# --------------------------------------------------------------------------- -# Cross-source parity: internal COG vs external sidecar at matching factors. -# --------------------------------------------------------------------------- - -@pytest.mark.parametrize("reader", ["eager", "dask"]) -def test_internal_vs_sidecar_metadata_agree(tmp_path, reader): - """Internal COG and external sidecar agree on the metadata contract. 
- - The release-gate epic specifically calls out parity between the two - paths -- if a downstream caller switches a deployment from inline - overviews to a sidecar (or vice versa), the read contract must not - change. The two fixtures share base raster, CRS, transform, and - nodata, so the per-level attrs must agree key-by-key. - """ - cog_path = _unique_tmp_path(tmp_path, f"parity_cog_{reader}") - sidecar_path = _unique_tmp_path(tmp_path, f"parity_sidecar_{reader}") - _write_internal_overview_cog(cog_path) - _write_external_sidecar(sidecar_path) - - if reader == "eager": - cog_levels = _read_levels_eager(cog_path) - sidecar_levels = _read_levels_eager(sidecar_path) - else: - cog_levels = _read_levels_dask(cog_path) - sidecar_levels = _read_levels_dask(sidecar_path) - - assert set(cog_levels) == set(sidecar_levels) - for lvl in cog_levels: - for key in _EQUAL_KEYS: - cog_attrs = cog_levels[lvl].attrs - sidecar_attrs = sidecar_levels[lvl].attrs - assert (key in cog_attrs) == (key in sidecar_attrs), ( - f"level {lvl}: attrs[{key!r}] presence differs between " - f"internal-COG and sidecar reads " - f"(cog={key in cog_attrs}, sidecar={key in sidecar_attrs})" - ) - if key in cog_attrs: - assert cog_attrs[key] == sidecar_attrs[key], ( - f"level {lvl}: attrs[{key!r}] differs between " - f"internal-COG and sidecar reads: " - f"cog={cog_attrs[key]!r}, sidecar={sidecar_attrs[key]!r}" - ) - # Transform parity at every level: the two fixtures use the - # same base transform, so every level's transform must match. 
- assert cog_levels[lvl].attrs["transform"] == pytest.approx( - sidecar_levels[lvl].attrs["transform"] - ), ( - f"level {lvl}: transform differs between internal-COG and " - f"sidecar reads" - ) diff --git a/xrspatial/geotiff/tests/test_release_gate_windowed_read.py b/xrspatial/geotiff/tests/test_release_gate_windowed_read.py deleted file mode 100644 index c3b69231..00000000 --- a/xrspatial/geotiff/tests/test_release_gate_windowed_read.py +++ /dev/null @@ -1,162 +0,0 @@ -"""Release gate: windowed reads (epic #2340). - -``open_geotiff(path, window=...)`` is part of the stable surface. The -release contract: - -* A ``(row_start, col_start, row_stop, col_stop)`` window returns the - exact subset of the source pixels. -* The result keeps ``attrs['crs']`` and produces a transform whose - origin shifts to the window's top-left pixel corner. -* Reading the full extent via ``window=(0, 0, H, W)`` matches an - unwindowed read. - -Out of bounds and degenerate windows are covered by -``test_window_out_of_bounds_1634.py``; the release-gate test only -locks the supported, in-bounds use case so a release engineer knows -the user-facing API behaves end to end. -""" -from __future__ import annotations - -import numpy as np -import pytest - -from xrspatial.geotiff import open_geotiff -from xrspatial.geotiff._geotags import GeoTransform -from xrspatial.geotiff._writer import write - - -_H = 10 -_W = 10 -# A distinctive per-pixel value (row * 100 + col) means any row / col -# stride confusion in the windowed path fails the equality check. 
-_PIXELS = ( - np.arange(_H, dtype=np.int32).reshape(-1, 1) * 100 - + np.arange(_W, dtype=np.int32).reshape(1, -1) -).astype(np.int32) -_ORIGIN_X = 500000.0 -_ORIGIN_Y = 4000000.0 -_PIXEL_W = 30.0 -_PIXEL_H = -30.0 - - -def _write_known_good(path: str) -> None: - gt = GeoTransform( - origin_x=_ORIGIN_X, - origin_y=_ORIGIN_Y, - pixel_width=_PIXEL_W, - pixel_height=_PIXEL_H, - ) - write( - _PIXELS, - path, - geo_transform=gt, - crs_epsg=32610, - compression="none", - tiled=False, - ) - - -@pytest.mark.release_gate -def test_release_gate_windowed_read_returns_subset(tmp_path) -> None: - """A windowed read returns exactly the requested subset.""" - path = str(tmp_path / "release_gate_windowed_read_subset_2340.tif") - _write_known_good(path) - - # Take an interior 4x5 window so the test fails if the window - # logic confuses row- and column-order. - row_start, col_start = 2, 3 - row_stop, col_stop = 6, 8 - out = open_geotiff(path, window=(row_start, col_start, row_stop, col_stop)) - - expected = _PIXELS[row_start:row_stop, col_start:col_stop] - assert out.shape == expected.shape, ( - f"release gate: windowed read shape {out.shape} does not match " - f"the requested window shape {expected.shape}" - ) - np.testing.assert_array_equal( - np.asarray(out.values), - expected, - err_msg=( - "release gate: windowed read returned different pixels than " - "the same rows / cols of the source array; this would silently " - "break every downstream caller that relies on window= for " - "subsetting" - ), - ) - - -@pytest.mark.release_gate -def test_release_gate_windowed_read_preserves_crs(tmp_path) -> None: - """A windowed read carries ``attrs['crs']`` over from the source.""" - path = str(tmp_path / "release_gate_windowed_read_crs_2340.tif") - _write_known_good(path) - - out = open_geotiff(path, window=(1, 1, 5, 5)) - crs = out.attrs.get("crs") - assert crs is not None and int(crs) == 32610, ( - f"release gate: windowed read dropped or drifted " - f"``attrs['crs']``: got {crs!r}" - ) 
- - -@pytest.mark.release_gate -def test_release_gate_windowed_read_shifts_transform_origin(tmp_path) -> None: - """The transform origin shifts to the window's top-left pixel. - - Concretely: for a window starting at ``(row, col) = (2, 3)`` on a - grid with pixel width ``+30`` and pixel height ``-30``, the new - origin is ``(origin_x + 3 * 30, origin_y + 2 * -30)``. - """ - path = str(tmp_path / "release_gate_windowed_read_transform_2340.tif") - _write_known_good(path) - - row_start, col_start = 2, 3 - out = open_geotiff(path, window=(row_start, col_start, 6, 8)) - transform = out.attrs.get("transform") - assert transform is not None and len(transform) == 6, ( - f"release gate: windowed read dropped or reshaped transform: " - f"{transform!r}" - ) - # Pixel size must not change. - assert transform[0] == pytest.approx(_PIXEL_W, abs=1e-9), ( - f"release gate: windowed read changed pixel_width: {transform!r}" - ) - assert transform[4] == pytest.approx(_PIXEL_H, abs=1e-9), ( - f"release gate: windowed read changed pixel_height: {transform!r}" - ) - expected_origin_x = _ORIGIN_X + col_start * _PIXEL_W - expected_origin_y = _ORIGIN_Y + row_start * _PIXEL_H - assert transform[2] == pytest.approx(expected_origin_x, abs=1e-9), ( - f"release gate: windowed read origin_x did not shift to the " - f"window's left edge: got {transform[2]!r} expected " - f"{expected_origin_x!r}" - ) - assert transform[5] == pytest.approx(expected_origin_y, abs=1e-9), ( - f"release gate: windowed read origin_y did not shift to the " - f"window's top edge: got {transform[5]!r} expected " - f"{expected_origin_y!r}" - ) - - -@pytest.mark.release_gate -def test_release_gate_windowed_read_full_extent_matches_unwindowed( - tmp_path, -) -> None: - """``window=(0, 0, H, W)`` returns the same pixels as no window.""" - path = str(tmp_path / "release_gate_windowed_read_full_2340.tif") - _write_known_good(path) - - full = open_geotiff(path) - windowed = open_geotiff(path, window=(0, 0, _H, _W)) - assert 
windowed.shape == full.shape, ( - f"release gate: full-extent window shape drift: " - f"{windowed.shape} vs {full.shape}" - ) - np.testing.assert_array_equal( - np.asarray(windowed.values), - np.asarray(full.values), - err_msg=( - "release gate: full-extent window returned different pixels " - "than the unwindowed read" - ), - ) diff --git a/xrspatial/geotiff/tests/test_release_gate_windowed_reads_2341.py b/xrspatial/geotiff/tests/test_release_gate_windowed_reads_2341.py deleted file mode 100644 index d8d4fb0c..00000000 --- a/xrspatial/geotiff/tests/test_release_gate_windowed_reads_2341.py +++ /dev/null @@ -1,473 +0,0 @@ -"""Release gate: windowed-read coords + shifted-transform parity (epic #2341). - -Epic #2341 names "windowed reads return unshifted transforms" as a -priority risk: a window that returns the file's full-extent transform -but a small array is a footgun that downstream spatial functions silently -trust. This file is the single release-gate citation for that risk. - -For each file in a small representative corpus (integer dtype with a -nodata sentinel, float dtype with NaN nodata, float dtype without a -sentinel, uint8 MinIsWhite stripped without GeoTIFF tags) and for -both eager (``open_geotiff(..., window=...)``) and dask -(``read_geotiff_dask(..., window=...)``) read paths the test asserts: - -* the returned shape equals the window's ``(height, width)``; -* ``coords['y']`` / ``coords['x']`` equal the matching slice of the - full-file coords (bit-exact); -* ``attrs['transform']`` equals - ``T_full * Affine.translation(window.col_off, window.row_off)`` - computed by hand from ``T_full``, with no float drift; -* the canonical non-transform release attrs (``crs``, ``crs_wkt``, - ``nodata``, ``masked_nodata``, ``georef_status``, ``raster_type``) - match the unwindowed read. 
- -Out of scope here (sibling PRs of epic #2341): overview / sidecar -metadata survival (PR 3 / #2359), stable-codec read/write/read -round-trip (PR 4 / #2360), negative tests for ambiguous metadata -(PR 5 / #2361). - -Assertions are inlined per-file inside this module so the file is the -single citation in ``docs/source/reference/release_gate_geotiff.rst`` -and is self-contained against the four sibling PRs landing in parallel. -""" -from __future__ import annotations - -import importlib.util -import uuid -from pathlib import Path - -import numpy as np -import pytest -import xarray as xr - -from xrspatial.geotiff import open_geotiff, read_geotiff_dask, to_geotiff - - -_HAS_TIFFFILE = importlib.util.find_spec("tifffile") is not None -_skip_no_tifffile = pytest.mark.skipif( - not _HAS_TIFFFILE, reason="tifffile required for MinIsWhite fixture") - - -# --------------------------------------------------------------------------- -# Window geometry under test -# --------------------------------------------------------------------------- -# Strictly interior to the 256x256 fixture so the test fails on any -# off-by-one in row/col offsets and so a wrong window cannot silently -# coincide with the full-extent read. Width != height so a swapped axis -# also fails the shape assertion. Two cases: -# -# * ``aligned``: offsets are a multiple of the dask ``chunks=32`` size, -# so the window starts on a chunk boundary. -# * ``chunk-misaligned``: offsets are NOT a multiple of 32 and the -# window's height / width are also not chunk-aligned, so the dask -# reader has to split chunks. A reader that off-by-ones the -# chunk math at the window origin fails this case but not the -# aligned one. -_FULL_H = 256 -_FULL_W = 256 - - -# ``window=`` kwarg ordering (per the ``open_geotiff`` / -# ``read_geotiff_dask`` docstrings): ``(row_start, col_start, -# row_stop, col_stop)``. 
-_WINDOWS = ( - pytest.param((32, 64, 96, 192), - id="aligned-row32-col64-h64-w128"), - pytest.param((33, 65, 95, 191), - id="chunk-misaligned-row33-col65-h62-w126"), -) - - -# Pixel geometry pinned on every fixture. Non-square pixels and a -# fractional origin catch any window path that accidentally uses -# integer arithmetic or drops the fractional part of the origin when -# shifting. ``_ORIGIN_X = 500123.5`` is intentionally fractional so a -# windowed reader that internally rounds (or that re-derives the origin -# from int pixel indices) fails the exact-tuple equality on transform. -_PIXEL_WIDTH = 30.0 -_PIXEL_HEIGHT = -25.0 -_ORIGIN_X = 500123.5 -_ORIGIN_Y = 4001987.25 - - -# Canonical non-transform release attrs from the epic #2341 list. -# ``transform`` is asserted separately by the algebraic spec; the -# remaining canonical keys must equal the unwindowed read's values -# exactly because a window is not allowed to silently rewrite them. -# -# ``masked_nodata`` is on the canonical list but is data-dependent by -# design (see ``_attrs.py``: "True iff the in-memory array is float -# dtype and the reader's sentinel-to-NaN step ran"). A window that -# happens to contain no sentinel pixels legitimately flips the flag -# from True to False on an integer source. We assert structure (the -# key is present iff ``nodata`` is present and the value is a bool) -# rather than exact equality so a real attr-drop is still caught and -# a correct content-dependent flip does not flag as silent wrongness. -_NON_TRANSFORM_ATTRS_VALUE_EQUAL = ( - "crs", - "crs_wkt", - "nodata", - "georef_status", - "raster_type", -) -_NON_TRANSFORM_ATTRS_STRUCTURAL = ("masked_nodata",) - - -# --------------------------------------------------------------------------- -# Fixture builders -# --------------------------------------------------------------------------- - -def _build_da(arr: np.ndarray, nodata=None) -> xr.DataArray: - """Wrap ``arr`` as a DataArray with the pinned pixel geometry. 
- - ``to_geotiff`` infers the on-disk transform from the y/x coords. The - coords are pixel centers; ``origin = pixel_corner``, so the centers - are offset by half a pixel from the origin. The on-disk transform - that comes back through ``attrs['transform']`` is then exactly - ``(_PIXEL_WIDTH, 0, _ORIGIN_X, 0, _PIXEL_HEIGHT, _ORIGIN_Y)``. - """ - h, w = arr.shape - x_centers = _ORIGIN_X + _PIXEL_WIDTH * 0.5 + _PIXEL_WIDTH * np.arange(w) - y_centers = _ORIGIN_Y + _PIXEL_HEIGHT * 0.5 + _PIXEL_HEIGHT * np.arange(h) - attrs = {"crs": 32610} - if nodata is not None: - attrs["nodata"] = nodata - return xr.DataArray( - arr, - dims=("y", "x"), - coords={"y": y_centers.astype(np.float64), - "x": x_centers.astype(np.float64)}, - attrs=attrs, - ) - - -def _write_int16_with_nodata(path: Path) -> None: - rng = np.random.default_rng(2341) - arr = rng.integers( - -1000, 1000, size=(_FULL_H, _FULL_W), dtype=np.int16 - ) - # Sprinkle the sentinel so the read path actually carries it. - arr[10, 10] = -9999 - arr[200, 5] = -9999 - da = _build_da(arr, nodata=-9999) - to_geotiff(da, str(path), compression="deflate", tiled=False) - - -def _write_float32_with_nan_nodata(path: Path) -> None: - rng = np.random.default_rng(2342) - arr = (rng.standard_normal((_FULL_H, _FULL_W)) * 100).astype(np.float32) - # Drop a NaN inside the window region so masked_nodata is exercised. - arr[40, 80] = np.nan - da = _build_da(arr, nodata=np.float32("nan")) - to_geotiff(da, str(path), compression="deflate", tiled=True, - tile_size=64) - - -def _write_float32_no_nodata(path: Path) -> None: - rng = np.random.default_rng(2343) - arr = (rng.standard_normal((_FULL_H, _FULL_W)) * 50).astype(np.float32) - da = _build_da(arr, nodata=None) - to_geotiff(da, str(path), compression="none", tiled=False) - - -def _write_uint8_miniswhite(path: Path) -> None: - """Write a MinIsWhite (photometric=0) uint8 stripped TIFF via tifffile. 
- - Matches the miniswhite cell in ``test_backend_pixel_parity_matrix_1813.py`` - so this release-gate row covers the same representative layout. - """ - import tifffile # local import: gated by ``_skip_no_tifffile`` - rng = np.random.default_rng(2344) - arr = rng.integers(0, 256, size=(_FULL_H, _FULL_W), dtype=np.uint8) - tifffile.imwrite( - str(path), arr, photometric="miniswhite", - compression="none", metadata=None, - ) - - -_CORPUS = ( - pytest.param(_write_int16_with_nodata, id="int16-deflate-stripped-nodata"), - pytest.param(_write_float32_with_nan_nodata, - id="float32-deflate-tiled-nan-nodata"), - pytest.param(_write_float32_no_nodata, - id="float32-none-stripped-no-nodata"), - pytest.param(_write_uint8_miniswhite, - id="uint8-miniswhite-stripped", - marks=_skip_no_tifffile), -) - - -@pytest.fixture -def corpus_file(tmp_path, request): - """Write a single fixture file and return its on-disk path. - - Each invocation uses a unique filename (issue tag + uuid + builder id) - so sibling rockout worktrees and parallel test runs cannot collide on - the same tmp file. The builder is parametrized at test-call time. - """ - builder = request.param - tag = uuid.uuid4().hex[:8] - path = tmp_path / f"release_gate_2341_{builder.__name__[1:]}_{tag}.tif" - builder(path) - return path - - -# --------------------------------------------------------------------------- -# Algebraic spec for the windowed transform -# --------------------------------------------------------------------------- - -def _expected_window_transform(t_full, col_off, row_off): - """Return ``T_full * Affine.translation(col_off, row_off)`` by hand. - - The spec from epic #2341 is the algebraic identity, not whatever - rasterio's window math happens to compute. 
For an axis-aligned - transform ``(a, b, c, d, e, f)`` this simplifies to - - T_window = (a, b, c + a*col_off + b*row_off, - d, e, f + d*col_off + e*row_off) - - The b / d terms are kept in the algebra so a future rotated-grid - fixture that exercises this same release gate inherits the right - answer without the test needing to be rewritten. - """ - a, b, c, d, e, f = (float(x) for x in t_full) - return ( - a, - b, - c + a * col_off + b * row_off, - d, - e, - f + d * col_off + e * row_off, - ) - - -# --------------------------------------------------------------------------- -# Backend matrix -# --------------------------------------------------------------------------- - -def _open_eager(path, *, window=None): - return open_geotiff(str(path), window=window) - - -def _open_dask(path, *, window=None): - # ``chunks`` is required for ``read_geotiff_dask``; pick a value - # that produces more than one chunk over the window so the chunk - # math along the window origin is exercised. - return read_geotiff_dask(str(path), window=window, chunks=32) - - -_READERS = ( - pytest.param(_open_eager, id="eager"), - pytest.param(_open_dask, id="dask"), -) - - -# --------------------------------------------------------------------------- -# The release-gate assertions -# --------------------------------------------------------------------------- - -def _assert_shape(out, *, expected_h, expected_w): - assert out.shape == (expected_h, expected_w), ( - f"release gate: windowed read shape {out.shape} does not equal " - f"the requested window shape {(expected_h, expected_w)}; a window " - f"that returns the wrong shape is silently wrong, not noisily wrong" - ) - - -def _assert_coords_slice(windowed, full, - *, row_off, col_off, height, width): - """The window's coords must be the bit-exact slice of the full coords. 
- - Computing the windowed coords from the shifted transform and - comparing to the slice of the full coords pins both routes: any - drift between ``transform`` shift and coord arithmetic shows up - here, not at the next downstream call site. - """ - full_y = np.asarray(full.coords["y"].values) - full_x = np.asarray(full.coords["x"].values) - win_y = np.asarray(windowed.coords["y"].values) - win_x = np.asarray(windowed.coords["x"].values) - np.testing.assert_array_equal( - win_y, - full_y[row_off:row_off + height], - err_msg=( - "release gate: windowed read y-coords are not a slice of the " - "unwindowed read's y-coords; downstream callers that join on " - "y will silently mismatch" - ), - ) - np.testing.assert_array_equal( - win_x, - full_x[col_off:col_off + width], - err_msg=( - "release gate: windowed read x-coords are not a slice of the " - "unwindowed read's x-coords" - ), - ) - - -def _assert_transform_shifted(windowed, full, *, col_off, row_off): - """``attrs['transform']`` must equal ``T_full * translation(col, row)``. - - Asserted via exact tuple equality (not ``approx``) because the spec - says no float drift: the contract is the algebraic identity, and - any tolerance here would let a buggy windowed reader pass by - re-deriving the origin from the y/x coord arrays (where floating - rounding can creep in). - - When the source has no georef tags, ``transform`` is absent from - both reads (see issue #1710 / ``_coords.py`` -- the reader drops the - synthesised unit transform on non-georef sources). In that case the - contract is that *neither* read has a transform; introducing one on - the windowed side would be the silent-wrongness this gate exists - to catch. 
- """ - if "transform" not in full.attrs: - assert "transform" not in windowed.attrs, ( - f"release gate: source has no georef and the unwindowed read " - f"emits no ``transform``, but the windowed read invented one: " - f"{windowed.attrs.get('transform')!r}" - ) - return - t_full = tuple(full.attrs["transform"]) - assert "transform" in windowed.attrs, ( - f"release gate: unwindowed read carries ``transform`` " - f"({t_full!r}) but the windowed read dropped it" - ) - t_win = tuple(windowed.attrs["transform"]) - assert len(t_full) == 6, ( - f"release gate: full-read transform is not a 6-tuple: {t_full!r}" - ) - assert len(t_win) == 6, ( - f"release gate: windowed-read transform is not a 6-tuple: {t_win!r}" - ) - expected = _expected_window_transform(t_full, col_off, row_off) - assert t_win == expected, ( - f"release gate: windowed transform does not equal " - f"T_full * Affine.translation(col_off={col_off}, row_off={row_off})\n" - f" T_full = {t_full!r}\n" - f" T_window = {t_win!r}\n" - f" expected = {expected!r}" - ) - - -def _assert_canonical_attrs_unchanged(windowed, full): - """The non-transform canonical attrs must match the unwindowed read. - - A window slices pixels; it does not redefine CRS, nodata sentinel, - georef status, or raster type. Drift on any of these is the exact - silent-wrongness the epic calls out. ``masked_nodata`` is checked - structurally because the flag is by-design data-dependent. - """ - for key in _NON_TRANSFORM_ATTRS_VALUE_EQUAL: - if key not in full.attrs: - assert key not in windowed.attrs, ( - f"release gate: windowed read introduced attrs[{key!r}] " - f"that the unwindowed read does not have" - ) - continue - assert key in windowed.attrs, ( - f"release gate: windowed read dropped attrs[{key!r}] that the " - f"unwindowed read carries" - ) - full_val = full.attrs[key] - win_val = windowed.attrs[key] - # NaN-aware compare. 
Try ``np.isnan`` on a scalar before - # falling back to ``==``: a Python float NaN and a - # ``np.float32`` NaN both report True under ``np.isnan`` but - # only the Python float passes ``isinstance(x, float)``, and - # we don't want a future backend that returns a numpy-scalar - # nodata to silently slip into the ``==`` branch where - # NaN != NaN flips the test from "checked equal" to "always - # fails". - try: - full_is_nan = bool(np.isnan(full_val)) - except (TypeError, ValueError): - full_is_nan = False - if full_is_nan: - try: - win_is_nan = bool(np.isnan(win_val)) - except (TypeError, ValueError): - win_is_nan = False - assert win_is_nan, ( - f"release gate: NaN-valued attrs[{key!r}] did not survive " - f"the windowed read: full={full_val!r} window={win_val!r}" - ) - else: - assert win_val == full_val, ( - f"release gate: windowed read changed attrs[{key!r}]: " - f"full={full_val!r} window={win_val!r}" - ) - for key in _NON_TRANSFORM_ATTRS_STRUCTURAL: - full_present = key in full.attrs - win_present = key in windowed.attrs - assert full_present == win_present, ( - f"release gate: windowed read changed presence of " - f"attrs[{key!r}]: full_has={full_present} " - f"window_has={win_present}" - ) - if full_present: - assert isinstance(windowed.attrs[key], bool), ( - f"release gate: attrs[{key!r}] is not a bool on the " - f"windowed read: {windowed.attrs[key]!r}" - ) - - -# --------------------------------------------------------------------------- -# Tests -# --------------------------------------------------------------------------- - -@pytest.mark.release_gate -@pytest.mark.parametrize("window", _WINDOWS) -@pytest.mark.parametrize("corpus_file", _CORPUS, indirect=True) -@pytest.mark.parametrize("reader", _READERS) -def test_release_gate_windowed_read_shape(corpus_file, reader, window): - """The returned shape equals the window's ``(height, width)``.""" - row_off, col_off, row_stop, col_stop = window - out = reader(corpus_file, window=window) - 
_assert_shape(out, - expected_h=row_stop - row_off, - expected_w=col_stop - col_off) - - -@pytest.mark.release_gate -@pytest.mark.parametrize("window", _WINDOWS) -@pytest.mark.parametrize("corpus_file", _CORPUS, indirect=True) -@pytest.mark.parametrize("reader", _READERS) -def test_release_gate_windowed_read_coords_slice(corpus_file, reader, window): - """``coords['y'/'x']`` equals the matching slice of the full coords.""" - row_off, col_off, row_stop, col_stop = window - full = reader(corpus_file) - out = reader(corpus_file, window=window) - _assert_coords_slice( - out, full, - row_off=row_off, col_off=col_off, - height=row_stop - row_off, width=col_stop - col_off, - ) - - -@pytest.mark.release_gate -@pytest.mark.parametrize("window", _WINDOWS) -@pytest.mark.parametrize("corpus_file", _CORPUS, indirect=True) -@pytest.mark.parametrize("reader", _READERS) -def test_release_gate_windowed_read_transform_shifted( - corpus_file, reader, window, -): - """``attrs['transform']`` equals ``T_full * translation(col, row)``.""" - row_off, col_off, _row_stop, _col_stop = window - full = reader(corpus_file) - out = reader(corpus_file, window=window) - _assert_transform_shifted(out, full, col_off=col_off, row_off=row_off) - - -@pytest.mark.release_gate -@pytest.mark.parametrize("window", _WINDOWS) -@pytest.mark.parametrize("corpus_file", _CORPUS, indirect=True) -@pytest.mark.parametrize("reader", _READERS) -def test_release_gate_windowed_read_canonical_attrs_unchanged( - corpus_file, reader, window, -): - """The non-transform canonical attrs match the unwindowed read.""" - full = reader(corpus_file) - out = reader(corpus_file, window=window) - _assert_canonical_attrs_unchanged(out, full) From 790b804240ace6d7728f7c878837dffd2dfae698 Mon Sep 17 00:00:00 2001 From: Brendan Collins Date: Mon, 25 May 2026 20:30:51 -0700 Subject: [PATCH 2/3] Address review: scope optional-dep skips, rename fixture, drop stale xfail (#2403) * Replace module-level ``pytest.importorskip("dask")`` and 
``pytest.importorskip("rasterio")`` with per-test ``skipif`` decorators sourced from a single set of constants at the top of the file. The previous module-level gates would skip the entire 159-test registry on a minimal install; the pure-numpy local-read, local-write, codec, and attrs-contract gates now run regardless of whether dask or rasterio is present. The rasterio overview helpers import rasterio (and ``Resampling``) lazily inside the helpers so the bare ``from`` import no longer races the skip. * The windowed-shifted-transform parity tests parametrize over ``(eager, dask)``; the dask reader param carries ``marks=_requires_dask`` so the eager cell still runs without dask. * Rename ``_wsp_corpus_file`` -> ``wsp_corpus_file`` to match the no-leading-underscore convention used by every other fixture in the suite. * Drop the stale xfail on ``test_release_gate_http_ssrf_rejects_loopback_uppercase_scheme``. PR #2326 (sub-PR 5 of #2321) landed the case-insensitive scheme check, so uppercase HTTP now raises ``UnsafeURLError`` for real. Verified: ``pytest -m release_gate`` returns 156 passed + 3 xfailed (the xpass became a clean pass). ``-m release_gate`` from the wider tests root still selects exactly the 159 tests in the consolidated file. --- .../release_gates/test_stable_features.py | 107 ++++++++++++------ 1 file changed, 70 insertions(+), 37 deletions(-) diff --git a/xrspatial/geotiff/tests/release_gates/test_stable_features.py b/xrspatial/geotiff/tests/release_gates/test_stable_features.py index 0cdd8258..67688ddc 100644 --- a/xrspatial/geotiff/tests/release_gates/test_stable_features.py +++ b/xrspatial/geotiff/tests/release_gates/test_stable_features.py @@ -55,6 +55,29 @@ import pytest import xarray as xr +# Optional-dependency gates. Each section below decorates only the +# tests that actually touch the optional backend, instead of +# module-level ``importorskip`` which would skip the entire 159-test +# file (including pure-numpy gates) on a minimal install. 
+_HAS_DASK = importlib.util.find_spec("dask") is not None +_HAS_DASK_ARRAY = importlib.util.find_spec("dask.array") is not None +_HAS_RASTERIO = importlib.util.find_spec("rasterio") is not None +_HAS_TIFFFILE = importlib.util.find_spec("tifffile") is not None + +_requires_dask = pytest.mark.skipif( + not _HAS_DASK, reason="dask is required for this release gate", +) +_requires_rasterio = pytest.mark.skipif( + not _HAS_RASTERIO, + reason="rasterio is required for the overview / sidecar gate", +) +_requires_rasterio_and_dask = pytest.mark.skipif( + not (_HAS_RASTERIO and _HAS_DASK_ARRAY), + reason=( + "rasterio and dask.array are required for the overview / sidecar gate" + ), +) + from xrspatial.geotiff import ( SUPPORTED_FEATURES, UnsafeURLError, @@ -739,8 +762,8 @@ def test_release_gate_windowed_read_full_extent_matches_unwindowed( # Dask reads of a local GeoTIFF must return the same pixels and canonical # attrs as the eager (numpy) read. The small one-shot gate below is the # release-engineer-facing test; the wider parity matrix lives elsewhere. - -pytest.importorskip("dask") +# Per-test ``_requires_dask`` skip so a minimal install still runs the +# pure-numpy gates above. 
def _dask_parity_write_known_good(path: str) -> np.ndarray: @@ -765,6 +788,7 @@ def _dask_parity_write_known_good(path: str) -> np.ndarray: @pytest.mark.release_gate +@_requires_dask def test_release_gate_dask_read_matches_eager_pixels(tmp_path) -> None: """The dask backend returns the same pixels as the eager backend.""" path = str(tmp_path / "release_gate_dask_parity_pixels.tif") @@ -795,6 +819,7 @@ def test_release_gate_dask_read_matches_eager_pixels(tmp_path) -> None: @pytest.mark.release_gate +@_requires_dask def test_release_gate_dask_read_matches_eager_attrs(tmp_path) -> None: """The dask backend produces the same canonical attrs as eager.""" path = str(tmp_path / "release_gate_dask_parity_attrs.tif") @@ -831,6 +856,7 @@ def test_release_gate_dask_read_matches_eager_attrs(tmp_path) -> None: @pytest.mark.release_gate +@_requires_dask def test_release_gate_dask_read_is_lazy(tmp_path) -> None: """A ``chunks=`` read produces a dask-backed DataArray. @@ -1000,6 +1026,7 @@ def _eager_dask_assert_release_attrs_equal( @pytest.mark.release_gate +@_requires_dask @pytest.mark.parametrize("fixture_id, open_kwargs", _EAGER_DASK_CORPUS) def test_release_gate_eager_dask_full_parity( fixture_id: str, open_kwargs: dict, @@ -1397,10 +1424,10 @@ def test_release_gate_codec_round_trip_stable_set_matches_supported_features() - # metadata set, and ``transform`` scales pixel size by the level factor # while keeping the origin fixed. -rasterio = pytest.importorskip("rasterio") -pytest.importorskip("dask.array") - -from rasterio.enums import Resampling # noqa: E402 +# rasterio and dask.array are required only by this section; the +# imports live inside the helpers so a minimal install still runs the +# pure-numpy gates above. ``_requires_rasterio`` / ``_requires_rasterio_and_dask`` +# skip the affected tests cleanly. 
_OVERVIEW_BASE_SIZE = 64 _OVERVIEW_FACTORS = (2, 4) @@ -1440,6 +1467,8 @@ def _overview_unique_tmp_path(tmp_path, label: str) -> str: def _overview_write_internal_cog(path: str) -> None: """Write a COG with base + internal overviews at factors 2 and 4.""" + import rasterio # gated by ``_requires_rasterio`` on every caller + da = _overview_make_raster() to_geotiff( da, path, @@ -1460,6 +1489,9 @@ def _overview_write_internal_cog(path: str) -> None: def _overview_write_external_sidecar(path: str) -> None: """Write a tiled TIFF + ``.ovr`` sidecar at factors 2 and 4.""" + import rasterio # gated by ``_requires_rasterio`` on every caller + from rasterio.enums import Resampling + da = _overview_make_raster() to_geotiff( da, path, @@ -1554,6 +1586,7 @@ def _overview_factors_by_level() -> dict: @pytest.mark.release_gate +@_requires_rasterio_and_dask @pytest.mark.parametrize("reader", ["eager", "dask"]) def test_release_gate_cog_internal_overview_metadata_survives( tmp_path, reader, @@ -1579,6 +1612,7 @@ def test_release_gate_cog_internal_overview_metadata_survives( @pytest.mark.release_gate +@_requires_rasterio_and_dask @pytest.mark.parametrize("reader", ["eager", "dask"]) def test_release_gate_cog_internal_overview_transform_scales( tmp_path, reader, @@ -1599,6 +1633,7 @@ def test_release_gate_cog_internal_overview_transform_scales( @pytest.mark.release_gate +@_requires_rasterio def test_release_gate_cog_internal_overview_shape_matches_factors( tmp_path, ) -> None: @@ -1618,6 +1653,7 @@ def test_release_gate_cog_internal_overview_shape_matches_factors( @pytest.mark.release_gate +@_requires_rasterio_and_dask @pytest.mark.parametrize("reader", ["eager", "dask"]) def test_release_gate_sidecar_overview_metadata_survives( tmp_path, reader, @@ -1645,6 +1681,7 @@ def test_release_gate_sidecar_overview_metadata_survives( @pytest.mark.release_gate +@_requires_rasterio_and_dask @pytest.mark.parametrize("reader", ["eager", "dask"]) def 
test_release_gate_sidecar_overview_transform_scales( tmp_path, reader, @@ -1665,6 +1702,7 @@ def test_release_gate_sidecar_overview_transform_scales( @pytest.mark.release_gate +@_requires_rasterio def test_release_gate_sidecar_overview_shape_matches_factors( tmp_path, ) -> None: @@ -1684,6 +1722,7 @@ def test_release_gate_sidecar_overview_shape_matches_factors( @pytest.mark.release_gate +@_requires_rasterio_and_dask @pytest.mark.parametrize("reader", ["eager", "dask"]) def test_release_gate_internal_vs_sidecar_metadata_agree( tmp_path, reader, @@ -1738,9 +1777,8 @@ def test_release_gate_internal_vs_sidecar_metadata_agree( # row_off)`` exactly, and the canonical non-transform release attrs # unchanged. -_WSP_HAS_TIFFFILE = importlib.util.find_spec("tifffile") is not None _wsp_skip_no_tifffile = pytest.mark.skipif( - not _WSP_HAS_TIFFFILE, + not _HAS_TIFFFILE, reason="tifffile required for MinIsWhite fixture", ) @@ -1856,7 +1894,7 @@ def _wsp_write_uint8_miniswhite(path: Path) -> None: @pytest.fixture -def _wsp_corpus_file(tmp_path, request): +def wsp_corpus_file(tmp_path, request): """Write a single fixture file and return its on-disk path.""" builder = request.param tag = uuid.uuid4().hex[:8] @@ -1887,7 +1925,7 @@ def _wsp_open_dask(path, *, window=None): _WSP_READERS = ( pytest.param(_wsp_open_eager, id="eager"), - pytest.param(_wsp_open_dask, id="dask"), + pytest.param(_wsp_open_dask, id="dask", marks=_requires_dask), ) @@ -2005,12 +2043,12 @@ def _wsp_assert_canonical_attrs_unchanged(windowed, full): @pytest.mark.release_gate @pytest.mark.parametrize("window", _WSP_WINDOWS) -@pytest.mark.parametrize("_wsp_corpus_file", _WSP_CORPUS, indirect=True) +@pytest.mark.parametrize("wsp_corpus_file", _WSP_CORPUS, indirect=True) @pytest.mark.parametrize("reader", _WSP_READERS) -def test_release_gate_windowed_read_shape(_wsp_corpus_file, reader, window): +def test_release_gate_windowed_read_shape(wsp_corpus_file, reader, window): """The returned shape equals the window's 
``(height, width)``.""" row_off, col_off, row_stop, col_stop = window - out = reader(_wsp_corpus_file, window=window) + out = reader(wsp_corpus_file, window=window) _wsp_assert_shape( out, expected_h=row_stop - row_off, @@ -2020,15 +2058,15 @@ def test_release_gate_windowed_read_shape(_wsp_corpus_file, reader, window): @pytest.mark.release_gate @pytest.mark.parametrize("window", _WSP_WINDOWS) -@pytest.mark.parametrize("_wsp_corpus_file", _WSP_CORPUS, indirect=True) +@pytest.mark.parametrize("wsp_corpus_file", _WSP_CORPUS, indirect=True) @pytest.mark.parametrize("reader", _WSP_READERS) def test_release_gate_windowed_read_coords_slice( - _wsp_corpus_file, reader, window, + wsp_corpus_file, reader, window, ): """``coords['y'/'x']`` equals the matching slice of the full coords.""" row_off, col_off, row_stop, col_stop = window - full = reader(_wsp_corpus_file) - out = reader(_wsp_corpus_file, window=window) + full = reader(wsp_corpus_file) + out = reader(wsp_corpus_file, window=window) _wsp_assert_coords_slice( out, full, row_off=row_off, col_off=col_off, @@ -2038,15 +2076,15 @@ def test_release_gate_windowed_read_coords_slice( @pytest.mark.release_gate @pytest.mark.parametrize("window", _WSP_WINDOWS) -@pytest.mark.parametrize("_wsp_corpus_file", _WSP_CORPUS, indirect=True) +@pytest.mark.parametrize("wsp_corpus_file", _WSP_CORPUS, indirect=True) @pytest.mark.parametrize("reader", _WSP_READERS) def test_release_gate_windowed_read_transform_shifted( - _wsp_corpus_file, reader, window, + wsp_corpus_file, reader, window, ): """``attrs['transform']`` equals ``T_full * translation(col, row)``.""" row_off, col_off, _row_stop, _col_stop = window - full = reader(_wsp_corpus_file) - out = reader(_wsp_corpus_file, window=window) + full = reader(wsp_corpus_file) + out = reader(wsp_corpus_file, window=window) _wsp_assert_transform_shifted( out, full, col_off=col_off, row_off=row_off, ) @@ -2054,14 +2092,14 @@ def test_release_gate_windowed_read_transform_shifted( 
@pytest.mark.release_gate @pytest.mark.parametrize("window", _WSP_WINDOWS) -@pytest.mark.parametrize("_wsp_corpus_file", _WSP_CORPUS, indirect=True) +@pytest.mark.parametrize("wsp_corpus_file", _WSP_CORPUS, indirect=True) @pytest.mark.parametrize("reader", _WSP_READERS) def test_release_gate_windowed_read_canonical_attrs_unchanged( - _wsp_corpus_file, reader, window, + wsp_corpus_file, reader, window, ): """The non-transform canonical attrs match the unwindowed read.""" - full = reader(_wsp_corpus_file) - out = reader(_wsp_corpus_file, window=window) + full = reader(wsp_corpus_file) + out = reader(wsp_corpus_file, window=window) _wsp_assert_canonical_attrs_unchanged(out, full) @@ -2349,6 +2387,7 @@ def test_release_gate_negative_rotated_eager( @pytest.mark.release_gate +@_requires_dask def test_release_gate_negative_rotated_dask( _neg_rotated_geotiff_path, ) -> None: @@ -2541,19 +2580,13 @@ def test_release_gate_http_ssrf_rejects_loopback() -> None: @pytest.mark.release_gate -@pytest.mark.xfail( - reason=( - "Locks in once sub-PR 5 of #2321 (PR #2326) lands. Until then, " - "uppercase HTTP slips past the SSRF check and falls through to " - "fsspec, which raises a generic ValueError. Once #2326 is merged, " - "remove this xfail marker so the release gate enforces the " - "promise." - ), - strict=False, - raises=(ValueError, UnsafeURLError), -) def test_release_gate_http_ssrf_rejects_loopback_uppercase_scheme() -> None: - """Uppercase HTTP scheme must take the same SSRF path.""" + """Uppercase HTTP scheme must take the same SSRF path. + + PR #2326 (sub-PR 5 of #2321) made the SSRF check case-insensitive, + so the xfail this test originally carried is gone: uppercase HTTP + now raises :class:`UnsafeURLError` like its lowercase sibling. 
+ """ with pytest.raises(UnsafeURLError): open_geotiff("HTTP://127.0.0.1/does-not-matter.tif") From 3cf7558c76bc7984efe78b3ef6fdadebf5c3e8bd Mon Sep 17 00:00:00 2001 From: Brendan Collins Date: Mon, 25 May 2026 20:32:05 -0700 Subject: [PATCH 3/3] Remove CLUSTER_AUDIT_PR10.md before merge per epic #2390 protocol --- xrspatial/geotiff/tests/CLUSTER_AUDIT_PR10.md | 98 ------------------- 1 file changed, 98 deletions(-) delete mode 100644 xrspatial/geotiff/tests/CLUSTER_AUDIT_PR10.md diff --git a/xrspatial/geotiff/tests/CLUSTER_AUDIT_PR10.md b/xrspatial/geotiff/tests/CLUSTER_AUDIT_PR10.md deleted file mode 100644 index d5f643b8..00000000 --- a/xrspatial/geotiff/tests/CLUSTER_AUDIT_PR10.md +++ /dev/null @@ -1,98 +0,0 @@ -# CLUSTER_AUDIT_PR10 — Release-gate registry - -This audit table maps every test currently living in a -`test_release_gate_*.py` file under `xrspatial/geotiff/tests/` to its new -home inside the single consolidated registry, -`release_gates/test_stable_features.py`. Deleted before merge per the -epic protocol (see `xarray-contrib/xarray-spatial#2390`). - -## Inputs - -13 source files, 159 tests collected, 134 of those previously carried -`@pytest.mark.release_gate`. The remaining 25 lived in -`test_release_gate_*.py` files but did not carry the marker; the epic -specifies all such tests fold in and pick up the marker. 
- -## Mapping - -| Old file:test | New `release_gates/test_stable_features.py::test_id` | Notes | -|---|---|---| -| `test_release_gate_local_read.py::test_release_gate_local_read_pixels` | `test_release_gate_local_read_pixels` | unchanged | -| `test_release_gate_local_read.py::test_release_gate_local_read_crs` | `test_release_gate_local_read_crs` | unchanged | -| `test_release_gate_local_read.py::test_release_gate_local_read_transform` | `test_release_gate_local_read_transform` | unchanged | -| `test_release_gate_local_read.py::test_release_gate_local_read_nodata` | `test_release_gate_local_read_nodata` | unchanged | -| `test_release_gate_local_write.py::test_release_gate_local_write_round_trips_pixels` | `test_release_gate_local_write_round_trips_pixels` | unchanged | -| `test_release_gate_local_write.py::test_release_gate_local_write_preserves_crs` | `test_release_gate_local_write_preserves_crs` | unchanged | -| `test_release_gate_local_write.py::test_release_gate_local_write_preserves_transform` | `test_release_gate_local_write_preserves_transform` | unchanged | -| `test_release_gate_local_write.py::test_release_gate_local_write_preserves_nodata` | `test_release_gate_local_write_preserves_nodata` | unchanged | -| `test_release_gate_codecs.py::test_release_gate_codec_round_trip_uint16[codec]` (5) | `test_release_gate_codec_round_trip_uint16[codec]` | parametrized over `STABLE_LOSSLESS_CODECS` | -| `test_release_gate_codecs.py::test_release_gate_codec_round_trip_float32[codec]` (5) | `test_release_gate_codec_round_trip_float32[codec]` | parametrized over `STABLE_LOSSLESS_CODECS` | -| `test_release_gate_codecs.py::test_release_gate_codec_stable_set_matches_supported_features` | `test_release_gate_codec_stable_set_matches_supported_features` | unchanged | -| `test_release_gate_cog.py::test_release_gate_cog_round_trips_pixels[codec]` (5) | `test_release_gate_cog_round_trips_pixels[codec]` | shared `STABLE_LOSSLESS_CODECS` constant, no cross-file import | -| 
`test_release_gate_cog.py::test_release_gate_cog_preserves_crs_transform[codec]` (5) | `test_release_gate_cog_preserves_crs_transform[codec]` | unchanged | -| `test_release_gate_cog.py::test_release_gate_cog_preserves_nodata[codec]` (5) | `test_release_gate_cog_preserves_nodata[codec]` | unchanged | -| `test_release_gate_windowed_read.py::test_release_gate_windowed_read_returns_subset` | `test_release_gate_windowed_read_returns_subset` | unchanged | -| `test_release_gate_windowed_read.py::test_release_gate_windowed_read_preserves_crs` | `test_release_gate_windowed_read_preserves_crs` | unchanged | -| `test_release_gate_windowed_read.py::test_release_gate_windowed_read_shifts_transform_origin` | `test_release_gate_windowed_read_shifts_transform_origin` | unchanged | -| `test_release_gate_windowed_read.py::test_release_gate_windowed_read_full_extent_matches_unwindowed` | `test_release_gate_windowed_read_full_extent_matches_unwindowed` | unchanged | -| `test_release_gate_dask_parity.py::test_release_gate_dask_read_matches_eager_pixels` | `test_release_gate_dask_read_matches_eager_pixels` | unchanged | -| `test_release_gate_dask_parity.py::test_release_gate_dask_read_matches_eager_attrs` | `test_release_gate_dask_read_matches_eager_attrs` | unchanged | -| `test_release_gate_dask_parity.py::test_release_gate_dask_read_is_lazy` | `test_release_gate_dask_read_is_lazy` | unchanged | -| `test_release_gate_eager_dask_parity_2341.py::test_release_gate_eager_dask_full_parity[fixture]` (4) | `test_release_gate_eager_dask_full_parity[fixture]` | corpus list preserved | -| `test_release_gate_eager_dask_parity_2341.py::test_release_gate_corpus_is_non_empty` | `test_release_gate_corpus_is_non_empty` | now carries `@pytest.mark.release_gate` (previously unmarked despite living in a `test_release_gate_*.py` file) | -| `test_release_gate_attrs_contract.py::test_release_gate_attrs_canonical_keys_present` | `test_release_gate_attrs_canonical_keys_present` | unchanged | -| 
`test_release_gate_attrs_contract.py::test_release_gate_attrs_georef_status_full` | `test_release_gate_attrs_georef_status_full` | unchanged | -| `test_release_gate_attrs_contract.py::test_release_gate_attrs_contract_version_is_int` | `test_release_gate_attrs_contract_version_is_int` | unchanged | -| `test_release_gate_attrs_contract.py::test_release_gate_attrs_round_trip_preserves_crs_transform_nodata` | `test_release_gate_attrs_round_trip_preserves_crs_transform_nodata` | unchanged | -| `test_release_gate_codec_round_trip_2341.py::test_release_gate_codec_round_trip[codec-dtype]` (20) | `test_release_gate_codec_round_trip[codec-dtype]` | unchanged | -| `test_release_gate_codec_round_trip_2341.py::test_release_gate_codec_round_trip_stable_set_matches_supported_features` | `test_release_gate_codec_round_trip_stable_set_matches_supported_features` | unchanged | -| `test_release_gate_overview_sidecar_metadata_2341.py::test_cog_internal_overview_metadata_survives[reader]` (2) | `test_release_gate_cog_internal_overview_metadata_survives[reader]` | renamed for the `release_gate_` test-name prefix; now carries `@pytest.mark.release_gate` | -| `test_release_gate_overview_sidecar_metadata_2341.py::test_cog_internal_overview_transform_scales[reader]` (2) | `test_release_gate_cog_internal_overview_transform_scales[reader]` | renamed; marker added | -| `test_release_gate_overview_sidecar_metadata_2341.py::test_cog_internal_overview_shape_matches_factors` | `test_release_gate_cog_internal_overview_shape_matches_factors` | renamed; marker added | -| `test_release_gate_overview_sidecar_metadata_2341.py::test_sidecar_overview_metadata_survives[reader]` (2) | `test_release_gate_sidecar_overview_metadata_survives[reader]` | renamed; marker added | -| `test_release_gate_overview_sidecar_metadata_2341.py::test_sidecar_overview_transform_scales[reader]` (2) | `test_release_gate_sidecar_overview_transform_scales[reader]` | renamed; marker added | -| 
`test_release_gate_overview_sidecar_metadata_2341.py::test_sidecar_overview_shape_matches_factors` | `test_release_gate_sidecar_overview_shape_matches_factors` | renamed; marker added | -| `test_release_gate_overview_sidecar_metadata_2341.py::test_internal_vs_sidecar_metadata_agree[reader]` (2) | `test_release_gate_internal_vs_sidecar_metadata_agree[reader]` | renamed; marker added | -| `test_release_gate_windowed_reads_2341.py::test_release_gate_windowed_read_shape[r-corpus-window]` (16) | `test_release_gate_windowed_read_shape[r-corpus-window]` | corpus fixture renamed to `_wsp_corpus_file`; same IDs | -| `test_release_gate_windowed_reads_2341.py::test_release_gate_windowed_read_coords_slice[r-corpus-window]` (16) | `test_release_gate_windowed_read_coords_slice[r-corpus-window]` | same | -| `test_release_gate_windowed_reads_2341.py::test_release_gate_windowed_read_transform_shifted[r-corpus-window]` (16) | `test_release_gate_windowed_read_transform_shifted[r-corpus-window]` | same | -| `test_release_gate_windowed_reads_2341.py::test_release_gate_windowed_read_canonical_attrs_unchanged[r-corpus-window]` (16) | `test_release_gate_windowed_read_canonical_attrs_unchanged[r-corpus-window]` | same | -| `test_release_gate_negative_2341.py::test_release_gate_negative_conflicting_aux_xml_crs` | `test_release_gate_negative_conflicting_aux_xml_crs` | unchanged; remains `xfail strict=False` | -| `test_release_gate_negative_2341.py::test_release_gate_negative_integer_nodata_float_promoted` | `test_release_gate_negative_integer_nodata_float_promoted` | unchanged | -| `test_release_gate_negative_2341.py::test_release_gate_negative_rotated_eager` | `test_release_gate_negative_rotated_eager` | now carries `@pytest.mark.release_gate` (previously unmarked) | -| `test_release_gate_negative_2341.py::test_release_gate_negative_rotated_dask` | `test_release_gate_negative_rotated_dask` | marker added | -| `test_release_gate_negative_2341.py::test_release_gate_negative_rotated_windowed` 
| `test_release_gate_negative_rotated_windowed` | marker added | -| `test_release_gate_negative_2341.py::test_release_gate_negative_rotated_gpu` | `test_release_gate_negative_rotated_gpu` | marker added; `requires_gpu` imported from `_helpers.markers` instead of the slim conftest re-export | -| `test_release_gate_negative_2341.py::test_release_gate_negative_mixed_tier_vrt_children` | `test_release_gate_negative_mixed_tier_vrt_children` | unchanged | -| `test_release_gate_2321.py::test_release_gate_cites_only_existing_test_files` | `test_release_gate_cites_only_existing_test_files` | now carries `@pytest.mark.release_gate`; self-reference path updated to `release_gates/test_stable_features.py` | -| `test_release_gate_2321.py::test_release_gate_lists_every_promised_supported_feature` | `test_release_gate_lists_every_promised_supported_feature` | marker added | -| `test_release_gate_2321.py::test_release_gate_http_ssrf_rejects_loopback` | `test_release_gate_http_ssrf_rejects_loopback` | marker added | -| `test_release_gate_2321.py::test_release_gate_http_ssrf_rejects_loopback_uppercase_scheme` | `test_release_gate_http_ssrf_rejects_loopback_uppercase_scheme` | marker added; xfail kept | -| `test_release_gate_2321.py::test_release_gate_vrt_rows_point_at_real_test_functions` | `test_release_gate_vrt_rows_point_at_real_test_functions` | marker added | - -## Helper-function collisions - -Two source files defined `_write_known_good` and two defined -`_make_data_array`. Helpers carry section prefixes in the consolidated -file (`_local_read_write_known_good`, `_local_write_make_data_array`, -`_dask_parity_write_known_good`, `_attrs_write_known_good`, -`_cog_make_data_array`, etc.) so the consolidation does not introduce -cross-section coupling. - -## Drops / dismissals - -None. Every test from every folded file moved. The `release_gate` -marker now covers all 159 tests rather than the previous 134. 
- -## Verification - -``` -pytest xrspatial/geotiff/tests/release_gates/ -v -m release_gate -# 155 passed, 3 xfailed, 1 xpassed -pytest xrspatial/geotiff/tests/ -m release_gate -v -# same 159 tests selected -- no other file carries the marker now -``` - -`-m release_gate` from the wider tests root resolves to the single -registry file. Deletion: this file is removed in the final commit -before merge.
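A minimal sketch of the identity that `_assert_transform_shifted` pins (``T_full * translation(col_off, row_off)``), for readers checking the windowed-read gates by hand. It assumes the 6-tuple is in affine coefficient order ``(a, b, c, d, e, f)``; `shift_transform` is a hypothetical stand-in for the suite's `_expected_window_transform`, whose body is not shown in this patch.

```python
def shift_transform(t_full, col_off, row_off):
    """Return T_full * translation(col_off, row_off) as a 6-tuple.

    Assumption: coefficients are in affine order (a, b, c, d, e, f), so
    x = a*col + b*row + c and y = d*col + e*row + f.
    """
    a, b, c, d, e, f = t_full
    # Only the origin terms (c, f) move; pixel size and rotation are untouched.
    return (
        a, b, a * col_off + b * row_off + c,
        d, e, d * col_off + e * row_off + f,
    )


# Illustrative north-up raster: 10-unit pixels, origin at (500000, 4100000).
t_full = (10.0, 0.0, 500000.0, 0.0, -10.0, 4100000.0)

# A window starting 3 columns right and 2 rows down of the full origin.
t_win = shift_transform(t_full, col_off=3, row_off=2)
print(t_win)
```

Because every term here is a product and sum of the source coefficients, the gate can compare tuples exactly instead of with a tolerance, which is the point the docstring makes about not re-deriving the origin from the coordinate arrays.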