Fix read_geotiff_gpu dim mismatch on stripped multi-band TIFFs#1525
Merged
brendancol merged 1 commit intoxarray-contrib:mainxarray-contrib/xarray-spatial:mainfrom May 9, 2026
brendancol:fix/gpu-stripped-multiband-dimsbrendancol/xarray-spatial:fix/gpu-stripped-multiband-dimsCopy head branch name to clipboard
Merged
Fix read_geotiff_gpu dim mismatch on stripped multi-band TIFFs#1525brendancol merged 1 commit intoxarray-contrib:mainxarray-contrib/xarray-spatial:mainfrom brendancol:fix/gpu-stripped-multiband-dimsbrendancol/xarray-spatial:fix/gpu-stripped-multiband-dimsCopy head branch name to clipboard
brendancol merged 1 commit intoxarray-contrib:mainxarray-contrib/xarray-spatial:mainfrom
brendancol:fix/gpu-stripped-multiband-dimsbrendancol/xarray-spatial:fix/gpu-stripped-multiband-dimsCopy head branch name to clipboard
Conversation
The stripped fallback inside read_geotiff_gpu hardcoded dims=['y','x']
even when the CPU read returned a 3-D (y, x, band) array for multi-band
stripped files. That raised a ValueError about dimension count.
Mirror the tiled-branch logic: pick ('y', 'x', 'band') with a band coord
when the CPU array is 3-D, otherwise keep the 2-D ('y', 'x') path.
Adds xrspatial/geotiff/tests/test_gpu_stripped_multiband.py covering the
3-band uint8, 2-band uint16, and single-band sanity cases. Audit pass
following PRs xarray-contrib#1523/xarray-contrib#1524.
6 tasks
There was a problem hiding this comment.
Pull request overview
This PR fixes a shape/dimension mismatch in read_geotiff_gpu for stripped (non-tiled) multi-band GeoTIFFs. Previously, the stripped-file branch always constructed a 2-D DataArray with dims=['y', 'x'], even when the CPU reader returned a 3-D (y, x, band) array, causing an xarray ValueError. The change makes the stripped path mirror the already-correct tiled path behavior.
Changes:
- Update
read_geotiff_gpustripped-file fallback to setdims=('y','x','band')and add abandcoord when the uploaded GPU array is 3-D. - Add GPU regression tests covering stripped multi-band round-trips (uint8/uint16) and ensuring single-band remains 2-D.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
xrspatial/geotiff/__init__.py |
Fix stripped (non-tiled) read_geotiff_gpu DataArray construction to handle 3-D multi-band arrays consistently with the tiled path. |
xrspatial/geotiff/tests/test_gpu_stripped_multiband.py |
Add regression tests for stripped multi-band GPU reads and a sanity check for single-band behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
brendancol
added a commit
to brendancol/xarray-spatial
that referenced
this pull request
May 9, 2026
Two related multi-band accuracy bugs from the geotiff audit: A2 (HIGH): GPU tiled reads of planar=2 files (separate band planes) returned wrong pixel values. gpu_decode_tiles iterates a single TileOffsets list with bytes_per_pixel = itemsize * samples and assumes chunky interleave. For planar=2, each band has its own contiguous run of tiles in TileOffsets, so the kernel read across band boundaries and produced garbage. read_geotiff_gpu now detects planar==2 and decodes each band's tile slab as a single-band image, then stacks the results into (H, W, samples). The chunky planar=1 path is unchanged. A3 (MEDIUM): The GPU stripped multi-band fallback hardcoded dims=['y','x'] even for 3-D CPU reads, mirroring the pre-xarray-contrib#1525 bug. The same fix lands here so the planar=2 stripped tests pass; this matches the change in PR xarray-contrib#1525 verbatim. Both paths assert the assembled shape equals (H, W, samples) before returning, so a future kernel regression surfaces immediately rather than producing garbage. Adds xrspatial/geotiff/tests/test_planar_multiband.py covering 53 combinations: planar in {separate, contig} x layout in {tiled, stripped} x bands in {2, 3, 4} x dtype in {uint8, uint16} x backend in {CPU, GPU}, plus single-band sanity checks and an explicit A3 repro. GPU tests skip cleanly when CUDA is unavailable. Seeds are fixed. Refs audit issues A2, A3.
brendancol
added a commit
that referenced
this pull request
May 9, 2026
* Handle PlanarConfiguration=2 across CPU + GPU multi-band reads Two related multi-band accuracy bugs from the geotiff audit: A2 (HIGH): GPU tiled reads of planar=2 files (separate band planes) returned wrong pixel values. gpu_decode_tiles iterates a single TileOffsets list with bytes_per_pixel = itemsize * samples and assumes chunky interleave. For planar=2, each band has its own contiguous run of tiles in TileOffsets, so the kernel read across band boundaries and produced garbage. read_geotiff_gpu now detects planar==2 and decodes each band's tile slab as a single-band image, then stacks the results into (H, W, samples). The chunky planar=1 path is unchanged. A3 (MEDIUM): The GPU stripped multi-band fallback hardcoded dims=['y','x'] even for 3-D CPU reads, mirroring the pre-#1525 bug. The same fix lands here so the planar=2 stripped tests pass; this matches the change in PR #1525 verbatim. Both paths assert the assembled shape equals (H, W, samples) before returning, so a future kernel regression surfaces immediately rather than producing garbage. Adds xrspatial/geotiff/tests/test_planar_multiband.py covering 53 combinations: planar in {separate, contig} x layout in {tiled, stripped} x bands in {2, 3, 4} x dtype in {uint8, uint16} x backend in {CPU, GPU}, plus single-band sanity checks and an explicit A3 repro. GPU tests skip cleanly when CUDA is unavailable. Seeds are fixed. Refs audit issues A2, A3. * Address Copilot review on #1532 Seven findings, all acted on: - Sparse-tile bypass: planar=2 branch ran before the has_sparse_tile check, so a planar=2 file with sparse tiles would hit the GPU path that doesn't handle sparse blocks. Add `not has_sparse_tile` to the planar=2 gate so sparse always routes to the CPU reader. - Strict offset-count check: planar=2 required len(offsets) == tiles_per_band * samples, but validate_tile_layout only requires >=. Files with extra trailing entries that pass the CPU reader were rejected here. Relax to a minimum-length check and slice the first tiles_per_band * samples entries. - Auto-mode fallback semantics: stage-2 GPU failure in _gpu_decode_single_band_tiles raised unconditionally instead of letting the caller fall back to a whole-image CPU read like the chunky path does. Helper now returns None on stage-2 auto failure; caller catches the signal, runs read_to_array + cupy.asarray, and matches the chunky path's gpu='auto' contract. Strict mode still re-raises. - Per-band file re-reads: the helper called src.read_all() for each band's stage-2 fallback, doing N redundant full-file reads on a planar=2 image with N bands. Caller now does read_all() once before the band loop and passes the bytes via a new shared_data parameter. - Unused params: drop overview_level and cupy from the helper signature; neither was referenced. - Two assert statements for runtime shape validation replaced with explicit `if ...: raise RuntimeError(...)` so the checks survive python -O. 53 planar tests + 303 multi/band/planar/strip/tile/sparse regression tests still pass. * Switch shared file read to lazy reader The previous fix did read_all() unconditionally before the band loop to amortise a stage-2 fallback across bands. When every band's GDS path succeeds, those bytes are never used. Replace the eager read_all() with a closure that materialises the buffer on first call and caches it, so: - Every-band-GDS-succeeds case: 0 read_all() calls (was 1). - Any-band-falls-back case: exactly 1 read_all() call regardless of how many bands fall back (was N). Same Copilot-finding-#4 fix, just lazier.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The stripped-fallback branch in
read_geotiff_gpuhardcodeddims=['y','x']even when the underlying CPU reader returned a 3-D(y, x, band)array for multi-band stripped files. That raisedValueError: dimensions ('y', 'x') must have the same length as the number of data dimensions, ndim=3.The fix mirrors the tiled-branch logic already used a few lines later in the same function: when
arr_gpu.ndim == 3use('y', 'x', 'band')and add abandcoord; otherwise keep('y', 'x').Reproducer
Tests
xrspatial/geotiff/tests/test_gpu_stripped_multiband.pycovers:Tests use the same
cupy + CUDAskip pattern astest_gpu_byteswap_1508.py.Context
Audit pass following PRs #1523 / #1524 on the geotiff GPU read path.
Test plan