120 commits
4f63867
Add prospectus
maxrjones Mar 9, 2026
fb96207
Initial prospectus POC
maxrjones Mar 9, 2026
b3e72ec
V2 prospectus
maxrjones Mar 9, 2026
d6d551a
V3 prospectus
maxrjones Mar 9, 2026
8b8af74
Fastforward POC to V3
maxrjones Mar 9, 2026
784f4e7
Remove prospectus
maxrjones Mar 9, 2026
f1a1bc3
Fix sharding
maxrjones Mar 10, 2026
30fa867
Fix bugs
maxrjones Mar 10, 2026
1282c7b
Support sequence in array functions
maxrjones Mar 10, 2026
ea89f33
Add end-to-end tests
maxrjones Mar 10, 2026
2eba460
Collapse indexing paths
maxrjones Mar 10, 2026
fa07396
Add DimensionGrid protocol
maxrjones Mar 10, 2026
a0acd95
Remove the try/except escape hatch from ChunkGrid.chunk_shape
maxrjones Mar 10, 2026
ce0527d
Cache is_regular
maxrjones Mar 10, 2026
f433668
Produce RLE directly
maxrjones Mar 10, 2026
02fd7c5
Fix bugs
maxrjones Mar 10, 2026
42ef639
Separate chunk grid serialization
maxrjones Mar 10, 2026
55e720b
Retain comments
maxrjones Mar 10, 2026
1b7871d
Update block and coordinate indexing
maxrjones Mar 10, 2026
9c0f582
POC: TiledDimension
maxrjones Mar 10, 2026
1e2fa97
Support rectilinear shards
maxrjones Mar 10, 2026
1f424b0
Revert "POC: TiledDimension"
maxrjones Mar 11, 2026
0967e53
Fix __getitem__ for 1d chunk grids
maxrjones Mar 11, 2026
67a684d
Implement resize
maxrjones Mar 11, 2026
61c48a4
Fix spec compliance
maxrjones Mar 11, 2026
7d5ebb8
Fix .info
maxrjones Mar 11, 2026
e74586a
Fix typing
maxrjones Mar 11, 2026
8ab3ca8
Adopt joe's property testing strategy
maxrjones Mar 11, 2026
80d8280
Remove RegularChunkGrid
maxrjones Mar 11, 2026
ffc7805
Use none rather than sentinel value
maxrjones Mar 11, 2026
9beaee6
Remove regular chunk grid
maxrjones Mar 11, 2026
da2c08b
Fix boundary handling in VaryingDimension
maxrjones Mar 12, 2026
6d9de38
Add chunk_sizes property
maxrjones Mar 12, 2026
cc2999a
Add docs
maxrjones Mar 12, 2026
b47ddba
Improve polymorphism
maxrjones Mar 12, 2026
44d845f
Merge branch 'main' into poc/unified-chunk-grid
maxrjones Mar 12, 2026
2caa927
always return based on inner chunks
maxrjones Mar 12, 2026
6af91a6
Fix from_array
maxrjones Mar 12, 2026
e04d864
Add V3 of the prospectus
maxrjones Mar 12, 2026
8dcea81
Fastforward design docs
maxrjones Mar 12, 2026
4eb01c5
Require array extent
maxrjones Mar 12, 2026
d893d6f
Add overflow chunk tests
maxrjones Mar 12, 2026
308bb24
Design doc for chunk grid metadata separation
maxrjones Mar 12, 2026
0f52822
minor simplifications
maxrjones Mar 12, 2026
5823fbb
Gatekeep rectilinear chunks behind feature flag
maxrjones Mar 13, 2026
27f28e7
Fix off-by-one bug
maxrjones Mar 13, 2026
a35cf56
Fix chunk indexing boundary checks
maxrjones Mar 13, 2026
cbb28fe
Standardize docstrings
maxrjones Mar 13, 2026
280eb68
fix spec compliance
maxrjones Mar 13, 2026
e88c06b
Handle integer floats
maxrjones Mar 13, 2026
58bd336
More spec compliance
maxrjones Mar 13, 2026
e0fbab4
Fix block indexing error
maxrjones Mar 13, 2026
5277739
Add V2 regression tests
maxrjones Mar 13, 2026
c9858c0
Add comments
maxrjones Mar 14, 2026
9e4fa30
Consistent bounds checking between dimension types
maxrjones Mar 14, 2026
a21d587
use pre-computed extent
maxrjones Mar 14, 2026
e062580
Improve sharding validation logic
maxrjones Mar 14, 2026
3591734
Improve sharding validation logic
maxrjones Mar 14, 2026
4be96b0
Remove deferred design
maxrjones Mar 14, 2026
087382b
Update design doc
maxrjones Mar 14, 2026
38fd5aa
Remove unnecessary casts
maxrjones Mar 14, 2026
bbc0703
Improve typing
maxrjones Mar 14, 2026
460d683
Add another deferred item
maxrjones Mar 14, 2026
73164b6
Add to design doc
maxrjones Mar 14, 2026
aec0abd
Add design principles
maxrjones Mar 14, 2026
abb9d9d
Polish design doc
maxrjones Mar 14, 2026
aa002c8
Update migration sequence
maxrjones Mar 14, 2026
fffe4da
Remove stale sections
maxrjones Mar 14, 2026
6777ec5
Use TypeGuard
maxrjones Mar 14, 2026
adec422
Cache nchunks
maxrjones Mar 14, 2026
4903b09
Add cubed example
maxrjones Mar 14, 2026
67e540c
move chunk grid off metadata (#6)
d-v-b Mar 20, 2026
14370e6
Fixup after refactor
maxrjones Mar 20, 2026
bfc5d6b
Fixup
maxrjones Mar 20, 2026
0f78339
Remove duplicated code
maxrjones Mar 21, 2026
fa6980d
Add to experimental
maxrjones Mar 21, 2026
2360392
Avoid divide by zero
maxrjones Mar 21, 2026
21aa18b
Improve RLE validation
maxrjones Mar 21, 2026
90476b8
Improve RLE validation
maxrjones Mar 21, 2026
7e171f5
Raise error on unknown chunk grid
maxrjones Mar 21, 2026
b6b271f
Add utility function
maxrjones Mar 21, 2026
c19e9db
Minor improvements
maxrjones Mar 21, 2026
764eeaf
Update shorthand
maxrjones Mar 21, 2026
54b399d
Fix zero chunks
maxrjones Mar 21, 2026
becd392
Remove extraneous validation
maxrjones Mar 21, 2026
6764ba1
Improve tests
maxrjones Mar 21, 2026
9b36448
Improve docstrings
maxrjones Mar 21, 2026
11a47ff
Update design doc
maxrjones Mar 21, 2026
4d7c724
Update docs
maxrjones Mar 21, 2026
826e030
DRY
maxrjones Mar 21, 2026
6f51e1c
Add test
maxrjones Mar 21, 2026
879f20f
Simplify
maxrjones Mar 21, 2026
edbdb5d
Consistent .chunks and .shards
maxrjones Mar 21, 2026
4a940b1
Remove separators
maxrjones Mar 21, 2026
475de21
Polish
maxrjones Mar 21, 2026
5f24ce6
Merge branch 'main' into poc/unified-chunk-grid
d-v-b Mar 26, 2026
a5715b9
Improve layout of work in progress page (#3841)
dstansby Mar 27, 2026
3a9d042
perf: oindex optimization (#3830)
maxrjones Mar 29, 2026
1c2efa6
chore: spec0 compat (python 3.14 compat, python 3.12 min) (#3564)
ilan-gold Mar 27, 2026
1e66846
fix: remove numcodecs off-spec warning (#3833)
slevang Mar 27, 2026
f2ed1c4
Improve design doc
maxrjones Mar 29, 2026
a857bac
Merge branch 'main' into poc/unified-chunk-grid
maxrjones Mar 29, 2026
7318255
Add demo notebook
maxrjones Mar 30, 2026
662ceef
Add release notes
maxrjones Mar 30, 2026
994e329
Fix indexing empty slices
maxrjones Mar 30, 2026
f1c5182
Add open support
maxrjones Mar 30, 2026
c4f7cf4
Improve release note
maxrjones Mar 30, 2026
6436db6
Create and use resolve_chunks functions
maxrjones Mar 30, 2026
26b4760
Make chunk_grid property private
maxrjones Mar 30, 2026
5a7280b
Improve docstrings
maxrjones Mar 30, 2026
60ad5cb
Update notebook
maxrjones Mar 30, 2026
f2ec718
Remove dead code
maxrjones Mar 30, 2026
2c06fb2
chore: simplify sharding codec validation against varying chunk grid …
d-v-b Mar 30, 2026
8965d09
refactor: allow regular-style chunk grid declaration for rectilinear …
d-v-b Mar 30, 2026
b3b5933
Fix typo
maxrjones Mar 30, 2026
f80f798
Normalize
maxrjones Mar 30, 2026
3327cc1
Remove shim
maxrjones Mar 30, 2026
5642e03
Consistent typing
maxrjones Mar 30, 2026
6a4c01a
Move design doc outside public docs
maxrjones Mar 30, 2026
e3ba71f
Update config.md
maxrjones Mar 30, 2026
11 changes: 11 additions & 0 deletions changes/3802.feature.md
@@ -0,0 +1,11 @@
Add support for rectilinear (variable-sized) chunk grids. This feature is experimental and
must be explicitly enabled via ``zarr.config.set({'array.rectilinear_chunks': True})``.

Rectilinear chunks can be used through:

- **Creating arrays**: Pass nested sequences (e.g., ``[[10, 20, 30], [50, 50]]``) to ``chunks``
  in ``zarr.create_array``, ``zarr.from_array``, ``zarr.zeros``, ``zarr.ones``, ``zarr.full``,
  ``zarr.open``, and related functions, or to ``chunk_shape`` in ``zarr.create``.
- **Opening existing arrays**: Arrays stored with the ``rectilinear`` chunk grid are read
  transparently via ``zarr.open`` and ``zarr.open_array``.
- **Rectilinear sharding**: Shard boundaries can be rectilinear while inner chunks remain regular.
619 changes: 619 additions & 0 deletions design/chunk-grid.md

Large diffs are not rendered by default.

165 changes: 165 additions & 0 deletions docs/user-guide/arrays.md
@@ -599,6 +599,171 @@ In this example a shard shape of (1000, 1000) and a chunk shape of (100, 100) is
This means that `10*10` chunks are stored in each shard, and there are `10*10` shards in total.
Without the `shards` argument, there would be 10,000 chunks stored as individual files.

## Rectilinear (variable) chunk grids

!!! warning "Experimental"
    Rectilinear chunk grids are an experimental feature and may change in
    future releases. This feature is expected to stabilize in Zarr version 3.3.

Because the feature is still stabilizing, it is disabled by default and
must be explicitly enabled:

```python
import zarr
zarr.config.set({"array.rectilinear_chunks": True})
```

Or via the environment variable `ZARR_ARRAY__RECTILINEAR_CHUNKS=True`.
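
The environment variable name above follows a naming pattern that can be sketched in plain Python. This is only an illustration: `config_key_to_env_var` is a hypothetical helper, not part of Zarr, and it assumes that nested config keys are joined with a double underscore and upper-cased after a `ZARR_` prefix.

```python
def config_key_to_env_var(key: str, prefix: str = "ZARR") -> str:
    """Map a dotted config key to an environment variable name (hypothetical helper)."""
    # Assumption: nesting levels are joined with "__" and upper-cased.
    return f"{prefix}_" + "__".join(part.upper() for part in key.split("."))


print(config_key_to_env_var("array.rectilinear_chunks"))
# ZARR_ARRAY__RECTILINEAR_CHUNKS
```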

The examples below assume this config has been set.

By default, Zarr arrays use a regular chunk grid where every chunk along a
given dimension has the same size (except possibly the final boundary chunk).
Rectilinear chunk grids allow each chunk along a dimension to have a different
size. This is useful when the natural partitioning of the data is not uniform —
for example, satellite swaths of varying width, time series with irregular
intervals, or spatial tiles of different extents.

### Creating arrays with rectilinear chunks

To create an array with rectilinear chunks, pass a nested list to the `chunks`
parameter where each inner list gives the chunk sizes along one dimension:

```python exec="true" session="arrays" source="above" result="ansi"
zarr.config.set({"array.rectilinear_chunks": True})
z = zarr.create_array(
    store=zarr.storage.MemoryStore(),
    shape=(60, 100),
    chunks=[[10, 20, 30], [50, 50]],
    dtype='int32',
)
print(z.info)
```

In this example the first dimension is split into three chunks of sizes 10, 20,
and 30, while the second dimension is split into two equal chunks of size 50.
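
To make the layout concrete, the chunk sizes can be turned into index ranges with a running sum. The following is a minimal standalone sketch in plain Python; `chunk_bounds` is a hypothetical helper for illustration, not a Zarr function.

```python
from itertools import accumulate


def chunk_bounds(sizes):
    """Return (start, stop) index pairs for consecutive chunks of the given sizes."""
    stops = list(accumulate(sizes))  # exclusive end offset of each chunk
    starts = [0] + stops[:-1]        # each chunk starts where the previous one ends
    return list(zip(starts, stops))


shape = (60, 100)
chunks = [[10, 20, 30], [50, 50]]

# In this example the chunk sizes along each dimension sum to the array extent.
assert all(sum(sizes) == extent for sizes, extent in zip(chunks, shape))

for axis, sizes in enumerate(chunks):
    print(axis, chunk_bounds(sizes))
# 0 [(0, 10), (10, 30), (30, 60)]
# 1 [(0, 50), (50, 100)]
```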

### Reading and writing data

Rectilinear arrays support the same indexing interface as regular arrays.
Reads and writes that cross chunk boundaries of different sizes are handled
automatically:

```python exec="true" session="arrays" source="above" result="ansi"
import numpy as np
data = np.arange(60 * 100, dtype='int32').reshape(60, 100)
z[:] = data
# Read a slice that spans the first two chunks (sizes 10 and 20) along axis 0
print(z[5:25, 0:5])
```
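
One way to picture how a slice maps onto chunks of different sizes is a binary search over the cumulative chunk offsets. The sketch below is plain Python and only illustrates the idea; `chunks_touched` is a hypothetical helper, it assumes the range lies within the array bounds, and it is not a description of Zarr's internal indexing code.

```python
from bisect import bisect_right
from itertools import accumulate


def chunks_touched(sizes, start, stop):
    """Return indices of the chunks overlapped by the half-open range [start, stop)."""
    if start >= stop:
        return []
    stops = list(accumulate(sizes))       # exclusive end offset of each chunk
    first = bisect_right(stops, start)    # chunk containing `start`
    last = bisect_right(stops, stop - 1)  # chunk containing the last selected element
    return list(range(first, last + 1))


# The slice z[5:25, 0:5] from the example above:
print(chunks_touched([10, 20, 30], 5, 25))  # axis 0: [0, 1]
print(chunks_touched([50, 50], 0, 5))       # axis 1: [0]
```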

### Inspecting chunk sizes

The `.write_chunk_sizes` property returns the actual data size of each storage
chunk along every dimension. It works for both regular and rectilinear arrays
and returns a tuple of tuples (matching the dask `Array.chunks` convention).
When sharding is used, `.read_chunk_sizes` returns the inner chunk sizes instead:

```python exec="true" session="arrays" source="above" result="ansi"
print(z.write_chunk_sizes)
```

For regular arrays, this includes the boundary chunk:

```python exec="true" session="arrays" source="above" result="ansi"
z_regular = zarr.create_array(
    store=zarr.storage.MemoryStore(),
    shape=(100, 80),
    chunks=(30, 40),
    dtype='int32',
)
print(z_regular.write_chunk_sizes)
```
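
The per-dimension sizes of a regular grid, including a smaller boundary chunk, can be derived from the extent and the chunk size alone. This is a standalone sketch of that arithmetic in plain Python; `regular_chunk_sizes` is a hypothetical helper, not a Zarr function.

```python
def regular_chunk_sizes(extent, chunk):
    """Return the data size of every chunk along one regularly chunked dimension."""
    full, remainder = divmod(extent, chunk)
    # `full` complete chunks, plus one smaller boundary chunk if anything is left over.
    return (chunk,) * full + ((remainder,) if remainder else ())


shape, chunks = (100, 80), (30, 40)
print(tuple(regular_chunk_sizes(e, c) for e, c in zip(shape, chunks)))
# ((30, 30, 30, 10), (40, 40))
```

Along the first axis, the 100 elements split into three full chunks of 30 and a boundary chunk of 10.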

Note that the `.chunks` property is only available for regular chunk grids. For
rectilinear arrays, use `.write_chunk_sizes` (or `.read_chunk_sizes`) instead.

### Resizing and appending

Rectilinear arrays can be resized. When the new extent exceeds the sum of the
current chunk sizes, a new chunk is appended to cover the additional extent.
When shrinking, the chunk sizes are preserved and only the array extent changes
(chunks that lie entirely beyond the new extent simply become inactive):

```python exec="true" session="arrays" source="above" result="ansi"
z = zarr.create_array(
    store=zarr.storage.MemoryStore(),
    shape=(30,),
    chunks=[[10, 20]],
    dtype='float64',
)
z[:] = np.arange(30, dtype='float64')
print(f"Before resize: chunk_sizes={z.write_chunk_sizes}")
z.resize((50,))
print(f"After resize: chunk_sizes={z.write_chunk_sizes}")
```

The `append` method also works with rectilinear arrays:

```python exec="true" session="arrays" source="above" result="ansi"
z.append(np.arange(10, dtype='float64'))
print(f"After append: shape={z.shape}, chunk_sizes={z.write_chunk_sizes}")
```
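
The resize rule described in this section can be sketched for a single dimension as follows. This is a plain-Python illustration of the stated behaviour, not Zarr's implementation; `resize_edges` is a hypothetical helper, and treating `append` as a grow by the appended length is an assumption.

```python
def resize_edges(edges, new_extent):
    """Return the chunk sizes of one dimension after resizing to `new_extent`."""
    covered = sum(edges)
    if new_extent > covered:
        # Growing past the covered range: one new chunk takes up the remainder.
        return [*edges, new_extent - covered]
    # Shrinking (or growing within the covered range): chunk sizes are kept as they are.
    return list(edges)


edges = resize_edges([10, 20], 50)
print(edges)                    # [10, 20, 20]
print(resize_edges(edges, 60))  # [10, 20, 20, 10], e.g. after appending 10 elements
print(resize_edges(edges, 25))  # [10, 20, 20], only the extent shrinks
```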

### Compressors and filters

Rectilinear arrays work with all codecs — compressors, filters, and checksums.
Since each chunk may have a different size, the codec pipeline processes each
chunk independently:

```python exec="true" session="arrays" source="above" result="ansi"
z = zarr.create_array(
    store=zarr.storage.MemoryStore(),
    shape=(60, 100),
    chunks=[[10, 20, 30], [50, 50]],
    dtype='float64',
    filters=[zarr.codecs.TransposeCodec(order=(1, 0))],
    compressors=[zarr.codecs.BloscCodec(cname='zstd', clevel=3)],
)
z[:] = np.arange(60 * 100, dtype='float64').reshape(60, 100)
np.testing.assert_array_equal(z[:], np.arange(60 * 100, dtype='float64').reshape(60, 100))
print("Roundtrip OK")
```

### Rectilinear shard boundaries

Rectilinear chunk grids can also be used for shard boundaries when combined
with sharding. In this case, the outer grid (shards) is rectilinear while the
inner chunks remain regular. Along each dimension, every shard size must be
divisible by the corresponding inner chunk size:

```python exec="true" session="arrays" source="above" result="ansi"
z = zarr.create_array(
    store=zarr.storage.MemoryStore(),
    shape=(120, 100),
    chunks=(10, 10),
    shards=[[60, 40, 20], [50, 50]],
    dtype='int32',
)
z[:] = np.arange(120 * 100, dtype='int32').reshape(120, 100)
print(z[50:70, 40:60])
```

Note that rectilinear inner chunks with sharding are not supported — only the
shard boundaries can be rectilinear.
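
The divisibility requirement can be checked before creating an array. The following sketch shows one way to do that in plain Python; `inner_chunks_per_shard` is a hypothetical helper, and the error message is illustrative rather than the one Zarr raises.

```python
def inner_chunks_per_shard(shard_sizes, chunk_size):
    """Return how many inner chunks fit in each shard along one dimension."""
    counts = []
    for size in shard_sizes:
        if size % chunk_size != 0:
            raise ValueError(
                f"shard size {size} is not divisible by inner chunk size {chunk_size}"
            )
        counts.append(size // chunk_size)
    return counts


# Shards and inner chunks from the example above.
print(inner_chunks_per_shard([60, 40, 20], 10))  # [6, 4, 2]
print(inner_chunks_per_shard([50, 50], 10))      # [5, 5]
```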

### Metadata format

Rectilinear chunk grid metadata uses run-length encoding (RLE) for compact
serialization. When reading metadata, both bare integers and `[value, count]`
pairs are accepted:

- `[10, 20, 30]` — three chunks with explicit sizes
- `[[10, 3]]` — three chunks of size 10 (RLE shorthand)
- `[[10, 3], 5]` — three chunks of size 10, then one chunk of size 5

When writing, Zarr automatically compresses repeated values into RLE format.
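
The encoding above can be sketched with two small functions. This is an illustrative plain-Python version, not Zarr's serialization code: `expand_rle` and `compress_rle` are hypothetical names, and writing runs of length one as bare integers is an assumption based on the `[[10, 3], 5]` example.

```python
from itertools import groupby


def expand_rle(encoded):
    """Expand bare integers and [value, count] pairs into explicit chunk sizes."""
    sizes = []
    for item in encoded:
        if isinstance(item, int):
            sizes.append(item)             # bare integer: a single chunk
        else:
            value, count = item
            sizes.extend([value] * count)  # [value, count]: `count` chunks of size `value`
    return sizes


def compress_rle(sizes):
    """Collapse runs of repeated sizes into [value, count] pairs."""
    encoded = []
    for value, run in groupby(sizes):
        count = len(list(run))
        # Assumption: a run of length one is written as a bare integer.
        encoded.append(value if count == 1 else [value, count])
    return encoded


print(expand_rle([[10, 3], 5]))        # [10, 10, 10, 5]
print(compress_rle([10, 10, 10, 5]))   # [[10, 3], 5]
```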

## Missing features in 3.0

The following features have not been ported to 3.0 yet.
1 change: 1 addition & 0 deletions docs/user-guide/config.md
@@ -30,6 +30,7 @@ Configuration options include the following:
- Default Zarr format `default_zarr_version`
- Default array order in memory `array.order`
- Whether empty chunks are written to storage `array.write_empty_chunks`
- Enable experimental rectilinear chunks `array.rectilinear_chunks`
- Async and threading options, e.g. `async.concurrency` and `threading.max_workers`
- Selections of implementations of codecs, codec pipelines and buffers
- Enabling GPU support with `zarr.config.enable_gpu()`. See GPU support for more.