Add support for structured dtypes to zarr3 driver, open structs as void#271
Add support for structured dtypes to zarr3 driver, open structs as void#271BrianMichell wants to merge 123 commits intogoogle:masterfrom
zarr3 driver, open structs as void#271Conversation
…t for raw bits dtype
Implement shim for `open_as_void` driver level flag
* Begin removing void field shim * Fully removed void string shim * Cleanup debug prints * Remove shimmed validation * Remove unnecessary comment * Prefer false over zero for ternary clarity
* Implement a more general and portable example set * Fix driver cache bug * Update example for template * Cleanup example * Remove testing examples from source
* Use the appropriate fill value for open_as_void structured data * Cleanup
|
I'll try to get to this in about a week, before I look this one over, please double check that the prior PR works for you. Also look over this one and see if any of the suggestions from the other one applies. |
Matches the pattern from zarr v2 driver (PR google#272). When both "field" and "open_as_void" are specified in the spec, return an error since these options are mutually exclusive - field selects a specific field from a structured array, while open_as_void provides raw byte access to the entire structure.
The zarr3 URL syntax cannot represent field selection or void access mode. Following the pattern from zarr v2 driver (PR google#272), ToUrl() now returns an error when either of these options is specified instead of silently ignoring them.
…trip Following the pattern from zarr v2 driver (PR google#272), override GetBoundSpecData in ZarrDataCache to set spec.open_as_void from ChunkCacheImpl::open_as_void_. This ensures that when you open a store with open_as_void=true and then call spec(), the resulting spec correctly has open_as_void=true set. Without this fix, opening a store with open_as_void=true and then getting its spec would lose the open_as_void flag, causing incorrect behavior if the spec is used to re-open the store.
|
Ok, I think that I'm running out of things to point out. We may want to wait for @jbms to look at it again and see what he sees; he may be out for a week or so. |
|
Ok, so back to open_as_void. Now that you have a r* data type, open_as_void is basically replacing the bytes codec with one where the new bytes codec is just the r. What happens if the implementation basically becomes that? (remember I asked about open_as_void separation way back when...) I think that if open_as_void repaces the "bytes" codec with an equivalent using the "r" data type, the chunk cache no longer cares about open_as_void. |
| // For structured types (byte dtype with extra dimension), user-specified shapes | ||
| // may not include the bytes dimension. Shapes [4,4] and [4,4,3] are compatible | ||
| // if the first N elements match. | ||
| struct TryMergeSubChunkShape { |
There was a problem hiding this comment.
Hm, why do we need this here? The codec should have the correct shape?
|
In general I think this should be implemented along the lines of: (1) Supporting an extra "field shape" similar to zarr v2, which is needed to support "raw" data types independent of open_as_void. (2) Resolving the codec pipeline in a special way when open_as_void is specified. The chunk cache, and other parts that are used after resolving the codec pipeline, need to be aware of |
Supersedes #264
Resolves comments 1 and 2