Configurable chunk writing mode #3848

@d-v-b

When writing chunks, we check for existing chunks and reconcile their contents with the chunks to be written. When an array is resized to a smaller shape, the old chunks can be left behind. If the array is then resized to a larger shape, it will "see" the old chunks and read them. For the regular chunk grid this is safe, but for the rectilinear chunk grid it may eventually become possible to resize an array so that its new chunks are incompatible with the chunks left behind by an earlier version of that same array. That would require letting our array resizing routines take a chunks parameter that specifies the new chunks.
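To make the stale-chunk behavior concrete, here is a toy sketch of a 1-D chunked array backed by a dict. It is not zarr's implementation: `ToyArray`, `CHUNK`, and `FILL` are hypothetical names, and the model assumes that resizing only updates the shape and never deletes stored chunks.

```python
# Toy 1-D model of a chunked array: a dict maps chunk index -> list of values.
# Illustration only; all names here are hypothetical, not zarr API.
CHUNK = 4  # chunk length
FILL = 0   # fill value for chunks that are not stored


class ToyArray:
    def __init__(self, length):
        self.length = length
        self.store = {}  # chunk index -> list of CHUNK values

    def write(self, start, values):
        # Write values element by element, creating chunks on demand.
        for offset, value in enumerate(values):
            idx, pos = divmod(start + offset, CHUNK)
            chunk = self.store.setdefault(idx, [FILL] * CHUNK)
            chunk[pos] = value

    def resize(self, length):
        # Assumption: only the shape changes; out-of-bounds chunks stay stored.
        self.length = length

    def read(self):
        out = []
        for i in range(self.length):
            idx, pos = divmod(i, CHUNK)
            chunk = self.store.get(idx)
            out.append(FILL if chunk is None else chunk[pos])
        return out


arr = ToyArray(8)
arr.write(0, [1, 2, 3, 4, 5, 6, 7, 8])
arr.resize(4)  # shrink: chunk 1 is now out of bounds but still stored
arr.resize(8)  # grow: chunk 1 is in bounds again
print(arr.read())  # [1, 2, 3, 4, 5, 6, 7, 8]
```

After the shrink and grow, the tail of the array holds the old values rather than the fill value, because chunk 1 was never removed from the store.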

In this case, reading old chunks (whose shapes are incompatible with the new chunk grid) may raise an error in the bytes codec, when we try to reshape the decoded bytes into the output chunk shape while writing a partial chunk. IMO we may want to expose a configuration option that clobbers existing chunks when performing partial writes. This could be fine-grained (only clobber a chunk when reshaping it fails) or coarse (writing any region of a chunk always clobbers, no matter what).
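The failure mode can be sketched without zarr. The `decode_chunk` function below is a hypothetical stand-in for a bytes codec, assuming little-endian int32 data; it only checks that the decoded element count fills the requested chunk shape, which is the same condition a reshape would enforce.

```python
import math
import struct


def decode_chunk(raw: bytes, shape: tuple[int, ...]) -> list[int]:
    """Hypothetical stand-in for a bytes codec (not zarr API).

    Decodes little-endian int32 values and checks that they fill `shape`.
    """
    count = len(raw) // 4
    values = list(struct.unpack(f"<{count}i", raw))
    expected = math.prod(shape)
    if count != expected:
        raise ValueError(f"cannot reshape {count} values into shape {shape}")
    return values


# A chunk written when the chunk shape was (2, 4): 8 values.
old_chunk = struct.pack("<8i", *range(8))

decode_chunk(old_chunk, (2, 4))  # decodes fine under the old chunk shape

try:
    # Same stored bytes, but the grid now expects a (3, 4) chunk at this key.
    decode_chunk(old_chunk, (3, 4))
except ValueError as exc:
    error = str(exc)
    print(error)
```

A partial write in merge mode has to decode the existing chunk first, so it would hit this error before any new data is written.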

I'm thinking we would have a new field in the array configuration like chunk_write_mode: "merge" | "overwrite". That would let us express today's behavior ("merge") and the coarse version of the proposal in this issue ("overwrite", i.e. always clobber).
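A rough sketch of how the two modes might differ for a partial write to a single 1-D chunk. Only the name chunk_write_mode and its two values come from this issue; `write_partial`, its parameters, and the dict-backed store are hypothetical and only meant to illustrate the semantics.

```python
from typing import Literal

ChunkWriteMode = Literal["merge", "overwrite"]


def write_partial(store, key, chunk_len, start, values,
                  mode: ChunkWriteMode = "merge", fill=0):
    """Hypothetical partial write to one 1-D chunk held in a dict store."""
    if start < 0 or start + len(values) > chunk_len:
        raise IndexError("write region does not fit inside the chunk")
    if mode == "merge":
        # Today's behavior: read the existing chunk and reconcile with it.
        existing = store.get(key)
        if existing is not None and len(existing) != chunk_len:
            raise ValueError("existing chunk is incompatible with the chunk shape")
        chunk = list(existing) if existing is not None else [fill] * chunk_len
    elif mode == "overwrite":
        # Coarse proposal: never read the existing chunk, start from fill values.
        chunk = [fill] * chunk_len
    else:
        raise ValueError(f"unknown chunk_write_mode: {mode!r}")
    chunk[start:start + len(values)] = values
    store[key] = chunk
    return chunk


# Compatible existing chunk: "merge" keeps the untouched region.
store = {"c/0": [1, 2, 3, 4]}
merged = write_partial(store, "c/0", 4, 0, [9, 9], mode="merge")

# Stale chunk with a different length: "overwrite" ignores it entirely.
stale = {"c/0": [1, 2, 3, 4, 5, 6]}
clobbered = write_partial(stale, "c/0", 4, 0, [9, 9], mode="overwrite")
```

In this sketch "overwrite" trades away the untouched part of a compatible chunk in exchange for never having to decode a chunk that may no longer fit the grid.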
