Allow snapshot-id in assert-ref-snapshot-id requirement to serialize to null in json#2343
Merged
kevinjqliu merged 5 commits intoapache:mainfrom Aug 19, 2025
Merged
Conversation
kevinjqliu
approved these changes
Aug 19, 2025
Contributor
kevinjqliu
left a comment
There was a problem hiding this comment.
LGTM! Thanks for the PR! This makes sense.
pyiceberg/table/update/__init__.py
Outdated
Comment on lines
759
to
765
| # override the override method, allowing None to serialize to `null` instead of being omitted. | ||
| def model_dump_json( | ||
| self, exclude_none: bool = False, exclude: Optional[Set[str]] = None, by_alias: bool = True, **kwargs: Any | ||
| ) -> str: | ||
| return super().model_dump_json( | ||
| exclude_none=exclude_none, exclude=self._exclude_private_properties(exclude), by_alias=by_alias, **kwargs | ||
| ) |
Contributor
There was a problem hiding this comment.
nit: i think we can simply this to
def model_dump_json(self) -> str:
# `snapshot-id` is required in json response, even if null
return super().model_dump_json(exclude_none=False)
Contributor
|
looks like linter errored too, |
Fokko
requested changes
Aug 19, 2025
Contributor
Fokko
left a comment
There was a problem hiding this comment.
Isn't the spec off here?
AssertRefSnapshotId:
description:
The table branch or tag identified by the requirement's `ref` must reference the requirement's `snapshot-id`;
if `snapshot-id` is `null` or missing, the ref must not already exist
allOf:
- $ref: '#/components/schemas/TableRequirement'
required:
- ref
- snapshot-id
properties:
type:
type: string
const: "assert-ref-snapshot-id"
ref:
type: string
snapshot-id:
type: integer
format: int64When I read the description, the snapshot-id might be null or missing.
Contributor
|
yea i read the same, but its marked as required so it can be null/missing but needs to exist in |
Contributor
|
Ah, like that. Yes, I agree. Still the description of the spec could use an update, since it cannot be missing 🤣 |
Fokko
approved these changes
Aug 19, 2025
Fokko
reviewed
Aug 19, 2025
Contributor
Contributor
Author
|
@Fokko @kevinjqliu thanks for the fast review and merge! |
This was referenced Aug 19, 2025
This was referenced Aug 22, 2025
kevinjqliu
added a commit
that referenced
this pull request
Aug 25, 2025
Closes #2342 # Rationale for this change The last attempt at fixing this (#2343) did not actually end up serializing None values to `null`. This was b/c the `AssertRefSnapshotId` requirement was often a child object of some other `Request` object, which was actually getting the `model_dump_json()` call. This meant that the `exclude_none=True` override would remove any `None`s _before_ any of the `Field`-specific configurations would be considered (like `exclude=False` or the model's own `model_dump_json` method). I then investigated setting some `ConfigDict` settings on the model, but that also did not have any useful configurations. I looked into setting `exclude_none=True` at all of the `model_json_dump()` callsites but there were too many and it would be easy to forget to set that value every time. The issue is that many values are _supposed to be absent_ to instruct the REST Catalog about what the client wants (things like filtering, for instance). ~The solution I ended up with was allowing `snapshot_id` to be a required `int` with a default value of `-1`, and adding a json-specific field serializer to write out `null` when dumping to json. This seems to work~ The solution that worked and allowed for `snapshot_id` to be set to `None` (as it's done in Ray Data), was to define a `model_serializer` method and always return the fields. This was based on conversations in pydantic/pydantic#7326. ## Are these changes tested? Yes ## Are there any user-facing changes? Actually allow `snapshot-id` to be null for `assert-ref-snapshot-id` requirements --------- Co-authored-by: Kevin Liu <kevin.jq.liu@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #2342
Rationale for this change
The OpenAPI spec specifies that
snapshot-idis required 1, even if it'snull. The current code omits the field b/cexclude_none=Trueis set for all models that extendIcebergBaseModel. Settingexclude=Falseon the field doesn't override the behavior and thus the model needs to control it's own serialization. This is the only model I know of so far that has this "required null" problem so I held back from making another class for this model to extend.Are these changes tested?
Yes
Are there any user-facing changes?
Set
snapshot-id: nullwhen committing tables that don't have a current snapshot-id.