fix: Correct REST JSON serialization of UUID/HUGEINT/BLOB/BIT (#89) #92
Merged
Type-coverage audit follow-up. Several scalar types were mis-routed in the REST serializer:

- UUID -> read via the VARCHAR path (as a string pointer) but is physically a 128-bit int, causing a SEGFAULT on any UUID column. Now formatted as the canonical 8-4-4-4-12 hex string.
- HUGEINT/UHUGEINT -> read as 64-bit ints, truncating values and mis-striding multi-row chunks (e.g. MAX::HUGEINT -> -1). Now emitted as exact decimal strings (lossless; JSON numbers lose precision above 2^53).
- BLOB -> emitted raw bytes (invalid UTF-8 / invalid JSON). Now uses DuckDB's blob string form (printable bytes as-is, others as \xNN).
- BIT -> emitted raw bit storage as garbage. Now emitted as its 0/1 string.

All honor row validity (NULL -> null). Adds [scalar_types] regression tests.

VARINT/BIGNUM, GEOMETRY and VARIANT remain serialized as null (internal/extension encodings not safely convertible at the vector level); this is documented as a known limitation.

Found via type audit + codex review.
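A minimal Python sketch of the UUID, HUGEINT and BLOB conversions described above, for illustration only; it is not the extension's code. All function names are hypothetical. It assumes a 128-bit value arrives as a signed upper half and an unsigned lower half, that UUIDs are stored with the top bit of the upper half flipped (an assumption about DuckDB's physical layout), and that the blob string form leaves printable ASCII other than backslash and quotes unchanged (the exact printable set may differ).

```python
MASK64 = (1 << 64) - 1


def as_int64(bits: int) -> int:
    """Reinterpret the low 64 bits of `bits` as a signed 64-bit integer."""
    bits &= MASK64
    return bits - (1 << 64) if bits >> 63 else bits


def hugeint_to_decimal(upper: int, lower: int) -> str:
    """Combine both halves into an exact decimal string.

    `upper` is the signed high half for HUGEINT; passing an unsigned
    high half gives the UHUGEINT result with the same formula.
    """
    return str((upper << 64) | (lower & MASK64))


def uuid_to_string(upper: int, lower: int) -> str:
    """Format a 128-bit UUID as the canonical 8-4-4-4-12 hex string."""
    # Assumption: the stored upper half has its top bit flipped; undo it.
    hi = (upper ^ (1 << 63)) & MASK64
    h = f"{hi:016x}{lower & MASK64:016x}"
    return f"{h[:8]}-{h[8:12]}-{h[12:16]}-{h[16:20]}-{h[20:]}"


def blob_to_string(data: bytes) -> str:
    """Render bytes as JSON-safe text: printable as-is, others as \\xNN."""
    out = []
    for b in data:
        # Assumption: backslash and quotes are escaped along with
        # non-printable bytes so the result stays unambiguous.
        if 32 <= b <= 126 and chr(b) not in "\\'\"":
            out.append(chr(b))
        else:
            out.append(f"\\x{b:02X}")
    return "".join(out)


# MAX::HUGEINT is 2^127 - 1: upper = 2^63 - 1, lower = all ones.
max_upper, max_lower = (1 << 63) - 1, MASK64
print(as_int64(max_lower))                       # low half alone: -1
print(hugeint_to_decimal(max_upper, max_lower))  # exact decimal string
```

Reading only the low half of MAX::HUGEINT as a signed 64-bit integer yields -1, which matches the truncation symptom reported above; combining both halves yields the exact value.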
Summary
Follow-up to #90 / #91. A full audit of every DuckDB type against the REST JSON serializer turned up more mis-routed scalar types — including a crash:
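| Type | Before | Now |
| --- | --- | --- |
| UUID | Read via the VARCHAR path; segfault on any UUID column | Canonical 8-4-4-4-12 hex string |
| HUGEINT/UHUGEINT | Read as 64-bit ints; truncated and mis-strided (MAX::HUGEINT → -1) | Exact decimal string |
| BLOB | Raw bytes (invalid UTF-8 / invalid JSON) | DuckDB blob string form (\xNN for non-printable) |
| BIT | Raw bit storage (garbage) | "101010" string |

128-bit integers are emitted as strings (per maintainer decision) so values beyond 2⁶³ survive; JSON numbers lose precision above 2⁵³.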
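The precision argument can be checked with a short Python sketch. It simulates a JSON consumer that stores every number as an IEEE-754 double (as JavaScript does); the helper names are hypothetical and only illustrate why a string encoding is lossless where a number is not.

```python
import json


def roundtrip_as_number(value: int) -> int:
    """Emit as a JSON number, then read it back through a double."""
    text = json.dumps(value)
    return int(float(text))  # simulates a double-based JSON parser


def roundtrip_as_string(value: int) -> int:
    """Emit as a JSON string of decimal digits, then parse the digits."""
    text = json.dumps(str(value))
    return int(json.loads(text))


# Above 2^53, adjacent integers collapse to the same double.
print(float(2**53 + 1) == float(2**53))  # True

big = 2**63 + 1  # outside the signed 64-bit range
print(roundtrip_as_number(big) == big)   # False: precision lost
print(roundtrip_as_string(big) == big)   # True: lossless
```

The cost of this design is that clients must parse the string themselves, for example with a big-integer type, rather than reading a native JSON number.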
Verified correct (no change needed)
All standard scalars, DATE/TIME/TIMESTAMP (+ variants), DECIMAL, ENUM, INTERVAL, JSON, and the nested types LIST/STRUCT/ARRAY/UNION/MAP, including deeply nested combinations.

Known limitation
VARINT/BIGNUM, GEOMETRY, and VARIANT still serialize as null because their internal/extension encodings aren't safely convertible at the vector level. Documented in code; to_json(col) remains a workaround.

Test plan
[query_executor][scalar_types] regression tests: UUID (single + multi-row, no crash), HUGEINT/UHUGEINT exact decimals, BLOB, BIT, and NULLs.

Relates to #89