fix(JSONValidator): make JSONValidator thread-safe for parallel spec validation #95

Merged: rederik76 merged 8 commits into main from fix/json-validator-thread-safe-ref-resolver on Jun 1, 2026
@rederik76
Collaborator

Use a per-thread RefResolver and Draft7Validator so concurrent validate() calls from DataflowSpecBuilder and SpecMapper do not corrupt jsonschema ref resolution (e.g. KeyError on definitions/views).
Reuse the resolver on each thread to keep referenced schema files cached after the first load on that worker.
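For readers unfamiliar with the pattern, here is a minimal sketch of per-thread validator reuse using `threading.local`. It is not the code from this PR: `PerThreadValidator`, `make_draft7_validator`, and the `base_uri` parameter are hypothetical names chosen for illustration. The only library calls assumed are `jsonschema.RefResolver(base_uri=..., referrer=...)` and `jsonschema.Draft7Validator(schema, resolver=...)`; note that `RefResolver` is deprecated in newer jsonschema releases.

```python
import threading
from typing import Any, Callable


def make_draft7_validator(schema: dict, base_uri: str) -> Any:
    """Build a fresh resolver/validator pair (hypothetical helper).

    jsonschema is imported lazily so the rest of this sketch runs without it.
    """
    from jsonschema import Draft7Validator, RefResolver

    resolver = RefResolver(base_uri=base_uri, referrer=schema)
    return Draft7Validator(schema, resolver=resolver)


class PerThreadValidator:
    """Gives each thread its own validator, built on first use and then reused."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory
        self._local = threading.local()  # attributes set here are per-thread

    def get(self) -> Any:
        validator = getattr(self._local, "validator", None)
        if validator is None:
            # First call on this thread: build the validator and keep it, so
            # anything it caches (e.g. resolved schema files) stays on this thread.
            validator = self._factory()
            self._local.validator = validator
        return validator

    def validate(self, instance: Any) -> None:
        self.get().validate(instance)
```

A caller would construct it once, for example `PerThreadValidator(lambda: make_draft7_validator(schema, base_uri))`, and share that single object across worker threads. No lock is needed because no resolver state is shared between threads.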

rederik76 and others added 8 commits May 20, 2026 14:22
…t directory enumeration in historical CDC snapshot

- _list_files now tries dbutils.fs.ls() first; falls back to Spark
  binaryFile on any exception (e.g. Py4JSecurityException in Serverless
  with Restricted Access / SEG)
- Fix bug where dbutils.fs().ls() was called with parentheses on fs
- binaryFile fallback stops at .parquet directories and deduplicates
  part files so each snapshot version is counted once
- dbutils path also guards against recursing into .parquet directories
  (trailing "/" stripped before the endswith check)
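The listing behaviour described in the bullets above can be sketched in plain Python, without Spark or `dbutils`. This is an illustration under assumptions, not the repository's `_list_files`: the function names and the injected `primary`/`fallback` callables are hypothetical stand-ins for the `dbutils.fs.ls()` path and the Spark `binaryFile` path.

```python
from typing import Callable, Iterable, List


def list_with_fallback(
    primary: Callable[[str], List[str]],
    fallback: Callable[[str], List[str]],
    path: str,
) -> List[str]:
    """Try the primary lister; on any exception use the fallback instead."""
    try:
        return primary(path)
    except Exception:
        return fallback(path)


def is_parquet_dir(name: str) -> bool:
    """Strip a trailing "/" before the suffix check, as directory listings may include it."""
    return name.rstrip("/").endswith(".parquet")


def collapse_to_snapshots(file_paths: Iterable[str]) -> List[str]:
    """Map each file to its first enclosing .parquet directory and deduplicate.

    Part files inside the same .parquet directory collapse to one entry,
    so each snapshot version appears once. Input order is preserved.
    """
    seen = set()
    snapshots = []
    for path in file_paths:
        trimmed = path.rstrip("/")
        segments = trimmed.split("/")
        snapshot = trimmed
        # Only ancestors are checked; the last segment is the entry itself.
        for i, segment in enumerate(segments[:-1]):
            if segment.endswith(".parquet"):
                snapshot = "/".join(segments[: i + 1])
                break
        if snapshot not in seen:
            seen.add(snapshot)
            snapshots.append(snapshot)
    return snapshots
```

Injecting the two listers keeps the fallback logic testable outside a cluster. The real code would pass the `dbutils` and Spark calls in their place.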
@rederik76 rederik76 requested a review from haillew June 1, 2026 01:30
@rederik76 rederik76 merged commit eb7e96e into main Jun 1, 2026
@rederik76 rederik76 deleted the fix/json-validator-thread-safe-ref-resolver branch June 1, 2026 01:50