
feat: auto-load HF ONNX artifacts on CPU #402

Open
aidamian wants to merge 8 commits into develop from onnx-hf-serving

Conversation

@aidamian
Contributor

@aidamian commented May 9, 2026

No description provided.

@aidamian requested a review from cristibleotiu May 9, 2026 06:22

@chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d7cba8f217


Comment threads on extensions/serving/default_inference/nlp/th_hf_model_base.py (one outdated, one current).
cristibleotiu and others added 7 commits May 11, 2026 12:57

What changed:
- Make auto ONNX startup opportunistic and fall back to Transformers/PT on ONNX init or warmup failure.
- Keep explicit ONNX runtimes fail-fast while explicit PT skips manifest lookup.
- Gate decoder and tokenizer remote code on global and runtime trust flags.
- Confine manifest-declared artifact paths to the downloaded HF snapshot and filter broad/framework-weight allow patterns.
- Forward runtime metadata consistently for privacy-filter responses and add focused regression coverage.

Why:
- Preserve seamless CPU ONNX when available without breaking Transformers fallback or weakening remote-code/path safety.
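
As a rough illustration of the fallback policy described above, here is a minimal sketch; the helper names (`init_onnx`, `init_pt`, `warmup`) and the config shape are hypothetical, not the actual APIs in th_hf_model_base.py:

```python
import logging

logger = logging.getLogger(__name__)

def load_backend(config, init_onnx, init_pt, warmup):
    """Opportunistic ONNX startup with Transformers/PT fallback (illustrative)."""
    runtime = config.get("runtime", "auto")
    if runtime == "pt":
        return init_pt(config)   # explicit PT: skip the manifest lookup entirely
    try:
        session = init_onnx(config)
        warmup(session)          # a warmup failure also triggers the fallback
        return session
    except Exception as exc:
        if runtime == "onnx":
            raise                # explicit ONNX runtimes stay fail-fast
        logger.warning("auto ONNX startup failed (%s); falling back to PT", exc)
        return init_pt(config)
```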
What changed:
- Require selected ONNX runtime config trust_remote_code=True before executing artifact decoder or tokenizer remote code.
- Add regression coverage proving a top-level manifest trust flag cannot enable runtime code execution by itself.

Why:
- Avoid remote-code trust bypasses from broad manifest metadata; the selected runtime must explicitly opt in.
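
A sketch of that gate, under an assumed manifest shape (the real schema may differ):

```python
def remote_code_allowed(manifest: dict, runtime_name: str) -> bool:
    """Remote decoder/tokenizer code runs only when the *selected runtime's*
    config opts in; a top-level manifest flag alone is never sufficient."""
    runtime_cfg = manifest.get("runtimes", {}).get(runtime_name, {})
    return runtime_cfg.get("trust_remote_code") is True

# The regression case described above: top-level trust must not leak through.
manifest = {
    "trust_remote_code": True,                     # broad metadata: ignored
    "runtimes": {"onnx-cpu": {"trust_remote_code": False}},
}
assert remote_code_allowed(manifest, "onnx-cpu") is False
```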
What changed:
- Added subclass ONNX fallback hooks in the HF serving base.
- Added local privacy-filter ONNX discovery and BIOES/Viterbi span decoding.
- Covered fallback runtime selection and privacy-filter decoder behavior with tests.

Why:
- Allow openai/privacy-filter ONNX artifacts to run without a remote artifact manifest or remote Python decoder code.
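
For the span decoding, something along these lines; this is a greedy BIOES assembly without the Viterbi transition scoring the commit mentions, and the tag format is an assumption:

```python
def bioes_to_spans(tags):
    """Collect (label, start, end) spans from BIOES tags such as
    'B-PER', 'I-PER', 'E-PER', 'S-LOC', 'O' (greedy, no Viterbi)."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        prefix, _, entity = tag.partition("-")
        if prefix == "S":                      # single-token span
            spans.append((entity, i, i))
            start, label = None, None
        elif prefix == "B":                    # open a span
            start, label = i, entity
        elif prefix == "E" and label == entity and start is not None:
            spans.append((entity, start, i))   # close the open span
            start, label = None, None
        elif not (prefix == "I" and label == entity):
            start, label = None, None          # 'O' or malformed: reset
    return spans

assert bioes_to_spans(["B-PER", "E-PER", "O", "S-LOC"]) == [("PER", 0, 1), ("LOC", 3, 3)]
```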
What changed:
- Keep HF artifact path traversal checks lexical so valid snapshot symlinks into the cache blob store are accepted.
- Merge exact manifest files with recommended ONNX allow patterns after filtering broad or framework-weight downloads.
- Add regression coverage for both behaviors.

Why:
- Live PR image validation showed Sentinel and privacy-filter ONNX startup falling back because valid HF snapshot files were rejected as escaping the snapshot.
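
The lexical check could look roughly like this (the function name is illustrative); the key point is that `os.path.normpath` collapses `..` segments without touching the filesystem, so snapshot symlinks into the blob store are not resolved and therefore still pass:

```python
import os

def is_within_snapshot(snapshot_dir: str, relative_path: str) -> bool:
    """Lexical traversal check: normalize '..' segments without resolving
    symlinks, so HF snapshot symlinks into the cache blob store survive."""
    base = os.path.normpath(snapshot_dir)
    candidate = os.path.normpath(os.path.join(base, relative_path))
    return candidate == base or candidate.startswith(base + os.sep)

assert is_within_snapshot("/cache/snapshots/abc", "onnx/model.onnx")
assert not is_within_snapshot("/cache/snapshots/abc", "../../blobs/deadbeef")
```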
What changed:
- Temporarily allow ONNX artifact decoders without runtime-level trust_remote_code to inherit global TRUST_REMOTE_CODE=True.
- Keep explicit runtime trust_remote_code=False as a hard block.
- Add a TODO documenting the security concern and declarative decoder replacement path.

Why:
- The current Sentinel ONNX artifact predates runtime-level trust metadata and uses a reviewed contract decoder, so it needs a compatibility path until the artifact moves to declarative decoding.

What changed:
- Split ONNX remote-code trust between tokenizer/model loading and decoder execution.
- Keep tokenizer/model loading tied to runtime-level trust_remote_code.
- Temporarily allow Python decoder execution when global TRUST_REMOTE_CODE=True, even for legacy runtimes that mark ONNX trust_remote_code=False.

Why:
- Current Sentinel ONNX artifacts use trust_remote_code=False for tokenizer/model loading but still declare a Python contract decoder. This keeps the temporary compatibility path narrow until declarative decoding replaces it.
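
Taken together, the two trust commits above amount to a policy along these lines (the flag names and the three-valued runtime flag are assumptions, not the PR's exact signatures):

```python
def resolve_trust(runtime_flag, global_trust):
    """Split trust policy sketch: runtime_flag is True, False, or None
    (absent); global_trust mirrors the global TRUST_REMOTE_CODE setting."""
    # Tokenizer/model loading stays tied to the runtime-level flag.
    load_trust = runtime_flag is True
    # Decoder execution: temporary compatibility path letting legacy
    # artifacts inherit the global opt-in (the TODO in the PR flags this).
    decoder_trust = runtime_flag is True or global_trust
    return load_trust, decoder_trust

# Legacy Sentinel-style runtime: loading stays untrusted, but the reviewed
# contract decoder may still run under the global opt-in.
assert resolve_trust(False, True) == (False, True)
```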
What changed:
- Prepare HF ONNX artifacts in an edge-node-owned materialized cache before creating ONNX Runtime sessions.
- Hardlink resolved HF cache blobs when possible and copy as fallback.
- Preserve runtime relative layout for .onnx and external data sidecars.
- Add regression coverage for symlinked external data files.

Why:
- ONNX Runtime rejects HF snapshot symlinks for external data because resolved sidecars can escape the model directory.
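
A sketch of the hardlink-or-copy materialization, under assumed helper names; the actual cache layout logic in the PR may differ:

```python
import os
import shutil

def materialize_file(src: str, dst: str) -> None:
    """Place one resolved HF cache blob into the edge-node-owned cache:
    hardlink when possible, copy when the link crosses filesystems."""
    real_src = os.path.realpath(src)             # resolve the snapshot symlink
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    try:
        os.link(real_src, dst)                   # same blob, no extra space
    except OSError:
        shutil.copy2(real_src, dst)              # cross-device fallback

def materialize_model(snapshot_dir: str, cache_dir: str, rel_paths) -> None:
    """Keep the relative layout so ONNX Runtime finds the .onnx file's
    external-data sidecars next to it, with no symlinks in between."""
    for rel in rel_paths:
        materialize_file(os.path.join(snapshot_dir, rel),
                         os.path.join(cache_dir, rel))
```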