feat - Migrate to Transformers 5.5.4 #9248

Open: kappacommit wants to merge 12 commits into invoke-ai:main from kappacommit:feat/transformers-5.9-compel-fork

@kappacommit kappacommit commented May 29, 2026

Summary

Migrates to transformers 5.5.4

I tried to migrate to the latest release, 5.9.0, but transformers 5.6.0 introduced a change that breaks loading of SD1.5 checkpoints, and I was not able to work around it. I opened an issue against Diffusers (huggingface/diffusers#13833); I believe it needs to be resolved before we can progress past 5.5.4.

Related Issues / Discussions

QA Instructions

Testing guide:

P1 — Prompt weighting (Compel fork) — highest risk

Changed: compel==2.1.1 (PyPI) → invoke-ai/compel@main fork (2.3.1), the
only compel build that supports transformers 5.x. This is the largest single
behavioural change in the migration.

| # | Test | Steps | Pass criteria |
|---|---|---|---|
| W1 | SD 1.5 weighted prompt | SD 1.5 model. Prompt "a (red:1.5) car on a (blue:0.5) road", 512×512, seed 42. | Generates without error; visibly more red than blue vs an unweighted run |
| W2 | SD 1.5 negative prompt | Prompt "a photo of a dog", negative "blurry, low quality". | Generates without crash |
| W3 | SDXL weighted prompt | SDXL model, same weighted prompt as W1. | Generates; weights applied |
| W4 | Long prompt (>77 tokens) | SD 1.5, a prompt well over 77 tokens. | Compel concatenates chunks; generates without tokenizer/length error |
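
To make W4 concrete, here is a minimal sketch of the chunking idea behind long-prompt support. CLIP text encoders accept 77 positions including the start and end markers, so a longer token sequence is split into windows of 75 content tokens, each window is wrapped and padded on its own, and the per-chunk embeddings are concatenated afterwards. This is an illustration under stated assumptions, not compel's actual implementation: `chunk_token_ids` is a hypothetical helper, and the start/end ids are assumed to be those of the CLIP ViT-L tokenizer.

```python
BOS_ID = 49406  # assumed CLIP start-of-text token id
EOS_ID = 49407  # assumed CLIP end-of-text token id (also used as padding here)
MAX_LEN = 77    # CLIP context length, including the start and end markers


def chunk_token_ids(token_ids, max_len=MAX_LEN, bos=BOS_ID, eos=EOS_ID):
    """Split content token ids into fixed-length, wrapped, padded chunks."""
    body = max_len - 2  # room left for content tokens in each chunk
    chunks = []
    # max(..., 1) guarantees one chunk even for an empty prompt.
    for start in range(0, max(len(token_ids), 1), body):
        window = token_ids[start:start + body]
        chunk = [bos] + window + [eos]
        chunk += [eos] * (max_len - len(chunk))  # pad the final, shorter chunk
        chunks.append(chunk)
    return chunks


# 200 fake content tokens need three chunks of 77 positions each.
chunks = chunk_token_ids(list(range(100, 300)))
print(len(chunks), [len(c) for c in chunks])
```

Each chunk would then be encoded separately and the resulting embeddings joined along the sequence axis, which is why a tokenizer/length error in W4 would point at the chunking path rather than at the encoder itself.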

P2 — Z-Image text encoder (rope_theta fix) — new in this migration

Changed: invokeai/backend/model_manager/load/model_loaders/z_image.py.
transformers 5.x removed the top-level Qwen3Config.rope_theta attribute (the
value now lives in the rope_parameters dict). The loader reconstructs the
rotary-embedding inv_freq buffer from that base, so a wrong/missing value
silently corrupts positional encoding → garbled images. Covers both the
safetensors and GGUF Qwen3 encoder paths.
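
The lookup described above can be sketched as follows. This is a minimal illustration, not the code in z_image.py: `resolve_rope_theta` and `compute_inv_freq` are hypothetical names, the config objects are plain stand-ins rather than real Qwen3Config instances, and the `"rope_theta"` key inside `rope_parameters` is an assumption based on the description. The sketch raises instead of falling back to a default, since a silently wrong base is exactly the failure mode this section warns about.

```python
from types import SimpleNamespace


def resolve_rope_theta(config):
    """Return the rotary base from either the 5.x or the pre-5.x config layout."""
    rope_params = getattr(config, "rope_parameters", None)
    if isinstance(rope_params, dict) and "rope_theta" in rope_params:
        return float(rope_params["rope_theta"])  # 5.x layout (assumed key name)
    legacy = getattr(config, "rope_theta", None)
    if legacy is not None:
        return float(legacy)  # pre-5.x top-level attribute
    raise ValueError("config carries no rope_theta; refusing to guess a default")


def compute_inv_freq(theta, head_dim):
    """Standard rotary inverse frequencies: 1 / theta^(i / head_dim) for even i."""
    return [1.0 / (theta ** (i / head_dim)) for i in range(0, head_dim, 2)]


new_style = SimpleNamespace(rope_parameters={"rope_theta": 1_000_000.0})
old_style = SimpleNamespace(rope_theta=10_000.0)

print(resolve_rope_theta(new_style), resolve_rope_theta(old_style))
print(len(compute_inv_freq(resolve_rope_theta(new_style), head_dim=8)))
```

Because `inv_freq` depends directly on the base, two runs that resolve the same base produce the same buffer, which is the property the determinism check below relies on.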

| # | Test | Steps | Pass criteria |
|---|---|---|---|
| Z1 | Z-Image (safetensors) | Load a Z-Image model (safetensors text encoder). Prompt "a serene mountain lake at dawn". | Generates a coherent image: no AttributeError, no noise/garbled output |
| Z2 | Z-Image (GGUF encoder) | Load a Z-Image model with a GGUF Qwen3 text encoder. Same prompt. | Generates a coherent image; console shows no "Re-initializing unknown meta buffer" warning for inv_freq |
| Z3 | Determinism check | Run Z1 twice with the same seed. | Identical output both runs (confirms rope base is stable, not defaulted) |

P3 — HuggingFace token status (whoami) — new in this migration

Changed: invokeai/app/api/routers/model_manager.py. huggingface_hub 1.x
removed get_token_permission(); HFTokenHelper.get_status() now validates via
whoami() — VALID on success, INVALID on HfHubHTTPError (e.g. 401), UNKNOWN on
other errors (e.g. no network). CPU-side; runs on any machine.

  • Remove your HF token
  • Attempt to install a model via the model manager that is gated behind approval (e.g. SD 3.5)
  • Verify failure message tells you that you need a token
  • Add an HF token
  • Attempt to install the same model
  • Verify success
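
For reviewers who want to see the three-way mapping in isolation, here is a minimal offline sketch. It is not the code in model_manager.py: `get_token_status`, `HFTokenStatus`, and `HubHTTPError` are hypothetical stand-ins, and the `whoami` callable is injected so the example runs without huggingface_hub or a network connection. In the real helper the call would be huggingface_hub's `whoami()` and the caught exception its `HfHubHTTPError`.

```python
from enum import Enum


class HFTokenStatus(Enum):
    VALID = "valid"
    INVALID = "invalid"
    UNKNOWN = "unknown"


class HubHTTPError(Exception):
    """Stand-in for huggingface_hub's HfHubHTTPError so the sketch runs offline."""


def get_token_status(whoami, http_error=HubHTTPError):
    """Map the outcome of a whoami-style call onto the three statuses."""
    try:
        whoami()
    except http_error:
        return HFTokenStatus.INVALID  # the hub rejected the token (e.g. 401)
    except Exception:
        return HFTokenStatus.UNKNOWN  # anything else, e.g. no network
    return HFTokenStatus.VALID


def fake_ok():
    return {"name": "someone"}


def fake_rejected():
    raise HubHTTPError("401 Unauthorized")


def fake_offline():
    raise ConnectionError("no network")


for probe in (fake_ok, fake_rejected, fake_offline):
    print(probe.__name__, get_token_status(probe))
```

The order of the `except` clauses matters: the HTTP error must be caught before the generic `Exception`, otherwise a rejected token would be reported as UNKNOWN.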

P4 — T5 / CLIP text encoders (tokenizer + weight-tying changes)

Changed (5.1.0 migration, re-validated on 5.9): flux_text_encoder.py and
sd3_text_encoder.py. T5 weight sharing is now established via tie_weights()
rather than checked with an identity assertion.

| # | Test | Steps | Pass criteria |
|---|---|---|---|
| T1 | FLUX text-to-image | FLUX schnell (fp8/GGUF for 8 GB). Prompt "a lighthouse on a cliff at sunset". | Generates without AssertionError or tokenizer error |
| T2 | FLUX long prompt | Same model, prompt > 500 words. | Generates; truncation is a warning, not a crash |
| T3 | SD3 text-to-image | SD3 medium (if VRAM allows). Simple prompt. | Generates without tokenizer error |
| T4 | FLUX bnb-nf4 (quantized) | Load a FLUX bnb-nf4 checkpoint, generate. | No AssertionError in _load_state_dict_into_t5(); generates |
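
The difference between asserting that two weights are shared and actively tying them can be shown with a toy model. This is only an illustration of the idea, not the loader's code: `ToyEmbedding`, `ToyT5Encoder`, and `weights_are_tied` are hypothetical, and plain Python lists stand in for tensors. In transformers, the equivalent step is calling `tie_weights()` on the loaded model.

```python
class ToyEmbedding:
    def __init__(self, weight):
        self.weight = weight


class ToyT5Encoder:
    """Toy stand-in: `shared` and `embed_tokens` should point at one weight."""

    def __init__(self):
        self.shared = ToyEmbedding([0.1, 0.2, 0.3])
        # After loading a state dict, the second embedding may hold its own copy.
        self.embed_tokens = ToyEmbedding([0.0, 0.0, 0.0])

    def tie_weights(self):
        # Re-point the second embedding at the shared weight object.
        self.embed_tokens.weight = self.shared.weight


def weights_are_tied(model):
    return model.embed_tokens.weight is model.shared.weight


model = ToyT5Encoder()
print(weights_are_tied(model))  # an identity assertion would fail at this point
model.tie_weights()
print(weights_are_tied(model))
```

An identity assertion only reports that sharing was lost, while tying restores it, which is why T1 and T4 look for the absence of an AssertionError.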

P5 — Safety checker & CLIP image processor (diffusers pipeline)

Changed (5.1.0 migration): safety_checker.py, diffusers_pipeline.py
(feature-extractor → image-processor APIs).

| # | Test | Steps | Pass criteria |
|---|---|---|---|
| S1 | Safety checker first download | Delete <invoke_root>/models/core/convert/stable-diffusion-safety-checker/ if present. Enable NSFW checker. Generate SD 1.5 image. | Checker downloads; image generates (pass or blurred); no serialization error |
| S2 | Safety checker cached load | Generate another SD 1.5 image. | Loads from disk; no re-download |
| C1 | SD 1.5 baseline | Load SD 1.5, generate any image. | Generates without Import/AttributeError from the pipeline |

Regression sweep

After the targeted tests, a broad smoke pass over the app:

  • Generate on each model family you have installed (SD 1.5, SDXL, FLUX, SD3, Z-Image).
  • Exercise one ControlNet and one IP-Adapter run if available (both import transformers vision models).
  • Watch the console for any ImportError, AttributeError, or AssertionError — none expected.

Merge Plan

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • ❗Changes to a redux slice have a corresponding migration
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

Your Name and others added 11 commits February 6, 2026 19:58
Switches compel from PyPI 2.1.1 to invoke-ai/compel@main fork which supports
transformers 5.x. Bumps transformers floor to 5.9.0. Removes the
transformers>=5.1.0 uv override that was only needed to bypass compel 2.1.1's
<5.0 constraint.

NOTE: compel fork pulls notebook dep (full Jupyter stack); flag to maintainer for cleanup.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…s 5.x

transformers 5.x no longer exposes rope_theta as a top-level attribute on
Qwen3Config; the value is stored in the rope_parameters (and rope_scaling)
dict instead. Read it from there with a getattr fallback so the inv_freq
buffer is computed from the configured base (1e6 / 256) instead of raising
AttributeError. Applies to both the safetensors and GGUF Qwen3 encoder paths.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…whoami

huggingface_hub 1.x removed get_token_permission(). HFTokenHelper.get_status()
now validates the token via whoami(), which returns user info for a valid token
and raises HfHubHTTPError for an invalid one. Preserves the original three-way
status: VALID on success, INVALID on HfHubHTTPError (e.g. 401), UNKNOWN on any
other error (e.g. network failure).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
….9-compel-fork

# Conflicts:
#	invokeai/app/api/routers/model_manager.py
#	invokeai/app/invocations/sd3_text_encoder.py
#	invokeai/backend/model_manager/metadata/fetch/huggingface.py
#	pyproject.toml
#	uv.lock
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The upstream merge left an unresolved conflict marker in _t5_encode and
reintroduced T5TokenizerFast. Keep our v5 assertion (T5Tokenizer only) plus
upstream's new t5_device logic, and drop the now-dead T5TokenizerFast
monkeypatch in the test (the name no longer exists in the module).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- flux_text_encoder.py: drop unused typing.Union (F401) left by v5 import merge
- huggingface.py: ruff format (wrap append(SimpleNamespace(...)))

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
transformers 5.6 flattened CLIPTextModel (removed the self.text_model wrapper,
hoisted embeddings/encoder/final_layer_norm to the top level). diffusers' single-file
checkpoint loader (create_diffusers_clip_model_from_ldm) still assumes the nested
layout, so loading SD1.5 .safetensors checkpoints fails on 5.6+ with
'CLIPTextModel object has no attribute text_model' and, once that read is shimmed,
'Cannot copy out of meta tensor' (weights never populate the flattened model).

Pin to >=5.5,<5.6 (last pre-flattening release) which keeps both the single-file
and from_pretrained paths working. The invoke-ai/compel fork accepts any 5.x.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
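
As a quick sanity check of what the >=5.5,<5.6 pin admits, here is a minimal stdlib-only sketch. `parse_release` and `satisfies_pin` are hypothetical helpers, and the comparison deliberately ignores pre-release and local-version semantics; real tooling should rely on the resolver (or the `packaging` library) rather than this approximation.

```python
import re


def parse_release(version):
    """Extract (major, minor, patch) from a version string such as '5.5.4'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def satisfies_pin(version, lower=(5, 5, 0), upper=(5, 6, 0)):
    """True when lower <= version < upper, mirroring the >=5.5,<5.6 pin."""
    return lower <= parse_release(version) < upper


for candidate in ("5.4.9", "5.5.0", "5.5.4", "5.6.0", "5.9.0"):
    print(candidate, satisfies_pin(candidate))
```

Only the 5.5.x releases pass, so both 5.6.0 (where the CLIPTextModel flattening landed) and 5.9.0 are excluded until the diffusers single-file loader is updated.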
@github-actions (bot) added labels on May 29, 2026: python (PRs that change python files), Root, invocations (PRs that change invocations), backend (PRs that change backend files), python-tests (PRs that change python tests), python-deps (PRs that change python dependencies)
@kappacommit kappacommit mentioned this pull request May 29, 2026
5 tasks
Comment thread uv.lock
chore(deps): replace compel fork with official compel 2.4.0

compel 2.4.0 (released 2026-05-30) merges the transformers-5 support that
the invoke-ai fork carried (both descend from upstream PR invoke-ai#129), plus the
maintainer-reviewed padding rework and added diffusers/T5 smoke coverage.
Switch from the git fork to the PyPI release.

- pyproject: compel git+main -> compel>=2.4.0,<3
- uv.lock: compel 2.3.1 (git 8f404b45) -> 2.4.0 (pypi)
- transformers stays 5.5.4 (satisfies compel >=5,<6 and our <5.6 pin)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@kappacommit (Contributor, Author)

Updated to use official compel 2.4 instead of our fork.

Also, the issue I opened with diffusers looks like it is being addressed pretty quickly, so we may be able to get to transformers 5.9.0 soon: huggingface/diffusers#13843

Projects

Status: 6.13.5 LIBRARY UPDATES

4 participants