Update to transformers 5.1.0 #8859
|
After further review of the compel repo, I do not feel comfortable merging this until they add official support for Transformers 5+. I see several changes necessary in their code to fully support 5.1, and forcibly overriding the dependency to 5.1, as I've done in this PR, may cause any number of unknown issues. While I did verify that prompt weighting itself (which is what Compel is used for) functioned correctly and without errors, there is unknown risk in forcibly upgrading it to Transformers 5.1. I recommend not merging this PR until they formally support Transformers 5+. I have opened an issue on their repo and will monitor it periodically to see if support is added. |
| # This prevents conflicts with opencv-contrib-python, which Invoke requires. | ||
| override-dependencies = ["opencv-python; sys_platform=='never'"] | ||
| # Force transformers>=5.1.0 past compel==2.1.1's ~=4.25 (<5.0) constraint. | ||
| override-dependencies = ["opencv-python; sys_platform=='never'", "transformers>=5.1.0"] |
The path forward will be to undo this change once Compel formally supports Transformers 5+.
Until then, while I did not face any issues throughout testing, I do not recommend merging this PR.
| def get_status(cls) -> HFTokenStatus: | ||
| try: | ||
| if huggingface_hub.get_token_permission(huggingface_hub.get_token()): | ||
| token = huggingface_hub.get_token() |
Why was this changed? Answer: During testing I found that the new version of HF Hub throws an exception if you pass a None token to get_token_permission. This did not happen in the old version. As a result, Invoke would display the unhelpful 'An unknown error has occurred when retrieving the model from Huggingface' message rather than the very helpful 'You probably need to request permission to access this model' message.
After this change, the correct messages display as expected and the user is prompted to verify their access to the model.
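The guard described above can be sketched as follows. This is a minimal illustration rather than the code from the PR: the two Hugging Face calls are passed in as parameters (get_token and get_token_permission stand in for the huggingface_hub functions of the same name) so the logic can be exercised without a network connection, and the VALID member of HFTokenStatus is an assumption.

```python
from enum import Enum
from typing import Callable, Optional


class HFTokenStatus(str, Enum):
    # INVALID and UNKNOWN are named in the PR; VALID is assumed for illustration.
    VALID = "valid"
    INVALID = "invalid"
    UNKNOWN = "unknown"


def get_status(
    get_token: Callable[[], Optional[str]],
    get_token_permission: Callable[[str], Optional[str]],
) -> HFTokenStatus:
    """Return the token status, short-circuiting when no token is stored."""
    try:
        token = get_token()
        if token is None:
            # Never pass None to the permission check; per the PR, the newer
            # hub version raises on it instead of returning a falsy value.
            return HFTokenStatus.INVALID
        if get_token_permission(token):
            return HFTokenStatus.VALID
        return HFTokenStatus.INVALID
    except Exception:
        # Any other failure (for example a network error) is reported as unknown.
        return HFTokenStatus.UNKNOWN


def strict_permission(token: Optional[str]) -> str:
    """Stand-in for a permission check that rejects None."""
    if token is None:
        raise ValueError("token must not be None")
    return "read"
```

With the early return in place, a missing token maps to INVALID even when the permission check would raise on None, so the UNKNOWN branch is reserved for genuine failures.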
|
|
||
| import torch | ||
| from transformers import CLIPTextModel, CLIPTokenizer, T5EncoderModel, T5Tokenizer, T5TokenizerFast | ||
| from transformers import CLIPTextModel, CLIPTokenizer, T5EncoderModel, T5Tokenizer |
Why was this changed? Answer: T5TokenizerFast has been removed in Transformers 5.
| context.util.signal_progress("Running T5 encoder") | ||
| assert isinstance(t5_text_encoder, T5EncoderModel) | ||
| assert isinstance(t5_tokenizer, (T5Tokenizer, T5TokenizerFast)) | ||
| assert isinstance(t5_tokenizer, T5Tokenizer) |
Why was this changed? Answer: T5TokenizerFast has been removed in Transformers 5.
| # Add user's cached access token to HuggingFace requests | ||
| if source.access_token is None: | ||
| source.access_token = HfFolder.get_token() | ||
| source.access_token = hf_get_token() |
Why was this changed? Answer: Updating transformers also mandated updating the huggingface_hub dependency. This is the new way to access the get_token function.
| from diffusers.pipelines.stable_diffusion.safety_checker import StableDiffusionSafetyChecker | ||
| from PIL import Image, ImageFilter | ||
| from transformers import AutoFeatureExtractor | ||
| from transformers import AutoImageProcessor |
Why was this changed? Answer: AutoFeatureExtractor has been replaced with AutoImageProcessor in Transformers 5.
| cls.feature_extractor.save_pretrained(model_path) | ||
| cls.safety_checker = StableDiffusionSafetyChecker.from_pretrained(repo_id) | ||
| cls.safety_checker.save_pretrained(model_path, safe_serialization=True) | ||
| cls.safety_checker.save_pretrained(model_path) |
Why was this changed? Answer: safe_serialization is no longer an optional parameter in Transformers 5; it is always enabled.
| CLIPTokenizer, | ||
| T5EncoderModel, | ||
| T5TokenizerFast, | ||
| T5Tokenizer, |
Why was this changed? Answer: T5TokenizerFast has been removed in Transformers 5.
| # parameter level (via _tie_weights / tie_weights) rather than as a Python object | ||
| # alias. load_state_dict(assign=True) replaces parameters in-place, which severs | ||
| # the parameter-level tie. Calling tie_weights() re-establishes it. | ||
| model.tie_weights() |
Why was this changed? Answer: See comment
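To make the code comment above concrete, here is a toy model of the failure in plain Python. It deliberately avoids torch and transformers: TinyModel, load_assign, and its tie_weights method are hypothetical stand-ins, with lists playing the role of weight tensors. It only illustrates why an identity check can fail after a replace-style load and why re-tying afterwards restores it.

```python
class TinyModel:
    """Two attributes that are meant to share one weight object."""

    def __init__(self) -> None:
        self.shared = [0.0, 0.0]
        self.embed_tokens = self.shared  # tied: both names point at one object

    def tie_weights(self) -> None:
        # Re-establish the tie by pointing embed_tokens at shared again.
        self.embed_tokens = self.shared


def load_assign(model: TinyModel, state: dict) -> None:
    """Replace attributes with the loaded objects instead of copying into them."""
    for name, value in state.items():
        setattr(model, name, value)


model = TinyModel()
assert model.embed_tokens is model.shared  # tied after construction

# The checkpoint carries only the shared weight, as a separate object.
load_assign(model, {"shared": [1.0, 2.0]})
tie_broken = model.embed_tokens is not model.shared  # the old alias is now stale

model.tie_weights()
tie_restored = model.embed_tokens is model.shared
```

An assertion on identity placed right after load_assign would fail here, which mirrors the situation the comment describes; calling tie_weights() afterwards does not depend on how the tie was lost.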
| from diffusers.pipelines.pipeline_utils import DiffusionPipeline | ||
| from diffusers.schedulers.scheduling_utils import SchedulerMixin | ||
| from transformers import CLIPTokenizer, PreTrainedTokenizerBase, T5Tokenizer, T5TokenizerFast | ||
| from transformers import CLIPTokenizer, PreTrainedTokenizerBase, T5Tokenizer |
Why was this changed? Answer: T5TokenizerFast has been removed in Transformers 5.
| """ | ||
| self._requests = session or requests.Session() | ||
| configure_http_backend(backend_factory=lambda: self._requests) | ||
| self._has_custom_session = session is not None |
Why was this changed? Answer: Invoke's unit test suite depended on the function configure_http_backend to "mock" callouts inside the test. This is how Invoke ensures that we are not making real callouts to real services inside of the test.
configure_http_backend was removed from the huggingface_hub package entirely. So, we need to re-implement this functionality a different way.
| metadata = HuggingFaceMetadata.model_validate_json(json) | ||
| return metadata | ||
|
|
||
| def _model_info_via_session(self, repo_id: str, variant: Optional[ModelRepoVariant] = None) -> SimpleNamespace: |
This is how we implement the functionality which mocks the session. This code only ever executes in unit tests, and only executes when attempting to make a callout to the real internet. We return this mocked response instead. This mimics what we were already doing with configure_http_backend
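The routing described above might look roughly like the sketch below. It is not the PR's implementation: MetadataFetch, the api_lookup parameter, FakeResponse, and FakeSession are hypothetical names, and the URL pattern is an assumption about what the test mocks expect. The point is only that an injected session takes precedence, so tests never reach the real hub client.

```python
from types import SimpleNamespace
from typing import Any, Callable, List, Optional


class MetadataFetch:
    def __init__(self, api_lookup: Callable[[str], Any], session: Optional[Any] = None) -> None:
        self._api_lookup = api_lookup  # production path, e.g. a hub client call
        self._session = session
        self._has_custom_session = session is not None

    def from_id(self, repo_id: str) -> Any:
        # Only tests inject a session, so only tests take the first branch.
        if self._has_custom_session:
            return self._model_info_via_session(repo_id)
        return self._api_lookup(repo_id)

    def _model_info_via_session(self, repo_id: str) -> SimpleNamespace:
        # Assumed URL pattern; the injected session decides what comes back.
        url = f"https://huggingface.co/api/models/{repo_id}"
        response = self._session.get(url)
        response.raise_for_status()
        return SimpleNamespace(**response.json())


class FakeResponse:
    """Minimal response double exposing the two methods used above."""

    def __init__(self, payload: dict) -> None:
        self._payload = payload

    def raise_for_status(self) -> None:
        return None

    def json(self) -> dict:
        return self._payload


class FakeSession:
    """Test double that records requested URLs and returns canned JSON."""

    def __init__(self, payload: dict) -> None:
        self._payload = payload
        self.requested: List[str] = []

    def get(self, url: str) -> FakeResponse:
        self.requested.append(url)
        return FakeResponse(self._payload)
```

Keeping the decision in a single flag set at construction time means the production path is untouched when no session is passed.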
| full-precision model is returned. | ||
| """ | ||
| session = session or Session() | ||
| configure_http_backend(backend_factory=lambda: session) # used in testing |
Why was this changed? Answer: Since we've implemented the mocking directly in the huggingface module, it no longer needs to be manually mocked elsewhere.
| # parameter level (via _tie_weights / tie_weights) rather than as a Python object | ||
| # alias. load_state_dict(assign=True) replaces parameters in-place, which severs | ||
| # the parameter-level tie. Calling tie_weights() re-establishes it. | ||
| model.tie_weights() |
Why was this changed? Answer: See comment
| from diffusers.utils.import_utils import is_xformers_available | ||
| from pydantic import Field | ||
| from transformers import CLIPFeatureExtractor, CLIPTextModel, CLIPTokenizer | ||
| from transformers import CLIPImageProcessor, CLIPTextModel, CLIPTokenizer |
Why was this changed? Answer: CLIPFeatureExtractor has been replaced with CLIPImageProcessor in Transformers 5.
|
Thanks for your patience and for all the work that went into this. The next release will focus on updating Transformers and other out of date libraries. A couple of questions:
|
There is a PR, but it has sat for a while. It looks like it has had some movement recently, though. Let's hope it gets merged: damian0815/compel#129. Regarding the merge conflicts, this PR is now severely out of date, in particular because all of the new models added in 6.13 need to be updated. I will update this once we have a path forward on Compel. Until then, it is a moving target, as we're likely to keep adding dependencies on the diffusers 4 concepts until we can cut over to diffusers 5. |
|
Hi, I've been unable to reach Damian regarding progress towards updating compel to use Transformers 5.1.0. So for the time being I have forked his repository into the The enclosed patch file will update |
|
New PR open here with compel patch applied: #9248. Closing this one. |
Summary
This updates to Transformers 5.1.0. Transformers 5.0 adds support for the SAM3 model. My end goal is to add support for SAM3 in Invoke.
I have performed this update with AI assistance, and I am certainly out of my depth in evaluating the true impact of these changes. However, I have produced a test plan with the help of AI and validated every scenario in it. See the test plan at the bottom of this PR, which details every change and the reasoning for it.
NOTE: There is one dependency of InvokeAI which does NOT support Transformers 5.0+, the Compel project, which is used for prompt weighting: https://github.com/damian0815/compel
I have opened an issue on their repo here to bump their version: damian0815/compel#128
For now, I have added Transformers 5.1.0 to override-dependencies and faced no issues after doing so. I have evaluated prompt weighting (which can be hard to validate by eye) and it appears to still work properly in my tests.
The most substantial code in this PR is the update to how the huggingface session is handled. My understanding is that this new code only applies to test cases, due to the removal of the function which Invoke was using to produce mock callouts in the test environment, and that it has no effect on production functionality.
Related Issues / Discussions
QA Instructions
I have followed the testing plan at the bottom of this document. The testing plan was produced with AI assistance.
Merge Plan
Checklist
What's New copy (if doing a release after this PR)
Change Description & Test Plan (Generated with AI Assistance)
Regression Testing Plan: Transformers 5.1.0 + HuggingFace Hub Migration
Below is a change-by-change plan. Each section explains what changed, why, and how to test it. Tests are ordered from quickest smoke tests to longer end-to-end runs.
Change 1: CLIPFeatureExtractor → CLIPImageProcessor
File: invokeai/backend/stable_diffusion/diffusers_pipeline.py
Why: CLIPFeatureExtractor was removed in transformers 5.x in favour of CLIPImageProcessor.
How to test:
- python -c "from invokeai.backend.stable_diffusion.diffusers_pipeline import StableDiffusionGeneratorPipeline" should not raise ImportError.
- [...] "a cat on a couch"). Confirm the image generates without errors. This exercises the full StableDiffusionGeneratorPipeline including the feature extractor type.

Change 2: AutoFeatureExtractor → AutoImageProcessor + removed safe_serialization
File: invokeai/backend/image_util/safety_checker.py
Why: All vision FeatureExtractor classes were removed in transformers 5.x. The safe_serialization parameter was also removed from save_pretrained (safetensors is now the only format).
How to test:
- python -c "from invokeai.backend.image_util.safety_checker import SafetyChecker"
- [...] <root>/models/core/convert/stable-diffusion-safety-checker/ if it exists. Enable the NSFW checker in config (nsfw_checker: true). Generate any SD 1.5 image. Confirm the safety checker downloads, saves to disk (no safe_serialization error), and the image either passes or is correctly blurred.
- [...] AutoImageProcessor.from_pretrained() without re-downloading.

Change 3: T5TokenizerFast → T5Tokenizer (4 files)
Files:
- invokeai/app/invocations/flux_text_encoder.py
- invokeai/app/invocations/sd3_text_encoder.py
- invokeai/backend/model_manager/load/model_util.py
- invokeai/backend/model_manager/load/model_loaders/flux.py
Why: Transformers 5.x unified slow/fast tokenizers, so T5TokenizerFast no longer exists as a separate class. T5Tokenizer now uses the Rust backend by default.
How to test:
- python -c "from invokeai.app.invocations.flux_text_encoder import FluxTextEncoderInvocation", same for sd3_text_encoder, model_util, and the flux loader module.
- [...] "a lighthouse on a cliff at sunset". This exercises T5Tokenizer.from_pretrained() in the loader and the isinstance(t5_tokenizer, T5Tokenizer) assertion in the text encoder invocation.
- [...] batch_decode for truncation warnings.
- [...] isinstance(model, T5Tokenizer) path in model_util.py). No crash = pass.

Change 4: configure_http_backend removed + session-aware metadata fetching
Files:
- invokeai/backend/model_manager/metadata/metadata_base.py: removed configure_http_backend import and call
- invokeai/backend/model_manager/metadata/fetch/huggingface.py: removed configure_http_backend, added _model_info_via_session() fallback, added _has_custom_session flag
Why (root cause chain):
- transformers>=5.1.0 pulls in huggingface_hub>=1.0.0 as a dependency.
- huggingface_hub 1.0 switched its HTTP backend from requests to httpx and removed the configure_http_backend() function entirely.
- [...] configure_http_backend(backend_factory=lambda: session) in two places to inject a custom requests.Session. This was used in production for download_urls() and, critically, in tests to inject a TestSession with mock HTTP adapters so tests could run without real network calls.
- [...] HfApi() now uses httpx internally and works fine for real HTTP). However, it broke the test suite: HfApi().model_info() now bypasses the mock requests.TestSession entirely and hits the real HuggingFace API, causing RepositoryNotFoundError for test-only repos like InvokeAI-test/textual_inversion_tests.
- [...] HuggingFaceMetadataFetch now tracks whether a custom session was injected (_has_custom_session). When true, from_id() calls a new _model_info_via_session() method that uses the injected requests.Session to query the HF API directly (matching the URL patterns the test mocks expect). When false (production), it uses HfApi() as before.
How to test:
- python -c "from invokeai.backend.model_manager.metadata.fetch import HuggingFaceMetadataFetch"
- pytest tests/app/services/model_install/test_model_install.py -x -v: all 19 tests should pass, especially test_heuristic_import_with_type, test_huggingface_install, and test_huggingface_repo_id, which depend on mock HF API responses via the injected session.
- [...] stabilityai/sd-turbo). Confirm the metadata (name, description, tags) is correctly fetched and displayed, and the model downloads successfully. This exercises the production HfApi().model_info() path, plus hf_hub_url() and download_urls().

Change 5: HfFolder → huggingface_hub.get_token()
File: invokeai/app/services/model_install/model_install_default.py
Why: HfFolder was removed in huggingface_hub 1.0+. The replacement is the top-level get_token() function.
How to test:
- python -c "from invokeai.app.services.model_install.model_install_default import ModelInstallService"
- [...] huggingface-cli login), try installing a gated model (e.g. black-forest-labs/FLUX.1-dev). Confirm the token is automatically injected and the download succeeds.
- [...] get_token() returns None gracefully and the install proceeds.

Change 6: transformers>=5.1.0 override for compel
File: pyproject.toml (override-dependencies)
Why: compel==2.1.1 requires transformers ~= 4.25 (<5.0). The uv override forces past this constraint.
How to test:
- [...] "a (red:1.5) car on a (blue:0.5) road". Compare to an unweighted "a red car on a blue road". The weighted version should show noticeably more red and less blue. This is the core compel functionality.
- [...] "a photo of a dog" and negative prompt "blurry, low quality". Confirm it generates without crash.
- [...] SDXLCompelPromptInvocation).
- [...] "a photo of a cat".blend("a photo of a dog", 0.5) or ("a cat", "a dog").blend(0.5, 0.5). This exercises deeper compel internals.

Change 7: T5 shared-weight assertion → model.tie_weights()
Files:
- invokeai/backend/model_manager/load/model_loaders/flux.py: _load_state_dict_into_t5() classmethod
- invokeai/backend/quantization/scripts/quantize_t5_xxl_bnb_llm_int8.py: load_state_dict_into_t5() function
Why (root cause chain):
- [...] model.shared.weight and model.encoder.embed_tokens.weight should refer to the same tensor.
- [...] nn.Parameter object, so a is b was True.
- [...] _tie_weights() / tie_weights(). The two attributes may be distinct nn.Parameter objects that are kept in sync by the framework, so a is b can be False.
- [...] model.load_state_dict(state_dict, strict=False, assign=True). The assign=True flag replaces parameters in-place rather than copying data into existing tensors. This severs even the parameter-level tie that transformers 5.x establishes.
- [...] model.encoder.embed_tokens.weight is model.shared.weight, which was guaranteed True in 4.x but fails in 5.x after assign=True.
- [...] model.tie_weights(), which re-establishes the tie regardless of how it is internally implemented. This is forward-compatible and is the officially recommended approach.
How to test:
- python -c "from invokeai.backend.model_manager.load.model_loaders.flux import FluxBnbQuantizednf4bCheckpointModel"
- [...] _load_state_dict_into_t5(). The generation should complete without AssertionError. (Same as test 3b; this change and Change 3 are both exercised together.)
- [...] FluxBnbQuantizednf4bCheckpointModel loader which also calls _load_state_dict_into_t5().
- [...] quantize_t5_xxl_bnb_llm_int8.py and confirm it completes without assertion errors.

Change 8: HFTokenHelper.get_status(): null token guard for get_token_permission()
File: invokeai/app/api/routers/model_manager.py
Why (root cause chain):
- HFTokenHelper.get_status() calls huggingface_hub.get_token_permission(huggingface_hub.get_token()) to check whether a valid HF token is present.
- [...] get_token() returns None.
- [...] get_token_permission(None) returned a falsy value, so the code fell through to return HFTokenStatus.INVALID, which is the correct behavior.
- [...] get_token_permission(None) raises an exception (it now validates the input and rejects None).
- [...] except Exception catch returned HFTokenStatus.UNKNOWN, which the frontend interprets as a network error, showing the misleading message: "Unable to Verify HF Token: Unable to verify HuggingFace token. This is likely due to a network error."
- [...] get_token() for None first and return INVALID immediately, before ever calling get_token_permission(). This restores the correct "no token" UI message.
How to test:
- [...] ~/.cache/huggingface/token), clear $env:HF_TOKEN, restart InvokeAI. The UI should show the proper "no token" message, not the "unable to verify / network error" message.
- [...] huggingface-cli login), restart InvokeAI. The UI should show the token as valid.
- [...] black-forest-labs/FLUX.1-dev). The UI should clearly indicate a token is needed, not a network error.

Automated Test Suite
- pytest ./tests -x -m "not slow"
  The -x flag stops on first failure for quick feedback.
- pytest ./tests -x -m "slow"

Quick Smoke Test Script (all imports at once)
Run this to verify none of the changed files crash on import:
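A minimal sketch of such a script, assuming the module paths named in the per-change import checks, could look like the following. The names CHANGED_MODULES, find_import_failures, and run_smoke_test are hypothetical, and this is not the PR's original script.

```python
import importlib
from typing import Iterable, List, Tuple

# Module paths taken from the per-change import checks in the test plan.
CHANGED_MODULES = (
    "invokeai.backend.stable_diffusion.diffusers_pipeline",
    "invokeai.backend.image_util.safety_checker",
    "invokeai.app.invocations.flux_text_encoder",
    "invokeai.app.invocations.sd3_text_encoder",
    "invokeai.backend.model_manager.load.model_util",
    "invokeai.backend.model_manager.load.model_loaders.flux",
    "invokeai.backend.model_manager.metadata.fetch",
    "invokeai.app.services.model_install.model_install_default",
    "invokeai.app.api.routers.model_manager",
)


def find_import_failures(modules: Iterable[str]) -> List[Tuple[str, str]]:
    """Try to import each module and collect (name, error) for those that fail."""
    failures = []
    for name in modules:
        try:
            importlib.import_module(name)
        except Exception as exc:  # report any import-time crash, not only ImportError
            failures.append((name, f"{type(exc).__name__}: {exc}"))
    return failures


def run_smoke_test(modules: Iterable[str] = CHANGED_MODULES) -> bool:
    """Print one line per failure, or a success line, and return True on success."""
    failures = find_import_failures(modules)
    for name, error in failures:
        print(f"FAILED {name}: {error}")
    if not failures:
        print("All imports OK")
    return not failures
```

From an environment with InvokeAI installed, calling run_smoke_test() runs every import and prints either the failures or the success line.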
If this prints All imports OK, you've passed the baseline. Then proceed to the UI-based tests in priority order: 6a → 3b/7b → 3d → 4b → 5c → 2b → 1b.