Add GPT-J architecture adapter tests #1314

Merged: jlarson4 merged 1 commit into TransformerLensOrg:dev from along-l:gptj-adapter-test on May 19, 2026

along-l commented May 19, 2026

Description

Add unit tests for GptjArchitectureAdapter, contributing to the architecture adapter test coverage effort in #1302.

Coverage includes:

  • Config attribute validation (6 required architecture flags)
  • Component mapping structure (top-level + block-level + attn/mlp submodules)
  • Weight conversion keys and rearrange patterns
  • Architecture guards (no pos_embed, no top-level rotary_emb, no ln2, no MLP gate, exactly QKVO conversion keys)
  • Factory registration (GPTJForCausalLM → GptjArchitectureAdapter)

49 tests, all passing locally via uv run pytest.
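
For readers unfamiliar with this style of test, the coverage categories above can be sketched as follows. This is a minimal illustration, not the test file from this PR: FakeGptjAdapter, REGISTRY, and every mapping key and rearrange pattern below are hypothetical stand-ins, and the real GptjArchitectureAdapter may expose a different interface.

```python
# Hypothetical stand-in for an architecture adapter. The real
# GptjArchitectureAdapter in TransformerLens may be structured differently.
class FakeGptjAdapter:
    def __init__(self):
        # Top-level, block-level, and attn/mlp submodule names (illustrative).
        self.component_mapping = {
            "embed": "transformer.wte",
            "blocks": {
                "ln1": "ln_1",
                "attn": {
                    "q": "q_proj",
                    "k": "k_proj",
                    "v": "v_proj",
                    "o": "out_proj",
                },
                "mlp": {"in": "fc_in", "out": "fc_out"},
            },
            "ln_final": "transformer.ln_f",
            "unembed": "lm_head",
        }
        # One conversion entry per attention projection. Both the keys and
        # the rearrange patterns are assumptions made for this sketch.
        self.weight_conversions = {
            "attn.q": "(n h) m -> n m h",
            "attn.k": "(n h) m -> n m h",
            "attn.v": "(n h) m -> n m h",
            "attn.o": "m (n h) -> n h m",
        }


# Hypothetical factory table: architecture name -> adapter class.
REGISTRY = {"GPTJForCausalLM": FakeGptjAdapter}


def test_no_pos_embed_or_top_level_rotary():
    mapping = FakeGptjAdapter().component_mapping
    assert "pos_embed" not in mapping
    assert "rotary_emb" not in mapping


def test_no_ln2_and_no_mlp_gate():
    block = FakeGptjAdapter().component_mapping["blocks"]
    assert "ln2" not in block
    assert "gate" not in block["mlp"]


def test_exactly_qkvo_conversion_keys():
    keys = set(FakeGptjAdapter().weight_conversions)
    assert keys == {"attn.q", "attn.k", "attn.v", "attn.o"}


def test_factory_registration():
    assert REGISTRY["GPTJForCausalLM"] is FakeGptjAdapter
```

Guard tests of this kind assert the absence of components, so a change that accidentally adds, for example, a second layer norm to the mapping fails loudly instead of passing unnoticed.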

Related to #1302

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

along-l (Author) commented May 19, 2026

Hi! Just flagging that the Othello_GPT notebook check failure seems unrelated to this PR. Cell 4 only does import transformer_lens, and the change here is just a new file under tests/unit/, which doesn't touch that import path. The same check also passed yesterday on the sibling adapter-test PRs #1309 and #1312. Sharing in case it's helpful; happy to dig in further if useful. Thanks!

jlarson4 (Collaborator) commented

This looks great @along-l! The Othello failure is related specifically to HuggingFace rate limits that occur when too many PRs are running unit tests at the same time. I restarted the test when the queue was clear and it passed as expected. Merging this, thank you for taking it on!

We do have an open PR that has begun to address this specific issue (the issue is #1291, the open PR is #1296), but that PR does not cover everything that could be done to reduce our overall HF calls. If you are interested in digging into it, feel free; just take care not to conflict with the work done in #1296.
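
As one generic illustration of reducing repeated calls to a remote service, a loader can be memoized so that many tests requesting the same resource trigger a single fetch. This is only a sketch under stated assumptions: fetch_from_hub and get_config are hypothetical names, no real Hugging Face API is called, and nothing here reflects what #1296 actually does.

```python
from functools import lru_cache

# Records each simulated remote fetch so the effect of caching is visible.
CALLS = []


def fetch_from_hub(name):
    """Hypothetical stand-in for a network call to a model hub."""
    CALLS.append(name)
    return {"model_type": name}


@lru_cache(maxsize=None)
def get_config(name):
    # The first call per name performs the fetch; later calls with the
    # same name return the cached object without touching the network.
    return fetch_from_hub(name)


first = get_config("gptj")
second = get_config("gptj")
other = get_config("gpt2")
```

After these three calls, CALLS holds two entries, one per distinct name. In a pytest suite a session-scoped fixture plays a similar role, although it only deduplicates within one test process, not across PRs running concurrently.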

jlarson4 merged commit 5fc97cc into TransformerLensOrg:dev on May 19, 2026
47 of 48 checks passed