Use custom rmsnorm in static attention #16604
base: main
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16604
Note: Links to docs will display an error until the docs builds have been completed.
❌ As of commit bcd5510 with merge base 9510334: 1 New Failure, 1 Cancelled Job, 1 Unrelated Failure.
NEW FAILURE - one job failed.
CANCELLED JOB - one job was cancelled; please retry.
UNSTABLE - one job is marked as unstable, possibly due to flakiness on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Pull request overview
This PR updates the static attention implementation to use the custom RMSNorm class instead of PyTorch's built-in torch.nn.RMSNorm. The custom RMSNorm explicitly casts to fp32 during normalization, which resolves CoreML fp16 conversion errors when exporting models like Qwen that use QK normalization.
Changes:
- Import the custom `RMSNorm` from `executorch.examples.models.llama.norm`
- Replace `torch.nn.RMSNorm` with the custom `RMSNorm` in `StaticAttention.__init__()` when `use_qk_norm=True` (see the sketch below)
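A hedged sketch of what the swap in `StaticAttention.__init__()` might look like; the class shape and attribute names (`head_dim`, `norm_eps`, `q_norm`, `k_norm`) are assumptions for illustration, not copied from the actual diff:

```python
import torch

from executorch.examples.models.llama.norm import RMSNorm  # custom fp32-casting norm


class StaticAttentionSketch(torch.nn.Module):
    """Illustrative stand-in for StaticAttention; only the QK-norm wiring is shown."""

    def __init__(self, config, use_qk_norm: bool = False):
        super().__init__()
        if use_qk_norm:
            # Previously torch.nn.RMSNorm(config.head_dim); now the custom RMSNorm,
            # which normalizes in fp32 and casts back to the input dtype.
            self.q_norm = RMSNorm(config.head_dim, eps=config.norm_eps)
            self.k_norm = RMSNorm(config.head_dim, eps=config.norm_eps)
```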
Comments suppressed due to low confidence (2)
examples/models/llama/static_attention.py:862
- The default parameter `rms_norm_class=torch.nn.RMSNorm` is inconsistent with the custom `RMSNorm` now used in `__init__`. When `from_attention_mha` is called without specifying `rms_norm_class`, it will create a `StaticAttention` instance with the custom `RMSNorm` (via `__init__`), but `load_weights_from_attention_mha` will then replace the norms with `torch.nn.RMSNorm` instances, defeating the purpose of this PR. Change the default to `RMSNorm` to maintain consistency.
`rms_norm_class=torch.nn.RMSNorm,`
examples/models/llama/static_attention.py:1101
- The default parameter `rms_norm_class=torch.nn.RMSNorm` should be changed to `RMSNorm` to be consistent with the custom `RMSNorm` now used in `__init__`. This ensures that when loading weights, the norms use the same custom implementation that handles the fp32 casting for CoreML compatibility (see the sketch below).
`self, other: AttentionMHA, rms_norm_class=torch.nn.RMSNorm`
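The reviewer's suggestion amounts to changing the default argument; a hedged sketch based only on the signature line quoted above (the real method lives on `StaticAttention` and its body is not shown here):

```python
from executorch.examples.models.llama.norm import RMSNorm

# Before (quoted above): rms_norm_class=torch.nn.RMSNorm
# After (suggested):     rms_norm_class=RMSNorm
def load_weights_from_attention_mha(self, other, rms_norm_class=RMSNorm):
    # Defaulting to the custom RMSNorm keeps the norms created here consistent
    # with the ones created in __init__, instead of silently reverting to
    # torch.nn.RMSNorm when the caller omits rms_norm_class.
    ...
```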
@sxu can you make sure these changes are OK?
Summary
Use custom RMSNorm for static attention. This is what's used in mha attention as well.
executorch/examples/models/llama/attention.py
Lines 365 to 366 in 9cbe754
Test plan
Exporting Qwen with CoreML. The Qwen config sets `use_qk_norm` to true, and the export then fails with a CoreML fp16 conversion error. Looks like the custom `RMSNorm` explicitly casts to fp32 (see the reference and sketch below).
executorch/examples/models/llama/norm.py
Line 54 in 9cbe754
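For orientation, the fp32-casting pattern referenced above typically looks like a standard Llama-style RMSNorm; this is a minimal sketch, not the exact contents of `norm.py`:

```python
import torch


class RMSNorm(torch.nn.Module):
    """Minimal Llama-style RMSNorm sketch: normalize in fp32, cast back to input dtype."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = torch.nn.Parameter(torch.ones(dim))

    def _norm(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The explicit .float() keeps the normalization in fp32, which is what
        # sidesteps the CoreML fp16 conversion error described in the test plan.
        return self._norm(x.float()).type_as(x) * self.weight
```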