Add initial tuning for M5 pro and max#3211

Open
jagrit06 wants to merge 2 commits into main from m5-max-tuning

jagrit06 commented Mar 5, 2026

Proposed changes

Adds kernel tunings for the larger M5 devices (M5 Pro and M5 Max).

Checklist

Put an x in the boxes that apply.

  • I have read the CONTRIBUTING document
  • I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the necessary documentation (if needed)

jagrit06 marked this pull request as ready for review March 10, 2026 22:50
jagrit06 (Member, Author) commented

Here are some brief numbers on M5 Max:

  B,     M,     N,     K,   dtype,  t,  tflops_mx
  1,  4096,  4096,  4096, float16, nn,     58.196
  1,  4096, 14336,  4096, float16, nn,     59.639
  1,  4096,  4096, 14336, float16, nn,     54.875
  1,  4096,  4096,  4096, float16, nt,     58.375
  1,  4096, 14336,  4096, float16, nt,     58.147
  1,  4096,  4096, 14336, float16, nt,     52.706
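
For readers who want to relate the tflops_mx column to wall-clock time, here is a minimal sketch. It assumes the usual convention of counting 2 * B * M * N * K floating-point operations per matmul (one multiply and one add per inner-product term); the benchmark script that produced the table is not shown here, so that convention and the helper names below are assumptions, not the actual benchmark code.

```python
def matmul_flops(b, m, n, k):
    # Assumed convention: one multiply plus one add per (m, n, k) triple.
    return 2 * b * m * n * k


def tflops(b, m, n, k, seconds):
    # Throughput in TFLOP/s for a matmul that took `seconds` to run.
    return matmul_flops(b, m, n, k) / seconds / 1e12


def implied_ms(b, m, n, k, tflops_value):
    # Inverse of tflops(): per-matmul time in milliseconds implied by a rate.
    return matmul_flops(b, m, n, k) / (tflops_value * 1e12) * 1e3


if __name__ == "__main__":
    # First row of the table: B=1, M=N=K=4096 at 58.196 TFLOP/s.
    print(f"{implied_ms(1, 4096, 4096, 4096, 58.196):.3f} ms per matmul")
```

Under that assumption, the first row corresponds to a bit under 2.4 ms per 4096 x 4096 x 4096 matmul.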

Split-K matmuls need separate work and will benefit from splitting at smaller Ks.
I will update those in a follow-up PR.
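
For context, a split-K matmul partitions the K (reduction) dimension into slices, computes a partial product per slice, and sums the partials. The sketch below illustrates the idea in plain Python; it is not the MLX kernel, and `split_k_matmul` and `num_splits` are hypothetical names chosen for illustration. On a GPU the slices would be computed in parallel and then reduced, whereas here they run sequentially.

```python
def matmul(a, b):
    # Reference matmul on nested lists: a is M x K, b is K x N.
    k, n = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for row in a]


def split_k_matmul(a, b, num_splits):
    # Hypothetical illustration of split-K: slice the K dimension,
    # compute one partial product per slice, and accumulate the partials.
    m, k, n = len(a), len(b), len(b[0])
    chunk = -(-k // num_splits)  # ceil(k / num_splits)
    out = [[0] * n for _ in range(m)]
    for start in range(0, k, chunk):
        stop = min(start + chunk, k)
        for i in range(m):
            for j in range(n):
                out[i][j] += sum(a[i][p] * b[p][j] for p in range(start, stop))
    return out
```

The K threshold at which splitting starts to pay off is the tuning parameter the comment above refers to; this sketch does not model it.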
