Improve CPU float64 exp SIMD accuracy #3740

Draft
chrismicah wants to merge 1 commit into ml-explore:main from chrismicah:codex/cpu-fp64-exp-simd

@chrismicah

Contributor

Summary

This is narrow partial progress for #3047: it improves CPU float64 accuracy for exp by adding a double-precision SIMD polynomial path in mlx/backend/cpu/simd/math.h.

It intentionally does not claim to fix the other fp64 unary ops mentioned in #3047 (sin, cos, erf, erfinv). I checked the prior closed PR #3058 and kept this patch aligned with the maintainer feedback there: this is not a scalar element-wise std::exp fallback.
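For readers unfamiliar with the technique, here is a minimal scalar sketch of a range-reduced polynomial exp in double precision. It is not the code in this patch: the name `exp_sketch` is hypothetical, the polynomial is a plain degree-13 Taylor series rather than a fitted minimax polynomial, and overflow, underflow, NaN, and infinity handling are omitted. The `LN2_HI` / `LN2_LO` split constants are the ones used by fdlibm. A SIMD version would apply the same arithmetic lane-wise.

```python
import math

# ln(2) split into a high part with trailing zero bits and a low correction,
# so that k * LN2_HI is exact for moderate integer k (fdlibm constants).
LN2_HI = 6.93147180369123816490e-01
LN2_LO = 1.90821492927058770002e-10
INV_LN2 = 1.44269504088896338700e+00

# Taylor coefficients 1/k! for k = 0..13. Assumption for illustration only:
# production kernels usually use a shorter minimax polynomial instead.
COEFFS = [1.0 / math.factorial(k) for k in range(14)]


def exp_sketch(x: float) -> float:
    """Approximate exp(x) for finite x of moderate magnitude."""
    # Range reduction: x = k * ln(2) + r with |r| <= ln(2) / 2.
    k = math.floor(x * INV_LN2 + 0.5)
    r = (x - k * LN2_HI) - k * LN2_LO

    # Evaluate the polynomial in r with Horner's scheme.
    p = 0.0
    for c in reversed(COEFFS):
        p = p * r + c

    # Reconstruct: exp(x) = 2**k * exp(r).
    return math.ldexp(p, k)
```

Splitting ln(2) into two parts keeps the reduced argument accurate, which is what lets the result stay close to double-precision accuracy instead of inheriting fp32-scale error.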

Evidence

Pre-change, in a separate local checkout at 0.32.0.dev20260621+5abdd04b, CPU float64 exp([0.1, 0.5, 0.9, 1.25]) showed fp32-scale error:

  • max absolute error: 2.783627603974992e-07

Post-change, after rebuilding the local editable MLX package from this branch:

  • python -m pip install -e . --no-build-isolation passed
  • CPU float64 exp max absolute error: 4.440892098500626e-16
  • softmax-like fp64 exp-normalization max absolute error: 1.3877787807814457e-17
  • python -m pytest python/tests/test_double.py -q passed: 10 passed
  • python -m pytest python/tests/test_ops.py -q -k 'exp or softmax' passed: 7 passed, 134 deselected
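The error figures above are maximum absolute differences against a double-precision reference. The script used for them is not included here, so the following is only a sketch of how such a measurement could be structured, using the standard library in place of MLX. All names are hypothetical, and `exp_float32_like` is a stand-in that rounds through float32 to illustrate what fp32-scale error looks like; it is not the pre-change MLX code path.

```python
import math
import struct


def to_float32(x: float) -> float:
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def exp_float32_like(x: float) -> float:
    """Hypothetical stand-in: exp with input and output rounded to float32."""
    return to_float32(math.exp(to_float32(x)))


def max_abs_error(approx, xs):
    """Largest absolute difference between approx(x) and math.exp(x)."""
    return max(abs(approx(x) - math.exp(x)) for x in xs)


def softmax_like(exp_fn, xs):
    """Exp-normalization: subtract the max, exponentiate, divide by the sum."""
    m = max(xs)
    e = [exp_fn(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]


xs = [0.1, 0.5, 0.9, 1.25]
err32 = max_abs_error(exp_float32_like, xs)  # fp32-scale error
err64 = max_abs_error(math.exp, xs)          # reference against itself

# Element-wise error of the normalized outputs, as in the softmax-like check.
ref = softmax_like(math.exp, xs)
approx = softmax_like(exp_float32_like, xs)
softmax_err = max(abs(a - b) for a, b in zip(approx, ref))
```

To measure MLX itself, the `approx` callable would be replaced by one that evaluates exp on a float64 array on the CPU and returns the result as a Python float.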

I also added a C++ ops_tests.cpp regression that checks float64 exp against std::exp at 1e-12 tolerance. I could not run the C++ doctest binary locally because the editable build tree did not include the C++ test executable / doctest headers, so this is draft until that is covered by CI or a local C++ test build.

Notes

@zcbenz

zcbenz commented Jun 22, 2026

Collaborator

We have been experimenting with a google/highway based SIMD implementation (#3019), and the SIMD code will go through a thorough rewrite, so while this is nice to have, we probably won't accept PRs on it for now.
