Improve CPU float64 exp SIMD accuracy #3740
Draft
chrismicah wants to merge 1 commit into
Collaborator
We have been experimenting with a google/highway based SIMD implementation (#3019), and the SIMD code will go through a thorough rewrite. So while this is nice to have, we probably won't accept PRs on it in the near term.
Summary
This is narrow partial progress for #3047: it improves CPU `float64` accuracy for `exp` by adding a double-precision SIMD polynomial path in `mlx/backend/cpu/simd/math.h`.
It intentionally does not claim to fix the other fp64 unary ops mentioned in #3047 (`sin`, `cos`, `erf`, `erfinv`). I checked the prior closed PR #3058 and kept this patch aligned with the maintainer feedback there: this is not a scalar element-wise `std::exp` fallback.
Evidence
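As background for the numbers in this section, here is a minimal sketch of what "fp32-scale error" looks like for `exp`. It assumes the error comes from evaluating in single precision and widening the result; I have not confirmed that this is the mechanism in the pre-change MLX code path. The helper name `max_abs_error_fp32_exp` is hypothetical and not part of MLX.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical helper (not an MLX function): evaluates exp in single
// precision, widens the result to double, and reports the largest absolute
// deviation from the double-precision std::exp.
double max_abs_error_fp32_exp(const std::vector<double>& xs) {
  double worst = 0.0;
  for (double x : xs) {
    // std::exp has a float overload, so this is a genuine fp32 evaluation.
    const float y32 = std::exp(static_cast<float>(x));
    const double err = std::fabs(static_cast<double>(y32) - std::exp(x));
    worst = std::max(worst, err);
  }
  return worst;
}
```

Called on `{0.1, 0.5, 0.9, 1.25}`, this should return an error many orders of magnitude above double-precision rounding, which is the kind of gap the pre-change measurement points at.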
Pre-change, CPU `float64` `exp([0.1, 0.5, 0.9, 1.25])` showed fp32-scale error from another local checkout at `0.32.0.dev20260621+5abdd04b`: `2.783627603974992e-07`
Post-change, after rebuilding the local editable MLX package from this branch:
- `python -m pip install -e . --no-build-isolation` passed
- `float64` `exp` max absolute error: `4.440892098500626e-16`, `1.3877787807814457e-17`
- `python -m pytest python/tests/test_double.py -q` passed: `10 passed`
- `python -m pytest python/tests/test_ops.py -q -k 'exp or softmax'` passed: `7 passed, 134 deselected`
I also added a C++ `ops_tests.cpp` regression that checks `float64` `exp` against `std::exp` at `1e-12` tolerance. I could not run the C++ doctest binary locally because the editable build tree did not include the C++ test executable / doctest headers, so this is a draft until that is covered by CI or a local C++ test build.
Notes
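For readers unfamiliar with the technique, here is a scalar sketch of a double-precision polynomial `exp` together with a check against `std::exp` at `1e-12`, in the spirit of the regression described under Evidence. This is not the code in `mlx/backend/cpu/simd/math.h` and not the actual `ops_tests.cpp` test: the names `exp_poly_sketch`, `max_abs_error_vs_std_exp`, and `check_exp_sketch` are hypothetical, the polynomial is a plain degree-13 Taylor series rather than whatever coefficients the patch uses, and the test inputs are assumed. It is scalar only for readability; a SIMD path would apply the same arithmetic per lane.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical scalar sketch. Assumes a finite input of moderate magnitude:
// NaN, infinities, overflow, and underflow are not handled.
double exp_poly_sketch(double x) {
  // ln(2) split into high and low parts (Cody-Waite style, fdlibm constants)
  // so that the reduced argument keeps full double precision.
  constexpr double kLog2e = 1.44269504088896338700e+00;
  constexpr double kLn2Hi = 6.93147180369123816490e-01;
  constexpr double kLn2Lo = 1.90821492927058770002e-10;

  // Range reduction: x = k * ln2 + r with |r| <= ln2 / 2.
  const double k = std::nearbyint(x * kLog2e);
  const double r = (x - k * kLn2Hi) - k * kLn2Lo;

  // Degree-13 Taylor polynomial for exp(r), evaluated in Horner form:
  // 1 + r/1 * (1 + r/2 * (1 + ... * (1 + r/13))).
  double p = 1.0;
  for (int n = 13; n >= 1; --n) {
    p = 1.0 + p * (r / n);
  }

  // Reconstruct exp(x) = 2^k * exp(r).
  return std::ldexp(p, static_cast<int>(k));
}

// Largest absolute deviation of the sketch from std::exp over the inputs.
double max_abs_error_vs_std_exp(const std::vector<double>& xs) {
  double worst = 0.0;
  for (double x : xs) {
    worst = std::max(worst, std::fabs(exp_poly_sketch(x) - std::exp(x)));
  }
  return worst;
}

// Regression-style check at the 1e-12 tolerance (inputs are assumed).
void check_exp_sketch() {
  const std::vector<double> xs = {0.1, 0.5, 0.9, 1.25};
  assert(max_abs_error_vs_std_exp(xs) < 1e-12);
}
```

The split of `ln(2)` matters because subtracting `k * ln2` in a single step would lose low-order bits of the reduced argument; the degree is chosen so the truncation error for `|r| <= ln2 / 2` falls below double-precision rounding.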