
[mypyc] Add librt.strings.isalnum codepoint primitive #21509

Merged
p-sawicki merged 2 commits into python:master from VaggelisD:librt-strings-isalnum on May 19, 2026

Conversation

@VaggelisD
Contributor

3rd PR for #21418, mirroring librt.strings.isdigit.

Measured on a microbenchmark this is roughly 30-40% faster for a char

Wraps `Py_UNICODE_ISALNUM` for the codepoint fast path, mirroring the
already-merged `librt.strings.isspace` (python#21462) and `isdigit` (python#21504).

Microbenchmark, both paths mypyc-compiled, scanning 2.5M codepoints
per call: `s[i].isalnum()` runs at ~6.1 ns/codepoint; the codepoint
path `c: i32 = i32(ord(s[i])); isalnum(c)` at ~4.8 ns/codepoint,
roughly 1.3x faster. The gain is larger inside tokenizer-style loops
that mix `isalnum` with literal-i32 compares (no per-character `str`
materialization at all).
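The two paths compared in the microbenchmark can be sketched roughly as follows. This is an illustrative sketch, not the benchmark from the PR: the helper names `count_alnum_str`, `count_alnum_codepoint`, and `scan_word` are hypothetical, and the `except ImportError` branch is a pure-Python stand-in (an assumption) for environments whose `librt` build does not yet ship `isalnum`. In mypyc-compiled code the codepoint would be typed as `i32`, as in the commit message; plain `int` is used here so the snippet also runs interpreted.

```python
try:
    # Only present in librt builds that include this primitive.
    from librt.strings import isalnum
except ImportError:
    def isalnum(c: int) -> bool:
        """Stand-in (assumption): classify a codepoint like str.isalnum()."""
        return chr(c).isalnum()


def count_alnum_str(s: str) -> int:
    """Baseline path: every s[i] materializes a one-character str."""
    n = 0
    for i in range(len(s)):
        if s[i].isalnum():
            n += 1
    return n


def count_alnum_codepoint(s: str) -> int:
    """Codepoint path: classify the integer codepoint directly."""
    n = 0
    for i in range(len(s)):
        c = ord(s[i])  # compiled form: c: i32 = i32(ord(s[i]))
        if isalnum(c):
            n += 1
    return n


def scan_word(s: str, pos: int) -> int:
    """Tokenizer-style loop: mixes isalnum with a literal compare for '_'.

    Returns the index one past the last identifier-like character.
    """
    end = pos
    while end < len(s):
        c = ord(s[end])
        if c != 0x5F and not isalnum(c):  # 0x5F is '_'
            break
        end += 1
    return end
```

Both counting functions should agree on any input; the difference is only in how much work each iteration does once compiled.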

o = ord(c)
assert isspace(o) == isspace(i) == a.isspace()
assert isdigit(o) == isdigit(i) == a.isdigit()
assert isalnum(o) == isalnum(i) == a.isalnum()
Collaborator

@p-sawicki p-sawicki May 19, 2026


i think we're missing coverage for calling these functions through the python wrappers because here they are transformed into direct C function calls.

could you add a driver.py in this test and call them with a couple of values? doesn't have to be the entire space like in the compiled file. would be good to also test the exception raised when the codepoint is outside of int32 range.

edit: or instead of driver.py wrap the librt functions with Any variables and call through the wrapper. we have a couple of examples in other tests like this.

The existing run-test for the codepoint classifiers exercises only the
compiled fast path: mypyc rewrites `isspace(c)` / `isdigit(c)` /
`isalnum(c)` into direct calls to the underlying C symbols, so the
PyMethodDef wrappers (`cp_isspace`, `cp_isdigit`, `cp_isalnum`) and
their i32 range check never get exercised by the existing test.

Iterate the librt functions in a tuple so the callee is opaque to
mypyc and dispatch falls back to the generic path, hitting the
PyMethodDef wrappers. Also assert the OverflowError raised by the
wrappers' `cp_parse_i32` for inputs outside i32 range.
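The test pattern described above might look something like the sketch below. It is not the test added in the PR. To stay runnable without a `librt` build, the three classifiers here are pure-Python stand-ins, and `_check_i32` is a hypothetical model of the range check the commit message attributes to `cp_parse_i32`; the name `check_via_wrappers` is likewise invented for illustration. In a real mypyc run-test the tuple would hold the functions imported from `librt.strings`.

```python
I32_MIN = -(2**31)
I32_MAX = 2**31 - 1


def _check_i32(value: int) -> int:
    """Hypothetical model of the wrappers' i32 range check."""
    if not I32_MIN <= value <= I32_MAX:
        raise OverflowError("codepoint out of i32 range")
    return value


# Stand-ins (assumption) for the librt.strings codepoint classifiers.
def isspace(c: int) -> bool:
    return chr(_check_i32(c)).isspace()


def isdigit(c: int) -> bool:
    return chr(_check_i32(c)).isdigit()


def isalnum(c: int) -> bool:
    return chr(_check_i32(c)).isalnum()


def check_via_wrappers() -> int:
    """Call each classifier through a loop variable instead of by name.

    Because the callee is chosen at runtime, a compiler cannot replace the
    call with a direct C call, so the generic call path is exercised.
    Returns the number of checks performed.
    """
    checks = 0
    pairs = ((isspace, str.isspace), (isdigit, str.isdigit), (isalnum, str.isalnum))
    for fn, method in pairs:
        for ch in ("a", "7", " ", "_", "\u00e9"):
            assert fn(ord(ch)) == method(ch)
            checks += 1
        for bad in (2**31, -(2**31) - 1):  # just outside i32 on either side
            try:
                fn(bad)
            except OverflowError:
                checks += 1
            else:
                raise AssertionError("expected OverflowError")
    return checks
```

Only a handful of values are checked here, in line with the review comment that the wrapper test does not need to cover the entire codepoint space.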
@github-actions
Contributor

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

Collaborator

@p-sawicki p-sawicki left a comment


thanks!

@p-sawicki p-sawicki merged commit 739a652 into python:master May 19, 2026
25 checks passed
alicederyn pushed a commit to alicederyn/mypy that referenced this pull request May 20, 2026