Parser performance experiments (record) #426

Closed
JanJakes wants to merge 23 commits into trunk from performance-experiments

JanJakes (Member) commented Jun 6, 2026

Record-keeping branch consolidating the MySQL parser/lexer performance experiments explored while optimizing the pure-PHP parser. The shipped optimizations are in #378 (built on #373/#375/#376); the optional native Rust extension is #381/#423. This branch is the catalog of everything else that was prototyped, measured, or evaluated — most of it previously living only in throwaway local branches or ephemeral sessions.

Each experiment is its own directory (code where it exists + a NOTES.md: idea, how to run, result, verdict) and its own commit. Branch is based directly on trunk and only adds files under experiments/. Catalog + harness: be95d4b.

Numbers are parse-only on the 69,577-query MySQL server corpus, warm tracing JIT, MacBook Pro M4 / PHP 8.5.5 (≈10–15% drift; treat as ratios). Baselines: trunk parser ≈28K QPS, optimized parser (#378) ≈57K, pure-regex recognition ≈98K, validate-only ≈246K. "—" = not measured (proposal/reasoning only).

Draft, closed immediately — kept purely as a record.

| Experiment | Result | Verdict | Commit |
| --- | --- | --- | --- |
| Whole-grammar PHP compilation | +18–20% no JIT, −6–8% with JIT | Wrong shape for the JIT | c9b364f |
| Method-size capping + runtime fallback | ~0.92× | Always slower than the interpreter | dfcf256 |
| Alternative AST data structures | array −5%, tape +144% | AST build is ~77% of parse time; no easy node-shape win | 0b998c1 |
| Pratt parser for the expression cascade | — | Estimated 5–25%; worth a try | 4fb6027 |
| LL(2) selectors | — | Estimated 5–15%; high effort | eba6f84 |
| Table-driven LALR(1) | — | No clear win in PHP | e050bd8 |
| pack / unpack for hot-path lookups | ~4× slower lookups | Good for bulk decode only | 8a159ee |
| Full PCRE grammar recognizer | ~98K QPS, 99.85% recognized | Recognizer, not a parser | c2cb3d9 |
| Regex pre-validate + parser hybrid | ~50K vs ~65K | The gate is pure overhead | 0213f11 |
| Multi-shape regex → direct AST | ~1.2× overall | Works; modest and orthogonal | 942d18e |
| PCRE2 captures for AST extraction | ~4.6× JIT collapse | Infeasible in stock PHP | 4ad1df5 |
| PCRE2 callouts via FFI | 29K–314K by callout density | Works, but FFI rarely available | 03257ba |
| Iterative preg_replace_callback shift-reduce | ~2.5× slower | Per-call cost too high | 810a499 |
| Bottom-up reduction with binary encoding | ~20–30K ceiling | Same wall as shift-reduce | 7b1311f |
| Oniguruma (?@...) capture trees | 31-group cap | Too small; unreachable in PHP | 38868e0 |
| strtr blind reduction | ~2650× slower | Dead end: scans the whole table per call | 2af208b |
| Native tree builders (json_decode/unserialize/DOM) | — | Circular: the transform is the parse | 90aa5dc |
| parle PECL extension | — | Estimated 3–10×; needs PECL | 85476a7 |
| Other PHP parser libs (PHP-PEG/Hoa/Phlexy) | — | None beat the current parser | ead2eb2 |
| SQLite as the parser | — | No parse tree; classifier at best | 071a8ca |
| AST cache keyed by parameterized template | ~2.4× on repeats | Win on repeats, loss on unique queries | 60d6035 |
| Native Rust parser extension | 1.33× with JIT, 2.19× without | Only ~1.3× over optimized PHP under JIT | fec4ae6 |

JanJakes closed this Jun 6, 2026
JanJakes added 23 commits June 6, 2026 17:48
Consolidates the parser/lexer performance experiments explored alongside the shipped optimizations (PR #378, built on #373/#375/#376). One directory and commit per approach; each has code and/or a NOTES.md with idea, method, result, verdict.
Compile every rule to a dedicated method. +18-20% without JIT, -6-8% under tracing JIT (huge methods exceed the JIT trace-length limit). From local branch _parser_perf.
Cap compiled method size and stub the rest back to the interpreter. Rescues the no-cap JIT loss but plateaus ~0.92x; never reaches parity. From local branch _parser_perf.
object vs validate-only vs array vs flat-int-tape nodes. Validate-only ceiling ~246K (AST build is ~77% of parse time); array -5%; an in-place-truncation tape builds ~2.4x faster but is not a usable tree.
Operator-precedence inner loop for the expr->...->simpleExpr chain. Estimated 5-25% on expression-heavy queries; not prototyped.
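Since this idea was not prototyped, the following is only a minimal sketch of what an operator-precedence (Pratt-style) inner loop looks like, not the code from this branch. The `BINDING_POWER` table, the `parse_expression` function, and the flat string tokens are all hypothetical; the real grammar has many more precedence levels plus unary and postfix operators.

```php
<?php
// Hypothetical binding powers (higher binds tighter); illustrative only.
const BINDING_POWER = ['OR' => 1, 'AND' => 2, '=' => 3, '+' => 4, '-' => 4, '*' => 5, '/' => 5];

/**
 * Parse one expression starting at $pos, consuming only operators that bind
 * tighter than $minPower. Returns a leaf token or [operator, left, right].
 */
function parse_expression(array $tokens, int &$pos, int $minPower = 0)
{
    $token = $tokens[$pos++];
    if ($token === '(') {
        $left = parse_expression($tokens, $pos, 0);
        $pos++; // skip the closing ')'
    } else {
        $left = $token;
    }

    // One loop replaces the chain of per-precedence-level rule calls.
    while ($pos < count($tokens)) {
        $op = $tokens[$pos];
        $power = BINDING_POWER[$op] ?? null;
        if ($power === null || $power <= $minPower) {
            break; // not an operator, or it belongs to an outer level
        }
        $pos++;
        $right = parse_expression($tokens, $pos, $power);
        $left = [$op, $left, $right];
    }
    return $left;
}

$pos = 0;
$tree = parse_expression(['a', '+', 'b', '*', 'c', '=', 'd'], $pos);
// $tree is ['=', ['+', 'a', ['*', 'b', 'c']], 'd']
```

The potential saving comes from handling every binary level in a single loop instead of descending through one rule per level for each operand.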
2-token lookahead to remove residual backtracking. Measured premise: 32.7% of rules are multi-candidate and absorb ~51% of parse calls. Estimated 5-15% at high cost; not prototyped.
kmyacc/nikic-style action-goto table interpreter. Reality check: hand-written RD (tolerant-php-parser) ~40% faster than kmyacc-LALR (nikic) on PHP source. Not prototyped.
pack/unpack loses ~4x on hot-path random lookups but wins ~5x on bulk decode.
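To make the two access patterns concrete, here is a small sketch under stated assumptions: the table contents, the 32-bit little-endian `V` format, and the `packed_lookup` helper are illustrative choices, not the layout used in the experiment.

```php
<?php
// Hypothetical lookup table: 1,000 small integers standing in for grammar table entries.
$table = [];
for ($i = 0; $i < 1000; $i++) {
    $table[] = ($i * 7) % 251;
}

// Store the table as one binary string of unsigned 32-bit little-endian values.
$packed = pack('V*', ...$table);

// Hot-path random access: every lookup pays for an unpack() call and a temporary array.
function packed_lookup(string $packed, int $index): int
{
    return unpack('V', $packed, $index * 4)[1];
}

// Bulk decode: a single call rebuilds the whole array (unpack() keys start at 1).
$decoded = array_values(unpack('V*', $packed));
```

A plain `$table[$index]` read involves no function call, which is consistent with the per-lookup loss reported above; the single `unpack('V*', ...)` call is the bulk-decode case.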
76KB pattern, 1127 named subroutines, ~98K QPS, 99.85% recognized. A recognizer, not a parser. From local branch _parser_perf.
Slower than the parser alone (~50K vs ~65K): nearly all input is valid, so the gate is pure overhead. From local branch _parser_perf.
Per-shape PCRE2 union (*MARK) builds the tree directly for ~19% of queries; ~1.18x overall, byte-identical AST. From local branch parser-fast-path.
Three walls: compile complexity limit, ~4.6x JIT collapse on captures around recursion, ~26us export with ~1400 named groups. Infeasible in stock PHP. From local branch parser-fast-path.
Binding a PHP closure to pcre2_set_callout_8 works and yields a structural trace (~314K/63K/29K QPS by callout density). Corrects an earlier probe (dea9df7) that wrongly concluded callouts were blocked. Needs PHP 7.4+ FFI.
Mega-pattern of 4223 RHS alternatives. One no-op pass already ~2.5x slower than the parser; epsilon branches block bottom-up reduction.
Fixed-width binary encodings hit the same encoding-independent ~20-30K per-call floor; the 4-byte variant won't compile. Same wall, different direction.
ONIG_MAX_CAPTURE_HISTORY_GROUP=31 (far too small) and PHP mbstring exposes no capture-tree accessor. Source finding, not runnable in PHP.
strtr iterate-to-stable is ~2650x slower than hand RD: it scans the whole table per call. Dead end.
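As an illustration of the iterate-to-stable idea, the sketch below applies it to a toy grammar. The `REDUCTIONS` table, the single-character symbols, and both function names are hypothetical; the experiment's actual encoding is not shown on this page.

```php
<?php
// Hypothetical toy grammar over single-character symbols:
//   E -> n | E + E | ( E )
const REDUCTIONS = ['n' => 'E', 'E+E' => 'E', '(E)' => 'E'];

/** Apply the whole reduction table repeatedly until the string stops changing. */
function reduce_to_stable(string $input): string
{
    do {
        $previous = $input;
        // strtr() tries the longest keys first and never rescans replaced text,
        // so each pass performs one layer of reductions.
        $input = strtr($input, REDUCTIONS);
    } while ($input !== $previous);
    return $input;
}

/** The input is accepted when everything collapses to the start symbol. */
function recognizes(string $input): bool
{
    return reduce_to_stable($input) === 'E';
}
```

Each pass consults the entire table no matter which rule could apply, which matches the per-call cost described above; with a few thousand right-hand sides instead of three, that cost is paid on every iteration.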
json_decode/unserialize/DOMDocument: any SQL->JSON/XML transform that encodes nesting is itself the parse. Structurally circular.
Native C++ LALR(1) (lexertl/parsertl), PHP 7.4+, PECL-only, non-serializable tables (per-cold-worker rebuild). Est. 3-10x where installable; not benchmarked.
PHP-PEG (packrat memo overhead), Hoa\Compiler (grammar interpreter), Phlexy (lexer-only). None likely to beat the optimized parser.
SQLite exposes no parse tree; EXPLAIN QUERY PLAN is an execution plan, not an AST. At most a syntactic accept/reject classifier.
Cache the AST on a parameterized token-stream signature. ~2-2.4x on repeat-heavy workloads, net loss on unique queries. Reference artifacts from local branch ast-cache.
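A minimal sketch of the caching idea, assuming a hypothetical token shape of `[type, text]` pairs and hypothetical `NUMBER`/`STRING` type names; `token_signature` and `AstCache` are illustrative and are not the reference artifacts from the `ast-cache` branch.

```php
<?php
/**
 * Build a signature in which literal tokens are replaced by "?", so queries
 * that differ only in their literal values share one cache key.
 * Assumed token shape: [type, text].
 */
function token_signature(array $tokens): string
{
    $parts = [];
    foreach ($tokens as [$type, $text]) {
        $parts[] = ($type === 'NUMBER' || $type === 'STRING') ? '?' : $text;
    }
    return implode(' ', $parts);
}

final class AstCache
{
    private array $entries = [];
    public int $hits = 0;

    /** Return the cached tree for this signature, parsing only on a miss. */
    public function get(array $tokens, callable $parse)
    {
        $key = token_signature($tokens);
        if (isset($this->entries[$key])) {
            $this->hits++;
            // A real cache would rebind this query's literals into the template tree.
            return $this->entries[$key];
        }
        return $this->entries[$key] = $parse($tokens);
    }
}
```

Every query pays for building the signature, hit or miss, which is one way to read the net loss on unique queries: the extra work is only recovered when a signature repeats.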
The one native path actually shipped (PR #381/#423/#378). ~1.33x over optimized PHP under JIT (the original ~10x conflated lazy-AST + no-JIT). Code lives on those branches; documented here.
JanJakes reopened this Jun 6, 2026
JanJakes force-pushed the performance-experiments branch from a6be40b to fec4ae6 June 6, 2026 15:55
JanJakes closed this Jun 6, 2026