Parser performance experiments (record) by JanJakes · Pull Request #426 · WordPress/sqlite-database-integration

JanJakes · 2026-06-06T15:36:06Z

Record-keeping branch consolidating the MySQL parser/lexer performance experiments explored while optimizing the pure-PHP parser. The shipped optimizations are in #378 (built on #373/#375/#376); the optional native Rust extension is #381/#423. This branch is the catalog of everything else that was prototyped, measured, or evaluated — most of it previously living only in throwaway local branches or ephemeral sessions.

Each experiment is its own directory (code where it exists + a NOTES.md: idea, how to run, result, verdict) and its own commit. Branch is based directly on trunk and only adds files under experiments/. Catalog + harness: be95d4b.

Numbers are parse-only on the 69,577-query MySQL server corpus, warm tracing JIT, MacBook Pro M4 / PHP 8.5.5 (≈10–15% drift; treat as ratios). Baselines: trunk parser ≈28K QPS, optimized parser (#378) ≈57K, pure-regex recognition ≈98K, validate-only ≈246K. "—" = not measured (proposal/reasoning only).

Draft, closed immediately — kept purely as a record.

| Experiment | Result | Verdict | Commit |
| --- | --- | --- | --- |
| Whole-grammar PHP compilation | +18–20% no JIT, −6–8% with JIT | Wrong shape for the JIT | `c9b364f` |
| Method-size capping + runtime fallback | ~0.92× | Always slower than the interpreter | `dfcf256` |
| Alternative AST data structures | array −5%, tape +144% | AST build is ~77% of parse time; no easy node-shape win | `0b998c1` |
| Pratt parser for the expression cascade | — | Estimated 5–25%; worth a try | `4fb6027` |
| LL(2) selectors | — | Estimated 5–15%; high effort | `eba6f84` |
| Table-driven LALR(1) | — | No clear win in PHP | `e050bd8` |
| pack / unpack for hot-path lookups | ~4× slower lookups | Good for bulk decode only | `8a159ee` |
| Full PCRE grammar recognizer | ~98K QPS, 99.85% recognized | Recognizer, not a parser | `c2cb3d9` |
| Regex pre-validate + parser hybrid | ~50K vs ~65K | The gate is pure overhead | `0213f11` |
| Multi-shape regex → direct AST | ~1.2× overall | Works; modest and orthogonal | `942d18e` |
| PCRE2 captures for AST extraction | ~4.6× JIT collapse | Infeasible in stock PHP | `4ad1df5` |
| PCRE2 callouts via FFI | 29K–314K by callout density | Works, but FFI rarely available | `03257ba` |
| Iterative preg_replace_callback shift-reduce | ~2.5× slower | Per-call cost too high | `810a499` |
| Bottom-up reduction with binary encoding | ~20–30K ceiling | Same wall as shift-reduce | `7b1311f` |
| Oniguruma `(?@...)` capture trees | 31-group cap | Too small; unreachable in PHP | `38868e0` |
| strtr blind reduction | ~2650× slower | Dead end: scans the whole table per call | `2af208b` |
| Native tree builders (json_decode/unserialize/DOM) | — | Circular: the transform is the parse | `90aa5dc` |
| parle PECL extension | — | Estimated 3–10×; needs PECL | `85476a7` |
| Other PHP parser libs (PHP-PEG/Hoa/Phlexy) | — | None beat the current parser | `ead2eb2` |
| SQLite as the parser | — | No parse tree; classifier at best | `071a8ca` |
| AST cache keyed by parameterized template | ~2.4× on repeats | Win on repeats, loss on unique queries | `60d6035` |
| Native Rust parser extension | 1.33× with JIT, 2.19× without | Only ~1.3× over optimized PHP under JIT | `fec4ae6` |

Consolidates the parser/lexer performance experiments explored alongside the shipped optimizations (PR #378, built on #373/#375/#376). One directory and commit per approach; each has code and/or a NOTES.md with idea, method, result, verdict.

JanJakes closed this Jun 6, 2026

JanJakes added 23 commits June 6, 2026 17:48

Add whole-grammar PHP compilation experiment

c9b364f

Compile every rule to a dedicated method. +18-20% without JIT, -6-8% under tracing JIT (huge methods exceed the JIT trace-length limit). From local branch _parser_perf.

Add method-size capping with runtime fallback experiment

dfcf256

Cap compiled method size and stub the rest back to the interpreter. Rescues the no-cap JIT loss but plateaus ~0.92x; never reaches parity. From local branch _parser_perf.

Add alternative AST data-structure experiment

0b998c1

object vs validate-only vs array vs flat-int-tape nodes. Validate-only ceiling ~246K (AST build is ~77% of parse time); array -5%; an in-place-truncation tape builds ~2.4x faster but is not a usable tree.
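The ~77% figure can be sanity-checked from the throughput numbers quoted in the PR description. The sketch below is in Python rather than PHP, purely for illustration. It assumes the validate-only ceiling (~246K QPS) is compared against the optimized parser (~57K QPS), which the notes do not state explicitly. It also converts the table's "+144%" for the tape into the "~2.4x" quoted here.

```python
# Throughput figures quoted in the PR description (queries per second).
OPTIMIZED_QPS = 57_000       # optimized parser (#378), builds the full AST
VALIDATE_ONLY_QPS = 246_000  # same grammar walk with no AST construction


def ast_build_share(full_qps, validate_qps):
    """Fraction of per-query time that disappears when no AST is built."""
    full_time = 1.0 / full_qps
    validate_time = 1.0 / validate_qps
    return (full_time - validate_time) / full_time


def percent_gain_to_ratio(percent):
    """Convert a '+N%' throughput gain into an 'x times faster' ratio."""
    return 1.0 + percent / 100.0


share = ast_build_share(OPTIMIZED_QPS, VALIDATE_ONLY_QPS)
tape_ratio = percent_gain_to_ratio(144)
print(f"AST build share: {share:.1%}, tape speedup: {tape_ratio:.2f}x")
```

Under that assumption the share comes out just under 77%, consistent with the stated figure given the ≈10–15% run-to-run drift.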

Add Pratt expression-cascade proposal

4fb6027

Operator-precedence inner loop for the expr->...->simpleExpr chain. Estimated 5-25% on expression-heavy queries; not prototyped.
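Since this proposal was not prototyped, the following is only a minimal sketch of the general Pratt (precedence-climbing) technique, written in Python for illustration. The binding powers, the tokenizer, and the tuple-shaped tree are all hypothetical; the real MySQL expression grammar has many more levels and operators.

```python
import re

# Hypothetical binding powers (higher binds tighter); not the real grammar.
BINDING_POWER = {"OR": 1, "AND": 2, "=": 3, "+": 4, "-": 4, "*": 5, "/": 5}
TOKEN_RE = re.compile(r"\s*(\w+|[=+\-*/()])")


def tokenize(sql):
    """Split an expression into word and operator tokens."""
    return TOKEN_RE.findall(sql)


def parse_expr(tokens, pos=0, min_bp=0):
    """Return (tree, next_pos); one loop stands in for a chain of rule calls."""
    tok = tokens[pos]
    if tok == "(":
        left, pos = parse_expr(tokens, pos + 1, 0)
        pos += 1  # skip the closing ")"
    else:
        left, pos = tok, pos + 1
    while pos < len(tokens):
        op = tokens[pos]
        bp = BINDING_POWER.get(op.upper())
        if bp is None or bp <= min_bp:
            break  # not an operator, or it binds no tighter than our caller
        right, pos = parse_expr(tokens, pos + 1, bp)
        left = (op, left, right)  # same-level operators associate to the left
    return left, pos


tree, _ = parse_expr(tokenize("a + 1 * 2 = b AND c"))
print(tree)
```

The point of the design is that an atom such as a bare column name is recognized after one call and one failed table lookup, instead of descending through every precedence-level rule between expr and simpleExpr.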

Add LL(2) selectors proposal and supporting analysis

eba6f84

2-token lookahead to remove residual backtracking. Measured premise: 32.7% of rules are multi-candidate and absorb ~51% of parse calls. Estimated 5-15% at high cost; not prototyped.
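As a rough illustration of what a 2-token selector changes, here is a small Python sketch. The rule names and both lookup tables are hypothetical and far smaller than the real grammar; the only point is that a second token of lookahead can narrow a multi-candidate rule to a single alternative before any of them is attempted.

```python
# Hypothetical 1-token selector: several candidates share a first token,
# so the parser has to try them in order and backtrack on failure.
LL1_CANDIDATES = {
    "CREATE": ["createTable", "createIndex", "createView"],
    "DROP": ["dropTable", "dropIndex"],
    "SELECT": ["selectStatement"],
}

# Hypothetical 2-token selector: the (first, second) token pair picks one.
LL2_SELECTOR = {
    ("CREATE", "TABLE"): "createTable",
    ("CREATE", "INDEX"): "createIndex",
    ("CREATE", "VIEW"): "createView",
    ("DROP", "TABLE"): "dropTable",
    ("DROP", "INDEX"): "dropIndex",
}


def select_ll1(tokens):
    """Candidates to try, in order, using only the first token."""
    return LL1_CANDIDATES.get(tokens[0], [])


def select_ll2(tokens):
    """Use two tokens when the pair is known; otherwise fall back to LL(1)."""
    second = tokens[1] if len(tokens) > 1 else None
    rule = LL2_SELECTOR.get((tokens[0], second))
    return [rule] if rule is not None else select_ll1(tokens)


tokens = ["CREATE", "VIEW", "v", "AS", "SELECT", "1"]
print(select_ll1(tokens), select_ll2(tokens))
```

In this toy case the LL(1) path would attempt and abandon two alternatives before reaching createView, while the LL(2) path goes straight to it. The "high effort" verdict presumably reflects the work of generating such tables for the whole grammar.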

Add table-driven LALR(1) proposal

e050bd8

kmyacc/nikic-style action-goto table interpreter. Reality check: hand-written RD (tolerant-php-parser) ~40% faster than kmyacc-LALR (nikic) on PHP source. Not prototyped.

Add packed-binary vs PHP-array lookup microbench

8a159ee

pack/unpack loses ~4x on hot-path random lookups but wins ~5x on bulk decode.
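The two access patterns being compared can be sketched as follows, using Python's struct module as a stand-in for PHP's pack/unpack. The table contents are hypothetical, and timings in Python would not reproduce the PHP ratios quoted above; the sketch only shows what "hot-path random lookup" and "bulk decode" mean for a packed table.

```python
import struct

# Hypothetical lookup table; the real parser tables are not reproduced here.
values = [i * i % 251 for i in range(1000)]

# Fixed-width little-endian uint32 encoding of the whole table.
blob = struct.pack(f"<{len(values)}I", *values)


def lookup_list(i):
    """Hot-path lookup against a native list (PHP-array equivalent)."""
    return values[i]


def lookup_packed(i):
    """Hot-path lookup that decodes one 4-byte entry on every call."""
    return struct.unpack_from("<I", blob, 4 * i)[0]


def decode_all():
    """Bulk decode: one call turns the whole blob back into integers."""
    return list(struct.unpack(f"<{len(blob) // 4}I", blob))


print(lookup_list(123), lookup_packed(123), len(decode_all()))
```

Each packed lookup pays a function call and a decode for a single entry, which is where a per-lookup penalty would come from; the bulk decode amortizes that cost over the whole table.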

Add full-PCRE grammar recognizer experiment

c2cb3d9

76KB pattern, 1127 named subroutines, ~98K QPS, 99.85% recognized. A recognizer, not a parser. From local branch _parser_perf.

Add regex pre-validate + parser hybrid experiment

0213f11

Slower than the parser alone (~50K vs ~65K): nearly all input is valid, so the gate is pure overhead. From local branch _parser_perf.
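A simple cost model makes the "pure overhead" verdict concrete. The Python sketch below assumes the hybrid's per-query time is the gate time plus the parse time for every query that passes, and that essentially all queries pass. The gate cost it derives is an estimate implied by the two quoted throughputs, not a measured number.

```python
def implied_gate_seconds(hybrid_qps, parser_qps, pass_rate=1.0):
    """Solve hybrid_time = gate + pass_rate * parse_time for the gate cost."""
    return 1.0 / hybrid_qps - pass_rate / parser_qps


def predicted_hybrid_qps(gate_seconds, parser_qps, pass_rate=1.0):
    """Throughput of gate-then-parse for a given share of valid input."""
    return 1.0 / (gate_seconds + pass_rate / parser_qps)


# Figures quoted for this experiment: ~50K QPS hybrid vs ~65K QPS parser alone.
gate = implied_gate_seconds(50_000, 65_000)
print(f"implied gate cost: {gate * 1e6:.1f} us per query")
```

Under this model the gate only pays off when the pass rate is low enough that the skipped parses outweigh the gate cost, which is not the case for a corpus of almost entirely valid queries.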

Add multi-shape regex direct-AST fast parser

942d18e

Per-shape PCRE2 union (*MARK) builds the tree directly for ~19% of queries; ~1.18x overall, byte-identical AST. From local branch parser-fast-path.
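The idea can be sketched in Python as below. This is not the PR's code: the two query shapes, the group names, and the tuple-shaped tree are hypothetical, and Python's re module has no (*MARK) verb, so the sketch checks which named alternative matched instead.

```python
import re

# Hypothetical query shapes; each alternative is wrapped in its own named group.
SHAPES = {
    "select_by_key": (
        r"SELECT\s+(?P<s_col>\w+)\s+FROM\s+(?P<s_tbl>\w+)"
        r"\s+WHERE\s+(?P<s_key>\w+)\s*=\s*(?P<s_val>\d+)"
    ),
    "delete_by_key": (
        r"DELETE\s+FROM\s+(?P<d_tbl>\w+)"
        r"\s+WHERE\s+(?P<d_key>\w+)\s*=\s*(?P<d_val>\d+)"
    ),
}
UNION = re.compile(
    "|".join(f"(?P<{name}>{body})" for name, body in SHAPES.items()),
    re.IGNORECASE,
)


def fast_parse(sql):
    """Build a tree directly for known shapes; None means use the full parser."""
    m = UNION.fullmatch(sql.strip())
    if m is None:
        return None
    if m.group("select_by_key") is not None:
        where = ("where", ("=", m["s_key"], int(m["s_val"])))
        return ("select", [("column", m["s_col"])], ("table", m["s_tbl"]), where)
    where = ("where", ("=", m["d_key"], int(m["d_val"])))
    return ("delete", ("table", m["d_tbl"]), where)


print(fast_parse("SELECT name FROM users WHERE id = 7"))
print(fast_parse("SELECT a, b FROM t"))
```

Queries that match no shape fall through to the regular parser, which is why the overall gain is bounded by the share of queries the shapes cover (~19% in the measurement above).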

Add PCRE2 capture/trace AST-extraction experiment

4ad1df5

Three walls: compile complexity limit, ~4.6x JIT collapse on captures around recursion, ~26us export with ~1400 named groups. Infeasible in stock PHP. From local branch parser-fast-path.

Add PCRE2-callouts-via-FFI experiment and correction

03257ba

Binding a PHP closure to pcre2_set_callout_8 works and yields a structural trace (~314K/63K/29K QPS by callout density). Corrects an earlier probe (dea9df7) that wrongly concluded callouts were blocked. Needs PHP 7.4+ FFI.

Add preg_replace_callback shift-reduce experiment

810a499

Mega-pattern of 4223 RHS alternatives. One no-op pass already ~2.5x slower than the parser; epsilon branches block bottom-up reduction.

Add binary-encoded bottom-up reduction experiment

7b1311f

Fixed-width binary encodings hit the same encoding-independent ~20-30K per-call floor; the 4-byte variant won't compile. Same wall, different direction.

Add Oniguruma capture-trees finding

38868e0

ONIG_MAX_CAPTURE_HISTORY_GROUP=31 (far too small) and PHP mbstring exposes no capture-tree accessor. Source finding, not runnable in PHP.

Add strtr blind-reduction experiment

2af208b

strtr iterate-to-stable is ~2650x slower than hand RD: it scans the whole table per call. Dead end.
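For readers unfamiliar with the approach, here is a toy Python sketch of iterate-to-stable reduction. The four-rule table and the one-character symbol encoding are hypothetical, and the strtr function below only emulates PHP's array form of strtr (longest key first, single left-to-right pass) with a regex alternation.

```python
import re

# Hypothetical reduction table: right-hand side -> left-hand side.
TABLE = {"E+E": "E", "E*E": "E", "(E)": "E", "n": "E"}


def strtr(text, table):
    """Single pass, longest key first, like PHP's strtr() with an array."""
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: table[m.group(0)], text)


def reduce_to_stable(text, table):
    """Apply the table repeatedly until a pass changes nothing."""
    calls = 0
    while True:
        reduced = strtr(text, table)
        calls += 1
        if reduced == text:
            return text, calls
        text = reduced


print(reduce_to_stable("(n+n)*n", TABLE))
```

Even this seven-symbol input needs five full passes, each one matching the input against the entire table. With a grammar-sized table that per-pass cost is consistent with the slowdown reported above.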

Add native tree-builders reasoning

90aa5dc

json_decode/unserialize/DOMDocument: any SQL->JSON/XML transform that encodes nesting is itself the parse. Structurally circular.

Add parle PECL extension proposal

85476a7

Native C++ LALR(1) (lexertl/parsertl), PHP 7.4+, PECL-only, non-serializable tables (per-cold-worker rebuild). Est. 3-10x where installable; not benchmarked.

Add survey of other PHP parser libraries

ead2eb2

PHP-PEG (packrat memo overhead), Hoa\Compiler (grammar interpreter), Phlexy (lexer-only). None likely to beat the optimized parser.

Add SQLite-as-parser proposal

071a8ca

SQLite exposes no parse tree; EXPLAIN QUERY PLAN is an execution plan, not an AST. At most a syntactic accept/reject classifier.

Add AST cache keyed by parameterized template

60d6035

Cache the AST on a parameterized token-stream signature. ~2-2.4x on repeat-heavy workloads, net loss on unique queries. Reference artifacts from local branch ast-cache.
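A minimal Python sketch of the caching scheme follows. The tokenizer, the "?" placeholder, and the stand-in parser are assumptions for illustration; the actual signature format and cache structure live in the referenced branch and are not reproduced here.

```python
import re

# Hypothetical tokenizer: quoted strings, integers, words, single punctuation.
TOKEN_RE = re.compile(r"'(?:[^'\\]|\\.)*'|\d+|\w+|[^\s\w]")


def signature(sql):
    """Replace literals with '?' and return (shape, extracted literals)."""
    shape, params = [], []
    for tok in TOKEN_RE.findall(sql):
        if tok[0] == "'" or tok.isdigit():
            shape.append("?")
            params.append(tok)
        else:
            shape.append(tok)
    return tuple(shape), params


class AstCache:
    """Cache one AST template per parameterized token-stream signature."""

    def __init__(self, parse):
        self._parse = parse  # stand-in for the full parser: shape -> tree
        self._templates = {}
        self.hits = 0

    def get(self, sql):
        shape, params = signature(sql)
        if shape in self._templates:
            self.hits += 1
        else:
            self._templates[shape] = self._parse(shape)
        return self._templates[shape], params


cache = AstCache(parse=lambda shape: ("stmt", shape))  # hypothetical parser
first = cache.get("SELECT * FROM t WHERE id = 42")
second = cache.get("SELECT * FROM t WHERE id = 7")
print(cache.hits, first[1], second[1])
```

Every lookup pays for tokenizing and building the signature. A miss then pays for the full parse on top, which is one way to read the net loss on unique queries.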

Add native Rust extension write-up

fec4ae6

The one native path actually shipped (PR #381/#423; the optimized pure-PHP parser it is compared against is #378). ~1.33x over optimized PHP under JIT and ~2.19x without JIT (the original ~10x conflated lazy-AST + no-JIT). Code lives on those branches; documented here.

JanJakes reopened this Jun 6, 2026

JanJakes force-pushed the performance-experiments branch from a6be40b to fec4ae6 on June 6, 2026, 15:55

JanJakes closed this Jun 6, 2026

JanJakes wants to merge 23 commits into trunk from performance-experiments
