Parser + lexer performance: consolidated 2–3× end-to-end speedup #378

Draft: JanJakes wants to merge 30 commits into trunk from performance


@JanJakes JanJakes commented Apr 28, 2026

Member

Summary

Consolidates the parser optimisations from #373, the lexer + token-construction wins from #375, and the has_child() micro-opt from #376 into a clean, linear history (one shippable change per commit). On top of those it adds a grammar-construction speedup and reworks the native (Rust) parser to materialise its AST eagerly so the parse-node class can be final.

End-to-end (lex+parse) on the 69,577-query MySQL server corpus, pure PHP, best across 3 ABAB-alternated rounds × 5 timed iterations (2 warmup iters per round to heat the tracing JIT):

| Config | Trunk | This branch | Speedup |
| --- | --- | --- | --- |
| Lex-only, JIT | 188,323 QPS | 369,687 QPS | 1.96× |
| Parse-only, JIT | 27,091 QPS | 56,691 QPS | 2.09× |
| End-to-end, JIT | 23,855 QPS | 50,061 QPS | 2.10× |
| End-to-end, no opcache | 10,722 QPS | 28,362 QPS | 2.65× |

PHP 8.5; PHP 8.1 verified within ~5%. Tracing JIT is per-worker and shared across requests, so steady-state traffic hits warm JIT (the 2 warmup iters model that); cold first-request numbers are roughly 1.2× for the JIT configs.
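
A rough sketch of the measurement scheme described above: untimed warmup iterations, then the best of several timed iterations, with the two candidates alternated per round. The function name and signature are hypothetical, and the real benchmark may well run trunk and the branch as separate processes rather than as two callables in one. `hrtime()` requires PHP 7.3+.

```php
<?php
/**
 * Hypothetical helper: best observed queries-per-second for two candidates,
 * alternated A, B, A, B, ... so that machine drift affects both equally.
 */
function best_qps_abab( callable $run_a, callable $run_b, int $query_count, int $rounds = 3, int $warmup = 2, int $timed = 5 ): array {
	$best = array( 'a' => 0.0, 'b' => 0.0 );
	for ( $round = 0; $round < $rounds; $round++ ) {
		foreach ( array( 'a' => $run_a, 'b' => $run_b ) as $name => $run ) {
			// Untimed warmup iterations give the tracing JIT a chance to heat up.
			for ( $i = 0; $i < $warmup; $i++ ) {
				$run();
			}
			// Timed iterations: keep only the best throughput seen.
			for ( $i = 0; $i < $timed; $i++ ) {
				$start = hrtime( true );
				$run();
				$elapsed_ns    = max( 1, hrtime( true ) - $start );
				$best[ $name ] = max( $best[ $name ], $query_count / ( $elapsed_ns / 1e9 ) );
			}
		}
	}
	return $best;
}
```

With the defaults, each candidate runs 3 × (2 + 5) = 21 times, of which 15 are timed.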

Per-process startup cost and memory

The speedup comes from precomputing per-token FIRST sets and branch selectors. The grammar is built once per request (PHP's shared-nothing model resets statics between requests) and cached in self::$mysql_grammar, so the cost is amortised across every query in a request, but short requests pay it up front:

| | Trunk | This branch | Δ |
| --- | --- | --- | --- |
| Grammar construction | ~0.8 ms | ~3.7 ms | +~2.9 ms |
| Grammar memory, resident at startup | ~1.4 MiB | ~9.0 MiB | +~7.6 MiB |
| Peak after parsing the full corpus | ~47 MiB | ~66 MiB | +~19 MiB |

A naive build of these structures cost ~40 ms. A worklist fixpoint for FIRST/NULLABLE (recompute a rule only when a referenced rule grows) plus lazy per-rule selector denormalisation (a typical request touches ~7% of rules; the rest is deferred and amortised into parsing) cut construction to the ~3.7 ms above. Numbers are plain CLI — opcache trims construction further. The extra resident memory is a real tradeoff on memory-constrained shared hosts (a supported no-opcache target).
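
For readers unfamiliar with the worklist idea, here is a minimal sketch, not the code in this PR. It assumes a toy grammar shape (`rule id => list of branches`, each branch a list of symbol ids, where any id that is not a key of `$rules` is a terminal and an empty branch stands for epsilon). The function name is hypothetical.

```php
<?php
/**
 * Worklist fixpoint for FIRST/NULLABLE over a toy grammar.
 * A rule is recomputed only when a rule it references has grown.
 */
function toy_compute_first_sets( array $rules ): array {
	// Reverse index: rule id => set of rules that reference it.
	$dependents = array();
	foreach ( $rules as $rule_id => $branches ) {
		foreach ( $branches as $branch ) {
			foreach ( $branch as $symbol ) {
				if ( isset( $rules[ $symbol ] ) ) {
					$dependents[ $symbol ][ $rule_id ] = true;
				}
			}
		}
	}

	$first    = array_fill_keys( array_keys( $rules ), array() );
	$nullable = array_fill_keys( array_keys( $rules ), false );
	$queue    = array_keys( $rules );
	$queued   = array_fill_keys( $queue, true );

	while ( $queue ) {
		$rule_id = array_pop( $queue );
		unset( $queued[ $rule_id ] );

		$new_first    = $first[ $rule_id ];
		$new_nullable = $nullable[ $rule_id ];
		foreach ( $rules[ $rule_id ] as $branch ) {
			$branch_nullable = true;
			foreach ( $branch as $symbol ) {
				if ( ! isset( $rules[ $symbol ] ) ) {
					$new_first[ $symbol ] = true; // Terminal: it starts the branch.
					$branch_nullable      = false;
					break;
				}
				$new_first += $first[ $symbol ]; // Array union on token-id keys.
				if ( ! $nullable[ $symbol ] ) {
					$branch_nullable = false;
					break;
				}
			}
			$new_nullable = $new_nullable || $branch_nullable;
		}

		$grew = count( $new_first ) > count( $first[ $rule_id ] ) || $new_nullable !== $nullable[ $rule_id ];
		if ( $grew ) {
			$first[ $rule_id ]    = $new_first;
			$nullable[ $rule_id ] = $new_nullable;
			// Only rules that reference this one need another look.
			foreach ( array_keys( $dependents[ $rule_id ] ?? array() ) as $dependent ) {
				if ( ! isset( $queued[ $dependent ] ) ) {
					$queued[ $dependent ] = true;
					$queue[]              = $dependent;
				}
			}
		}
	}
	return array( 'first' => $first, 'nullable' => $nullable );
}
```

Both sets only ever grow, so the loop terminates even for left-recursive rules.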

Native parser: eager AST materialisation

Marking WP_Parser_Node final (a measured +7% win) is incompatible with WP_MySQL_Native_Parser_Node subclassing it. Rather than drop final, the native parser now materialises its arena AST into plain WP_Parser_Node instances at parse time. This removes the entire lazy layer — the per-AST identity cache, the wrapper registry, ~18 wp_sqlite_mysql_native_ast_* bridge functions, and the WP_MySQL_Native_Parser_Node class (~600 LOC Rust + 179 LOC PHP), plus the two wrapper-lifetime test files they backed.

Because the translator walks essentially the whole AST for every query, eager materialisation removes per-node FFI round trips and is neutral-to-faster while far simpler: building the full tree, the native parser runs at ~60,100 QPS vs ~28,600 QPS pure PHP (2.1×) on the corpus. It regresses only a hypothetical consumer that parses but inspects a tiny fraction of a large tree.

The grammar is exchanged with the extension as a runtime ABI, so load.php now pins the supported extension minor line and falls back cleanly to pure PHP on a mismatch (e.g. a plugin update outpacing the installed binary) instead of failing at parse time.

What was kept from #373 / #375 / #376

Cost vs benefit (src LOC, end-to-end JIT)

  • Eight parser optimisations carry ~95% of the ~2× win for ~278 LOC — biggest: per-token FIRST sets to skip unreachable branches (+40%), inline terminal matching + deferred node allocation (+30%), inline single-branch fragments + short-circuit nullable fallback (+16%), final parse-node class (+7%), parent-constructor bypass (+5%).
  • Three lexer dispatch fast paths: +13% end-to-end / +55% lex-only for +42 LOC (inline whitespace skip, identifier/keyword arm at the top of the chain, single-byte operator dispatch table). Lex is only ~13% of parse time under JIT, so end-to-end dilutes the lex gain.
  • Four small opts (~127 LOC, ~+6% combined): end-of-input sentinel token, embedded branch-symbol sequences, rule-id (vs name) comparison, single-candidate direct-return.
  • Zero under JIT but kept: lexer byte-comparison primitives (free for non-JIT shared hosts; the dispatch fast paths build on them), the has_child() one-liner, CS whitespace re-alignment.
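
To illustrate the three lexer fast paths named above, here is a toy tokenizer step. It is a sketch under stated assumptions: the function name, the token names, and the operator subset are hypothetical and do not mirror the real lexer's constants.

```php
<?php
/**
 * Toy tokenizer step. Returns array( type, text ), or null at end of input.
 */
function toy_read_token( string $sql, int &$pos ): ?array {
	// Single-byte operators resolved with one array lookup (hypothetical subset).
	static $single_byte = array(
		'(' => 'OPEN_PAR',
		')' => 'CLOSE_PAR',
		',' => 'COMMA',
		';' => 'SEMICOLON',
		'+' => 'PLUS',
		'*' => 'STAR',
	);
	$length = strlen( $sql );

	// Fast path 1: skip whitespace inline with byte-equality checks.
	while ( $pos < $length ) {
		$byte = $sql[ $pos ];
		if ( ' ' !== $byte && "\t" !== $byte && "\n" !== $byte && "\r" !== $byte ) {
			break;
		}
		$pos++;
	}
	if ( $pos >= $length ) {
		return null;
	}
	$byte = $sql[ $pos ];

	// Fast path 2: identifiers and keywords are the most common tokens, so test them first.
	if ( ( $byte >= 'a' && $byte <= 'z' ) || ( $byte >= 'A' && $byte <= 'Z' ) || '_' === $byte ) {
		$len  = strspn( $sql, 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_', $pos );
		$text = substr( $sql, $pos, $len );
		$pos += $len;
		return array( 'IDENTIFIER', $text );
	}

	// Fast path 3: single-byte operator dispatch table.
	if ( isset( $single_byte[ $byte ] ) ) {
		$pos++;
		return array( $single_byte[ $byte ], $byte );
	}

	// Everything else would go to the slower dispatch chain in a real lexer.
	$pos++;
	return array( 'UNKNOWN', $byte );
}
```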

Test plan

  • composer run test (mysql-on-sqlite) — 721/721, 1,533,741 assertions (2 skipped, 2 incomplete), incl. WP_Parser_Grammar_Tests for the build-time transforms (epsilon stripping, fragment inlining + cycle termination, FIRST/NULLABLE selectors, single-candidate classification, merge_sorted)
  • composer run check-cs — clean
  • CI matrix PHP 7.2–8.5 (incl. PHP 7.2 / SQLite 3.27.0), the native-extension jobs, WordPress PHPUnit/E2E, WASM, and MySQL Proxy — all green
  • PHP 8.5 + PHP 8.1 verified within ~5%
  • Native-extension benchmark confirms the eager-materialisation change (~2.1× pure PHP)

@JanJakes JanJakes force-pushed the performance branch 3 times, most recently from 1c666e2 to c4aff56 Compare April 29, 2026 07:36
@JanJakes JanJakes changed the title Parser + lexer performance: consolidated 2.4× end-to-end speedup Parser + lexer performance: consolidated 2–3× end-to-end speedup Apr 29, 2026
@JanJakes JanJakes marked this pull request as ready for review April 29, 2026 15:04
@JanJakes JanJakes requested a review from adamziel April 29, 2026 15:04
);

while ( true ) {
if (

Collaborator

Suggested change:
- if (
+ // Break on file end
+ if (

@JanJakes JanJakes May 4, 2026

Member Author

Addressed in f9172e1.

while ( true ) {
if (
self::EOF === $this->token_type
|| ( null === $this->token_type && $this->bytes_already_read > 0 )

Collaborator

Shouldn't EOF cover that?

@JanJakes JanJakes May 4, 2026

Member Author

Addressed in f9172e1. EOF and the second arm catch different cases: self::EOF is set when read_next_token() sees a null byte at the start of a token (clean end-of-input). The null === $this->token_type && $this->bytes_already_read > 0 arm catches the case where read_next_token() returned null mid-stream because of an invalid byte. The > 0 guard keeps the very first iteration alive — at that point $this->token_type is still null because nothing has been read yet, not because we've failed.
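
A toy model of the two-arm guard, as a sketch only. `Toy_Lexer` and its tokenizer rules (lowercase words separated by spaces, anything else invalid) are hypothetical, and it detects end-of-input by length rather than by a null byte.

```php
<?php
final class Toy_Lexer {
	const EOF = 'EOF';

	private $sql;
	private $bytes_already_read = 0;
	private $token_type         = null;

	public function __construct( string $sql ) {
		$this->sql = $sql;
	}

	/** Toy tokenizer: sets token_type to 'WORD', EOF, or null on an invalid byte. */
	private function read_next_token() {
		$length = strlen( $this->sql );
		while ( $this->bytes_already_read < $length && ' ' === $this->sql[ $this->bytes_already_read ] ) {
			$this->bytes_already_read++;
		}
		if ( $this->bytes_already_read >= $length ) {
			$this->token_type = self::EOF; // Clean end of input.
			return;
		}
		$word_length = strspn( $this->sql, 'abcdefghijklmnopqrstuvwxyz', $this->bytes_already_read );
		if ( 0 === $word_length ) {
			$this->bytes_already_read++;
			$this->token_type = null; // Invalid byte mid-stream.
			return;
		}
		$this->bytes_already_read += $word_length;
		$this->token_type          = 'WORD';
	}

	/** Returns the token types read until EOF or the first invalid byte. */
	public function remaining_tokens(): array {
		$types = array();
		while ( true ) {
			if (
				self::EOF === $this->token_type
				|| ( null === $this->token_type && $this->bytes_already_read > 0 )
			) {
				break;
			}
			$this->read_next_token();
			if ( null !== $this->token_type ) {
				$types[] = $this->token_type;
			}
		}
		return $types;
	}
}
```

Without the `> 0` guard the loop would exit before reading anything, because `token_type` starts out as null.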

$next_byte = $this->sql[ $this->bytes_already_read + 1 ] ?? null;

if ( "'" === $byte || '"' === $byte || '`' === $byte ) {
// A map for a single-byte symbol fast path.

Collaborator

nice

(
( $byte >= 'a' && $byte <= 'z' )
|| ( $byte >= 'A' && $byte <= 'Z' )
|| $byte > "\x7F"

Collaborator

I'd leave a comment on why \x7F is special here

@JanJakes JanJakes May 4, 2026

Member Author

Addressed in f9172e1.

|| ( $byte >= 'A' && $byte <= 'Z' )
|| $byte > "\x7F"
)
&& "'" !== $next_byte

Collaborator

Why just ' and not "? Would any quotes-related sql mode/session options have impact here?

@JanJakes JanJakes May 4, 2026

Member Author

Addressed in f9172e1.

  $type = $this->read_line_comment();
- } elseif ( null !== $byte && strspn( $byte, self::WHITESPACE_MASK ) > 0 ) {
+ } elseif (
+ ' ' === $byte

@adamziel adamziel May 3, 2026

Collaborator

Would array + isset() be faster?

Member Author

Marginally faster, but this branch rarely fires. next_token() and remaining_tokens() inline-skip whitespace before calling read_next_token() (commit f5b8932), so this arm only handles whitespace that appears between comments. Keeping the === chain for consistency with the rest of the dispatch.

  && 'x' === $next_byte
  && null !== $third_byte
- && strspn( $third_byte, self::HEX_DIGIT_MASK ) > 0
+ && false !== strpos( self::HEX_DIGIT_MASK, $third_byte )

Collaborator

clever

* a parse (sub)tree at each level of the full grammar tree.
*/
- class WP_Parser_Node {
+ final class WP_Parser_Node {

Collaborator

does final make it faster somehow?

Member Author

Yes — final lets opcache/JIT skip the vtable check on method calls. Measured at +7% end-to-end, see commit daa4185 and the "Big, robust wins" table in the PR description.

$this->grammar = $grammar;
$this->token_count = count( $tokens );
// Append an end-of-input sentinel token whose id is EMPTY_RULE_ID
// (0). The hot path can then read $tokens[$pos]->id unconditionally

Collaborator

pretty cool

// The INTO negative-lookahead only fires for selectStatement. Cache
// the rule id so the per-call check is an int compare instead of a
// string compare.
$this->select_statement_rule_id = $grammar->get_or_cache_rule_id( 'selectStatement' );

Collaborator

Any memory impact of caching all the rules?

Member Author

Negligible. Those are array assignments, not copies — PHP arrays are copy-on-write, so the parser instance just holds references to the grammar's arrays. No actual duplication unless something writes to them, which the parser doesn't.
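
The copy-on-write behaviour can be observed directly. This is a small illustration with a hypothetical function name; the exact byte counts depend on the PHP version, so it returns deltas rather than claiming specific numbers.

```php
<?php
/**
 * Measures memory growth when an array is assigned versus when the copy is written to.
 */
function toy_cow_memory_deltas( int $size = 100000 ): array {
	$grammar_rules = range( 1, $size );

	$before       = memory_get_usage();
	$parser_rules = $grammar_rules; // Shares the same storage; nothing is copied yet.
	$after_assign = memory_get_usage();

	$parser_rules[0] = -1; // First write separates the two arrays and copies the data.
	$after_write     = memory_get_usage();

	return array(
		'assign'          => $after_assign - $before,
		'write'           => $after_write - $after_assign,
		'original_intact' => 1 === $grammar_rules[0],
	);
}
```

The assignment delta should be far smaller than the write delta, which is why caching the grammar's arrays on the parser costs almost nothing as long as the parser only reads them.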

adamziel commented May 3, 2026

Collaborator

I've left some notes. Haven't read deeply into the diff but the idea makes sense – inline some stuff, reorder, cache, add a trailing token. Nothing revolutionary, but it's still a pretty clever way to get more juice out of it.

@JanJakes JanJakes force-pushed the performance branch 2 times, most recently from 232abea to 8c11f76 Compare May 4, 2026 15:52
adamziel commented May 4, 2026

Collaborator

> If the native parser is reworked to not extend WP_Parser_Node in the future, this can be restored.

Let's file an issue and explore that, 7% is huge

@JanJakes JanJakes force-pushed the performance branch 5 times, most recently from 8a7cf51 to 513004e Compare June 4, 2026 19:03
JanJakes added 4 commits June 5, 2026 09:43
Hot-path changes in WP_Parser::parse_recursive():

- Inline the terminal match in the branch loop instead of recursing into
  parse_recursive() for every token. Over the full MySQL test suite this
  eliminates ~1.6M function calls.
- Hoist grammar, rules, fragment_ids, rule_names, tokens, and token_count
  into local variables so the inner loops avoid repeated property lookups
  on $this->grammar.
- Cache the token count on the instance to avoid a count() per call.
- Build branch children in a local array and only instantiate the
  WP_Parser_Node once the branch has matched; on the MySQL corpus ~75% of
  speculative nodes were previously created and thrown away.
- Drop a dead is_array($subnode) check that never fires in practice
  (subnodes are false, true, tokens, or nodes - never arrays).
- Inline fragments: read the fragment's children directly instead
  of building a fragment node and immediately merging it.

End-to-end parser benchmark on the MySQL server test corpus:
  Before: ~11,500 QPS   After: ~14,900 QPS  (+29%)

The grammar now precomputes FIRST and NULLABLE via fixpoint, then indexes
each rule's branches by the tokens that can start them. At parse time the
parser jumps straight to the candidate branches for the current token
instead of iterating every branch and letting most fail.

On the full MySQL test suite, 59% of branch attempts previously failed
because the first token could never match the branch's FIRST set; with
per-branch lookahead those attempts are eliminated.

End-to-end parser benchmark:
  Before: ~14,900 QPS   After: ~22,400 QPS  (+50%)
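
A minimal sketch of indexing one rule's branches by the tokens that can start them. It assumes each branch's FIRST set is already known and is given as a list of token ids; both function names are hypothetical.

```php
<?php
/**
 * Builds token id => list of candidate branch indexes for one rule.
 *
 * @param array $branch_first_sets Branch index => list of token ids in that branch's FIRST set.
 */
function toy_build_branch_selector( array $branch_first_sets ): array {
	$selector = array();
	foreach ( $branch_first_sets as $branch_index => $first_set ) {
		foreach ( $first_set as $token_id ) {
			$selector[ $token_id ][] = $branch_index;
		}
	}
	return $selector;
}

/** Returns only the branches that can start with the current token. */
function toy_candidate_branches( array $selector, int $token_id ): array {
	return $selector[ $token_id ] ?? array();
}
```

At parse time a single array lookup replaces a loop over every branch, and a token with no entry means no branch can match without trying any.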
Two grammar/parser refinements that both reduce recursive calls:

* In parse_recursive(): when the rule has a per-token branch selector but
  the current token is not in any branch's FIRST and the rule itself is
  nullable, return 'matched empty' immediately instead of descending into
  nullable branches that would recursively do the same thing. This alone
  eliminates ~460k recursive calls on the MySQL corpus.

* At grammar build time, expand every single-branch fragment rule into
  its call sites. Fragments exist only to factor shared sub-sequences and
  their children are already flattened into the parent AST node, so
  splicing them directly into parent branches is a no-op for the
  resulting tree but removes an entire recursive call per use. 480 of the
  grammar's fragments qualify.

Also drops the dead terminal branch at the top of parse_recursive() (the
branch loop inlines terminal matching, so parse_recursive is only ever
called with non-terminal rule ids) and the always-false empty-branches
guard.

End-to-end parser benchmark:
  Before: ~22,400 QPS   After: ~27,500 QPS  (+23%)
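
The build-time fragment expansion can be sketched as follows. This assumes a toy grammar shape (`rule id => list of branches`, each a list of symbol ids) and a set of fragment ids; the function name is hypothetical, and the on-stack check is one possible way to terminate on cyclic fragments, not necessarily the one used here.

```php
<?php
/**
 * Splices single-branch fragment rules into the branches that reference them.
 *
 * @param array $rules        Rule id => list of branches (lists of symbol ids).
 * @param array $fragment_ids Set of fragment rule ids (rule id => true).
 */
function toy_inline_fragments( array $rules, array $fragment_ids ): array {
	$expand = function ( array $branch, array $stack ) use ( &$expand, $rules, $fragment_ids ) {
		$out = array();
		foreach ( $branch as $symbol ) {
			$inlinable = isset( $fragment_ids[ $symbol ] )
				&& isset( $rules[ $symbol ] )
				&& 1 === count( $rules[ $symbol ] )
				&& ! isset( $stack[ $symbol ] ); // Already being expanded: keep the reference.
			if ( ! $inlinable ) {
				$out[] = $symbol;
				continue;
			}
			$nested = $expand( $rules[ $symbol ][0], $stack + array( $symbol => true ) );
			foreach ( $nested as $nested_symbol ) {
				$out[] = $nested_symbol;
			}
		}
		return $out;
	};

	$result = array();
	foreach ( $rules as $rule_id => $branches ) {
		$result[ $rule_id ] = array();
		foreach ( $branches as $index => $branch ) {
			$result[ $rule_id ][ $index ] = $expand( $branch, array( $rule_id => true ) );
		}
	}
	return $result;
}
```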
Three minor reductions in per-call work:

* Strip explicit EMPTY_RULE_ID symbols out of rule branches at grammar
  build time. The parser loop would have 'continue'd over them anyway, so
  removing them ahead of time lets the hot symbol loop drop the epsilon
  check. Pure-epsilon branches become empty branches and still match
  empty via the existing empty-children fast path.

* Cache the grammar's rules, fragment_ids, rule_names, branches_for_token,
  nullable_branches, and highest_terminal_id as direct parser instance
  fields so parse_recursive() no longer pays for a $this->grammar->...
  double hop on every call.

* Collapse the two-step node construction (new + set_children) into a
  single constructor call that takes the children array directly. This
  saves a method call per allocated node (~820k across the MySQL corpus).

End-to-end parser benchmark: ~27,500 QPS -> ~28,500 QPS (+3.5%).
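
Epsilon stripping is simple enough to show in a few lines. This sketch assumes the same toy grammar shape as elsewhere (`rule id => list of branches`) and that the epsilon marker is id 0, as EMPTY_RULE_ID is described in this PR; the function name is hypothetical.

```php
<?php
/**
 * Removes explicit epsilon symbols from every branch at build time.
 * A pure-epsilon branch becomes an empty branch.
 */
function toy_strip_epsilon( array $rules, int $empty_rule_id = 0 ): array {
	foreach ( $rules as $rule_id => $branches ) {
		foreach ( $branches as $index => $branch ) {
			$kept = array();
			foreach ( $branch as $symbol ) {
				if ( $empty_rule_id !== $symbol ) {
					$kept[] = $symbol;
				}
			}
			$rules[ $rule_id ][ $index ] = $kept;
		}
	}
	return $rules;
}
```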
JanJakes added 10 commits June 5, 2026 10:11
Three review-noted spots that were terse in the code:

- The remaining_tokens() loop guard now spells out why both EOF
  and `null === token_type && bytes_already_read > 0` are needed
  (EOF on clean end-of-input vs invalid byte mid-stream, with
  the `> 0` guard letting the very first iteration through).

- The identifier/keyword fast path now explains `$byte > "\x7F"`
  (UTF-8 multi-byte starter; MySQL identifiers allow U+0080-U+FFFF)
  and `next_byte !== "'"` (only single quotes form the special
  hex/bin/n-char literal starters; `"` never does, regardless of
  SQL mode).

No behavior change.
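
The fast-path condition discussed in the review can be isolated as a predicate. This is a sketch: the function name is hypothetical, and the real lexer may handle other identifier start bytes in other arms of its dispatch.

```php
<?php
/**
 * Whether the identifier/keyword fast path applies to the current byte.
 */
function toy_is_identifier_fast_path( string $byte, ?string $next_byte ): bool {
	return (
		( $byte >= 'a' && $byte <= 'z' )
		|| ( $byte >= 'A' && $byte <= 'Z' )
		// Bytes above 0x7F are part of a UTF-8 multi-byte sequence.
		|| $byte > "\x7F"
	)
	// A letter directly followed by a single quote may start a special
	// literal such as x'AB', b'01' or n'text', so leave it to the slow path.
	&& "'" !== $next_byte;
}
```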
The leading-whitespace skip at the top of read_next_token() was
already unrolled into byte-equality checks for the perf reasons
documented in 916b512. Apply the same unroll to the third-byte
whitespace check that gates a '--' as a line-comment start, so the
hot dispatch chain doesn't fall back into strpos() on a 5-char mask
for this case. The bound check is folded into '?? null' on the
third-byte read, matching the rest of the lookahead style.

The end-of-input sentinel that the parser hot path relies on must be
appended whenever the token stream is (re)assigned, not only at
construction time. Trunk's WP_MySQL_Parser::reset_tokens() didn't know
about it, so reusing a parser across queries left the parser walking
off the end of the array.

Move the sentinel append, $token_count compute, and $position reset
into a single protected set_tokens() helper on WP_Parser. The
constructor and the WP_MySQL_Parser::reset_tokens() override both call
it, so the invariant has one source of truth.
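
A compact sketch of the single-helper invariant. The real code splits this across WP_Parser and WP_MySQL_Parser; here it is collapsed into one hypothetical class with a hypothetical token type, and `peek_id()` / `advance()` exist only for the demonstration.

```php
<?php
final class Toy_Token {
	public $id;
	public function __construct( int $id ) {
		$this->id = $id;
	}
}

class Toy_Parser {
	const EMPTY_RULE_ID = 0;

	protected $tokens;
	protected $token_count;
	protected $position;

	public function __construct( array $tokens ) {
		$this->set_tokens( $tokens );
	}

	/** Reuse the parser for another query. */
	public function reset_tokens( array $tokens ) {
		$this->set_tokens( $tokens );
	}

	/** The only place that assigns tokens, so the sentinel is always present. */
	protected function set_tokens( array $tokens ) {
		$this->token_count = count( $tokens );
		$tokens[]          = new Toy_Token( self::EMPTY_RULE_ID );
		$this->tokens      = $tokens;
		$this->position    = 0;
	}

	/** Safe without a bounds check: the sentinel sits at index token_count. */
	public function peek_id(): int {
		return $this->tokens[ $this->position ]->id;
	}

	public function advance() {
		if ( $this->position < $this->token_count ) {
			$this->position++;
		}
	}
}
```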
The pure-PHP parser was rewritten to use the precise per-token
branches_for_token + nullable_branches pair (replacing the earlier
coarse lookahead_is_match_possible map). Update the native (Rust)
parser to consume the same two fields directly:

- mysql-rust-bridge.php exports the new fields verbatim and stops
  producing the legacy lookahead view.

- The Rust extension parses branches_for_token's outer key set into
  a per-rule FIRST set (the inner branch sequences are pure-PHP
  parser detail and aren't relevant here) and tracks nullable as a
  separate bool on Rule, replacing the "0 in lookahead" trick. The
  early-bailout check is unchanged in spirit.

No PHP-side compatibility shim survives - the native bridge is now
in lock-step with the grammar's actual fields.

Trunk's WP_MySQL_Native_Parser_Node was a lazy-materialization wrapper
that extended WP_Parser_Node and overrode 18 read methods to delegate
into the Rust-owned arena until first mutation. The performance branch
needs WP_Parser_Node to be 'final' for opcache/JIT specialization, and
PHP forbids extending a final class.

Switch the native parser to eager materialization:

- The Rust extension constructs plain WP_Parser_Node instances at
  parse() time, recursing through the arena to build a complete
  children array up front. Done in the previous commit by updating
  the Rust create_php_node_with_classes() to write the rule_id,
  rule_name, and children properties directly.

- Drop the wp_sqlite_mysql_native_ast_* lazy-access exports and the
  arena-keyed wrapper registry from the Rust extension - the eager
  tree no longer needs them.

- Remove the WP_MySQL_Native_Parser_Node class and the two PHPUnit
  test files that exercised the wrapper-identity / cycle-collection
  invariants of the lazy implementation. Stable child identity now
  follows from PHP's normal object semantics on the eagerly built
  array. The verifier script gets the same instanceof relaxation
  (WP_Parser_Node, not the removed subclass).

WP_Parser_Node stays 'final', the native and pure-PHP parsers
produce indistinguishable ASTs, and 'instanceof WP_Parser_Node'
checks throughout the codebase keep working without changes.

Nothing extends WP_Parser_Node. Marking it final lets PHP's opcache
and tracing JIT specialize property access and method dispatch since
the class layout is now fixed. Small but consistent improvement
measured across multiple runs under tracing JIT (~+2% avg, ~+2% best).

End-to-end parser benchmark:
  tracing JIT: ~57K -> ~57-58K QPS avg, 60-61K QPS best
  no JIT:      ~33K -> ~34K QPS avg, 35K QPS best

Note that WP_MySQL_Token intentionally bypasses parent::__construct()
for the hot path and must keep its field assignments in sync with
WP_Parser_Token, and that remaining_tokens() deliberately inlines the
next_token() tokenizer step and must stay in sync with it.

Cover epsilon stripping, single-branch fragment inlining (including
cyclic-fragment termination), per-token branch selectors with FIRST/
NULLABLE propagation, single-candidate classification, and the
merge_sorted helper. Add an invariant check over the real MySQL grammar
that no branch retains an epsilon marker and that every single-candidate
rule maps each token to exactly one branch sequence.

Re-measure the documented lexer/parser benchmarks on this branch (PHP 8.5.5,
current extension build) and replace the stale trunk/PHP-8.4.5 figures.

The parser native row drops from 108,354 QPS (15.45x) to 58,111 QPS (2.00x):
trunk's native parser returned a lazy wrapper, so the parse-only benchmark
never built the tree. This branch materializes the full WP_Parser_Node tree
eagerly, so the number now reflects producing a complete AST. The lexer pure-PHP
row rises (71,553 -> 178,409 QPS) thanks to the lexer optimizations on this branch,
narrowing the native lexer speedup to 2.00x.

Note the default-CLI (no JIT) methodology and that under opcache + tracing JIT
the native edge narrows further (lexer ~1.08x, parser ~1.13x).

The PHP bridge now exports the parser grammar as per-token branch selectors
(`branches_for_token` / `nullable_branches`) instead of the previous coarse
`lookahead_is_match_possible` table - a backward-incompatible change to the ABI
shared between the extension binary and the PHP driver.

Until now load.php selected the native lexer/parser purely on class existence,
so an extension built against a different grammar ABI - most commonly a plugin
update that outpaces the installed binary - would be selected and then fatal
during native parser construction, with no fallback.

Track grammar-ABI compatibility by the extension's minor version (the 0.x line)
and bump it to 0.2.0 for this change. Gate native selection on
`phpversion( 'wp_mysql_parser' )` falling within the supported line (0.2.x); the
native lexer and parser are a matched pair (the native lexer emits a token stream
only the native parser can consume), so select both or neither. An unsupported or
absent version falls back cleanly to pure PHP, erring on the safe side for unknown
binaries.

Document the versioning contract in the extension README and add a unit test
covering the gate's boundaries.
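
The gate described above might look roughly like this. It is a sketch under assumptions: the function name and the exact version-string format are hypothetical; only the extension name `wp_mysql_parser` and the supported 0.2.x line come from this commit message.

```php
<?php
/**
 * Whether a reported extension version falls within the supported minor line.
 *
 * @param string|false $version Result of phpversion( 'wp_mysql_parser' ); false when not loaded.
 */
function toy_is_supported_native_version( $version, string $supported_line = '0.2' ): bool {
	if ( ! is_string( $version ) ) {
		return false; // Extension absent: use pure PHP.
	}
	if ( 1 !== preg_match( '/^(\d+)\.(\d+)\.\d+/', $version, $matches ) ) {
		return false; // Unknown format: err on the safe side.
	}
	return $supported_line === $matches[1] . '.' . $matches[2];
}

// The native lexer and parser are a matched pair: select both or neither.
$use_native = toy_is_supported_native_version( phpversion( 'wp_mysql_parser' ) );
```

Comparing the parsed major and minor parts, rather than a string prefix, keeps a version like 0.20.0 from being mistaken for the 0.2 line.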
@JanJakes JanJakes force-pushed the performance branch 4 times, most recently from 77e16a4 to 77b0f89 Compare June 5, 2026 13:32
JanJakes added 2 commits June 5, 2026 17:05
…tors

The grammar is rebuilt on every request (PHP's shared-nothing model resets the
static cache between requests), and that build dominated the lex+parse pipeline.
Cut it from ~40 ms to ~6.6 ms for a typical request, with parsing unchanged:

- Replace the naive iterate-to-fixpoint FIRST/NULLABLE computation with a
  worklist that recomputes a rule only when a rule it references grows, plus a
  C-level array union. ~40 ms -> ~18 ms; the grammar output is byte-identical.

- Denormalize the per-token branch selectors lazily, per rule, on first descent
  (ensure_rule_selector) instead of eagerly for all ~1,900 rules. A typical
  request touches ~7% of rules, so the build drops to ~6.6 ms. The parser
  materializes a rule's selector on a lookup miss, keeping the common hit path a
  single array access (warm parse throughput within ~1% of before).

- branches_for_token / single_candidate_rules are now lazily populated;
  build_all_selectors() forces a full build for consumers that read the table
  directly (the grammar tests).

- Export the eager per-rule FIRST sets to the native parser instead of the
  lazily-built per-token table. The native parser only needs FIRST sets (it
  builds its own candidates from rules), so it skips the PHP denormalization
  entirely and no longer depends on a forced full build.

- Reuse one parser across the parser benchmark corpus (resetting tokens),
  mirroring the driver, and refresh the published native-extension numbers.
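
The lazy per-rule denormalization can be sketched like this. The method and property names `ensure_rule_selector`, `build_all_selectors` and `branches_for_token` come from this commit message; the class, its constructor input, and the `selectors_built` counter are hypothetical.

```php
<?php
final class Toy_Lazy_Grammar {
	/** @var array Rule id => branch index => list of token ids that can start the branch. */
	private $branch_first_sets;

	/** @var array Rule id => token id => candidate branch indexes, filled on demand. */
	public $branches_for_token = array();

	/** @var int How many selectors have been built (for illustration only). */
	public $selectors_built = 0;

	public function __construct( array $branch_first_sets ) {
		$this->branch_first_sets = $branch_first_sets;
	}

	/** Builds the selector for one rule on first use and caches it. */
	public function ensure_rule_selector( int $rule_id ): array {
		if ( isset( $this->branches_for_token[ $rule_id ] ) ) {
			return $this->branches_for_token[ $rule_id ];
		}
		$selector = array();
		foreach ( ( $this->branch_first_sets[ $rule_id ] ?? array() ) as $branch_index => $token_ids ) {
			foreach ( $token_ids as $token_id ) {
				$selector[ $token_id ][] = $branch_index;
			}
		}
		$this->selectors_built++;
		$this->branches_for_token[ $rule_id ] = $selector;
		return $selector;
	}

	/** Forces a full build for consumers that read the table directly. */
	public function build_all_selectors() {
		foreach ( array_keys( $this->branch_first_sets ) as $rule_id ) {
			$this->ensure_rule_selector( $rule_id );
		}
	}
}
```

A parser would read `$grammar->branches_for_token[ $rule_id ] ?? $grammar->ensure_rule_selector( $rule_id )`, so the common hit path stays a single array access.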
Merge the "PHPUnit Tests" (pure-PHP) and "MySQL Parser Extension Tests"
workflows into a single "PHPUnit Tests" matrix that runs the mysql-on-sqlite
suite with and without the native Rust parser extension: pure on PHP 7.2-8.5,
plus the extension on PHP 8.0+ (its minimum). Job names read
"PHP 8.2 / SQLite 3.45.1" and "PHP 8.2 + ext-wp-mysql-parser / SQLite 3.45.1".
This drops the redundant pure-on-extension jobs (the old extension workflow
re-ran the plain suite on 7.2-7.4, duplicating "PHPUnit Tests") and removes the
reusable phpunit-tests-run.yml. The native jobs build the extension in release
mode (cargo build --release) so the suite exercises it at realistic speed
rather than the slow debug build.

All setup-php steps now pass `coverage: none`. setup-php enables Xdebug by
default, and the old pure-suite path left it on, instrumenting every call and
running the suite ~4x slower (PHP 7.3: ~59s -> ~14s) while no coverage report
was ever produced or consumed. Also set `coverage: none` on the MySQL Proxy
and release-publish PHP setups.

The merged workflow is path-filtered to the parser/driver/extension packages
(plus root composer) like the extension workflow was, and triggers on push to
trunk (the old phpunit-tests trigger referenced a non-existent "main" branch).

The native matrix jobs compile the extension with `cargo build --release`,
which rebuilds the whole dependency tree from scratch each run. Add
Swatinem/rust-cache for the parser-extension workspace so the cargo registry
and target dir are cached across runs, cutting the release-compile time on warm
runs without affecting the (now realistic) test-step timings.
@JanJakes JanJakes marked this pull request as draft June 6, 2026 12:01
JanJakes added a commit that referenced this pull request Jun 6, 2026
Consolidates the parser/lexer performance experiments explored alongside the shipped optimizations (PR #378, built on #373/#375/#376). Adds the parse-only harness used to re-measure them and a top-level index.
JanJakes added a commit that referenced this pull request Jun 6, 2026
Consolidates the parser/lexer performance experiments explored alongside the shipped optimizations (PR #378, built on #373/#375/#376). One directory and commit per approach; each has code and/or a NOTES.md with idea, method, result, verdict.
JanJakes added a commit that referenced this pull request Jun 6, 2026
The one native path actually shipped (PR #381/#423/#378). ~1.33x over optimized PHP under JIT (the original ~10x conflated lazy-AST + no-JIT). Code lives on those branches; documented here.
JanJakes added a commit that referenced this pull request Jun 6, 2026
Consolidates the parser/lexer performance experiments explored alongside the shipped optimizations (PR #378, built on #373/#375/#376). One directory and commit per approach, grouped by technique family; each has code and/or a NOTES.md with idea, method, result, verdict.