Use native-backed AST nodes in the Rust parser #423

Draft: adamziel wants to merge 1 commit into performance from native-direct-ast-optimization

adamziel (Collaborator) commented Jun 5, 2026

What changed

  • Return native-backed WP_MySQL_Native_Parser_Node wrappers from the Rust parser instead of eagerly materializing the whole AST into PHP objects.
  • Delegate AST read methods to Rust-owned arena accessors, while preserving stable PHP node identity and materializing a node on first mutation.
  • Speed up native lexing by replacing linear keyword/version/synonym scans with lazy static lookup maps and avoiding unnecessary identifier copies/uppercase allocation.
  • Use a hybrid FIRST-set representation in the native parser.
  • Replace the native AST wrapper cache HashMap with a dense Vec<Option<usize>> keyed by arena node index.
  • Add --consume=none|descendants to run-parser-benchmark.php so lazy AST parser-only numbers can be compared with a forced full-tree traversal.
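
As an illustration of the lexer change, here is a minimal sketch of a lazily built keyword map with an allocation-free, case-insensitive lookup. The `Token` enum, `MAX_KEYWORD_LEN`, and the function names are hypothetical and chosen for the example; the actual lexer code in this branch may be structured differently.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

/// Hypothetical token ids; the real lexer's token set is much larger.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Token {
    Select,
    From,
    Where,
}

/// Longest keyword in this toy table; longer inputs cannot be keywords.
const MAX_KEYWORD_LEN: usize = 6;

/// Built once on first use, then shared by every later lookup.
fn keyword_map() -> &'static HashMap<&'static [u8], Token> {
    static MAP: OnceLock<HashMap<&'static [u8], Token>> = OnceLock::new();
    MAP.get_or_init(|| {
        let mut m = HashMap::new();
        m.insert("SELECT".as_bytes(), Token::Select);
        m.insert("FROM".as_bytes(), Token::From);
        m.insert("WHERE".as_bytes(), Token::Where);
        m
    })
}

/// Case-insensitive keyword lookup that uppercases into a stack buffer
/// instead of allocating a new String for every identifier.
pub fn lookup_keyword(ident: &str) -> Option<Token> {
    let bytes = ident.as_bytes();
    if bytes.len() > MAX_KEYWORD_LEN {
        return None; // too long to be any keyword, skip hashing entirely
    }
    let mut buf = [0u8; MAX_KEYWORD_LEN];
    let upper = &mut buf[..bytes.len()];
    upper.copy_from_slice(bytes);
    upper.make_ascii_uppercase();
    keyword_map().get(&upper[..]).copied()
}
```

With this shape, `lookup_keyword("select")` and `lookup_keyword("SELECT")` resolve to the same token through a single hash probe, and an identifier such as `users` falls through to `None` without a linear scan.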

Benchmarks

Measured locally on PHP 8.4.5 with 69,577 MySQL server-suite queries, 3 repetitions each.

| benchmark | pure PHP avg | native avg | speedup |
| --- | --- | --- | --- |
| lexer-only, no JIT | 107,823 QPS | 1,139,760 QPS | 10.6× |
| lexer-only, JIT | 222,725 QPS | 1,177,436 QPS | 5.3× |
| parser-only, no JIT (--consume=none) | 19,348 QPS | 172,980 QPS | 8.9× |
| parser-only, JIT (--consume=none) | 41,706 QPS | 181,311 QPS | 4.3× |
| full AST walk, no JIT (--consume=descendants) | 13,367 QPS | 55,749 QPS | 4.2× |
| full AST walk, JIT (--consume=descendants) | 21,512 QPS | 58,004 QPS | 2.7× |

--consume=descendants is a better benchmark than parser-only for lazy AST behavior because it forces PHP to touch all 4,786,376 descendants. It is still somewhat pessimistic for the SQLite translator because the translator usually reads selected subtrees rather than calling get_descendants() on the root.
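
To make the difference between the two modes concrete, here is a rough sketch of what each one touches, modeled in Rust rather than in the PHP benchmark script. The `Arena` layout, the `ConsumeMode` enum, and `consume` are assumptions made for the example and do not mirror the real `run-parser-benchmark.php` internals.

```rust
/// Toy arena: `children[i]` lists the arena indices of node `i`'s children.
pub struct Arena {
    pub children: Vec<Vec<usize>>,
}

/// Hypothetical mirror of the `--consume=none|descendants` flag.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConsumeMode {
    /// Parse only; never touch the tree.
    NoTraversal,
    /// Visit every descendant of the root.
    Descendants,
}

impl ConsumeMode {
    /// Parses a command-line argument such as `--consume=descendants`.
    pub fn from_flag(arg: &str) -> Option<ConsumeMode> {
        match arg.strip_prefix("--consume=")? {
            "none" => Some(ConsumeMode::NoTraversal),
            "descendants" => Some(ConsumeMode::Descendants),
            _ => None,
        }
    }
}

/// Returns how many nodes below `root` were touched in the given mode.
pub fn consume(arena: &Arena, root: usize, mode: ConsumeMode) -> usize {
    match mode {
        ConsumeMode::NoTraversal => 0,
        ConsumeMode::Descendants => {
            let mut touched = 0;
            // Iterative depth-first walk, so deep trees do not recurse.
            let mut stack: Vec<usize> = arena.children[root].clone();
            while let Some(node) = stack.pop() {
                touched += 1;
                stack.extend(arena.children[node].iter().copied());
            }
            touched
        }
    }
}
```

In this model a lazy AST costs nothing beyond the parse under `NoTraversal`, while `Descendants` pays a per-node cost for the whole tree, which is the behavior the full AST walk rows are meant to expose.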

Profiling notes

  • After the keyword lookup changes, keyword scanning disappears from the parser-only sample; the remaining hot path is recursive parser control flow plus allocator churn.
  • In the full AST walk sample, time shifts to PHP boundary costs: object creation, zend_update_property, property/hash lookups, zval arrays, and object destruction.
  • The dense wrapper cache specifically targets the visible hashing/re-hashing cost during full AST walks.
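
The dense cache idea can be sketched as follows. This is a simplified model, assuming the stored `usize` is an opaque handle to an already created PHP wrapper; the type and method names (`DenseWrapperCache`, `get_or_insert_with`) are hypothetical and not taken from the branch.

```rust
/// Wrapper cache keyed directly by arena node index.
/// `slots[i]` holds the wrapper handle for arena node `i`, if one exists.
pub struct DenseWrapperCache {
    slots: Vec<Option<usize>>,
}

impl DenseWrapperCache {
    /// Pre-sizes the cache so lookups for known nodes never reallocate.
    pub fn with_node_count(node_count: usize) -> Self {
        Self { slots: vec![None; node_count] }
    }

    /// A lookup is a bounds check plus one indexed load, with no hashing.
    pub fn get(&self, node_index: usize) -> Option<usize> {
        self.slots.get(node_index).copied().flatten()
    }

    /// Stores a handle, growing the vector if the index is out of range.
    pub fn insert(&mut self, node_index: usize, handle: usize) {
        if node_index >= self.slots.len() {
            self.slots.resize(node_index + 1, None);
        }
        self.slots[node_index] = Some(handle);
    }

    /// Returns the cached handle, creating it on first access only.
    /// Reusing the handle is what keeps wrapper identity stable.
    pub fn get_or_insert_with(
        &mut self,
        node_index: usize,
        make: impl FnOnce() -> usize,
    ) -> usize {
        if let Some(handle) = self.get(node_index) {
            return handle;
        }
        let handle = make();
        self.insert(node_index, handle);
        handle
    }
}
```

Because arena indices are small and contiguous, the vector trades a little memory for unvisited nodes against removing hash computation and rehash-on-growth from the walk.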

Verification

  • cargo fmt --check
  • release build with the local PHP config
  • php -l tests/tools/run-parser-benchmark.php
  • php -l src/mysql/native/class-wp-mysql-native-parser-node.php
  • php -d extension="$EXT" tests/tools/verify-native-parser-extension.php
  • php -d extension="$EXT" vendor/bin/phpunit tests/mysql/native/WP_MySQL_Parser_Instanceof_Tests.php
  • php -d extension="$EXT" vendor/bin/phpunit tests/mysql/WP_MySQL_Server_Suite_Parser_Tests.php
  • php -d extension="$EXT" vendor/bin/phpunit tests/WP_SQLite_Driver_Translation_Tests.php

Caveat

This makes WP_Parser_Node non-final so the native facade can subclass it. That keeps the pure-PHP parser methods free of native checks, but may slightly reduce pure-PHP/JIT parser performance. A final-preserving native-handle design is possible, but would be more invasive.

JanJakes added a commit that referenced this pull request Jun 6, 2026
The one native path actually shipped (PR #381/#423/#378). ~1.33x over optimized PHP under JIT (the original ~10x conflated lazy-AST + no-JIT). Code lives on those branches; documented here.