Add native parser scalar APIs, compiled matcher, and benchmarks #427

Draft: adamziel wants to merge 16 commits into native-direct-ast-optimization from native-ast-handle-api

@adamziel (Collaborator) commented Jun 7, 2026

What it does

Adds native parser and SQLite execution paths that make the cost of PHP materialization explicit.

The PR now covers three API shapes:

  1. Existing public shape: parse SQL and materialize PHP-visible AST descendants.
  2. Native scalar/parser shape: parse and consume the AST in Rust, returning compact scalar stats or packed rows instead of PHP objects.
  3. Packed SQLite result shape: execute supported SQLite hot paths in Rust and return one binary result buffer instead of one PHP array per row.

Example packed execution API:

$native = new WP_SQLite_Native_Connection( $db_path );

$result = $native->queryMysqlPackedRows(
    "SELECT SQL_CALC_FOUND_ROWS ID, post_title
     FROM wp_posts
     WHERE post_status = 'publish'
     ORDER BY ID
     LIMIT 100"
);

$found_rows  = $result->foundRows();
$row_count   = $result->rowCount();
$columns     = $result->columns();
$packed_rows = $result->takePackedRows(); // one PHP binary string

The packed row encoding is:

  • each cell: uint32 little-endian byte length + raw bytes
  • NULL: length 0xffffffff
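
To make the layout concrete, here is a minimal Rust sketch of an encoder and decoder for the cell format described above. It is an illustration, not the extension's code: `push_cell` and `decode_rows` are hypothetical names, and the sketch assumes that cells are concatenated in row-major order with no row delimiter, so the consumer splits rows using the column count it already knows (for example from `columns()`).

```rust
use std::convert::{TryFrom, TryInto};

/// Length marker reserved for NULL cells, as stated in the encoding above.
const NULL_LEN: u32 = 0xffff_ffff;

/// Append one cell: uint32 little-endian byte length, then the raw bytes.
/// `None` encodes SQL NULL as the reserved length with no payload.
fn push_cell(buf: &mut Vec<u8>, cell: Option<&[u8]>) {
    match cell {
        Some(bytes) => {
            let len = u32::try_from(bytes.len()).expect("cell exceeds u32 length");
            // A real payload must never collide with the NULL marker.
            assert!(len != NULL_LEN, "cell length collides with NULL marker");
            buf.extend_from_slice(&len.to_le_bytes());
            buf.extend_from_slice(bytes);
        }
        None => buf.extend_from_slice(&NULL_LEN.to_le_bytes()),
    }
}

/// Decode a packed buffer into rows of `column_count` cells each.
/// Assumption (not stated in the PR): cells are stored row-major without
/// row delimiters. Returns `None` for truncated or ragged input.
fn decode_rows(packed: &[u8], column_count: usize) -> Option<Vec<Vec<Option<Vec<u8>>>>> {
    if column_count == 0 {
        return None;
    }
    let mut rows = Vec::new();
    let mut row = Vec::with_capacity(column_count);
    let mut pos = 0usize;
    while pos < packed.len() {
        // Read the 4-byte little-endian length header.
        let header_end = pos.checked_add(4)?;
        let header: [u8; 4] = packed.get(pos..header_end)?.try_into().ok()?;
        pos = header_end;
        let len = u32::from_le_bytes(header);
        if len == NULL_LEN {
            row.push(None);
        } else {
            let end = pos.checked_add(len as usize)?;
            row.push(Some(packed.get(pos..end)?.to_vec()));
            pos = end;
        }
        if row.len() == column_count {
            rows.push(std::mem::replace(&mut row, Vec::with_capacity(column_count)));
        }
    }
    // Leftover cells mean the buffer did not hold whole rows.
    if row.is_empty() { Some(rows) } else { None }
}
```

Note that an empty string (length 0) and NULL (length 0xffffffff) stay distinguishable, which is why NULL needs a reserved length rather than a zero.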

Rationale

The first native parser benchmark was misleading because it did not force AST nodes to be materialized. This PR separates the cases that were previously conflated:

  • If PHP needs normal AST objects, native is faster but not dramatically faster.
  • If Rust can parse and consume the AST before crossing back to PHP, native is much faster.
  • If Rust executes SQLite but still returns normal PHP row arrays, the win mostly disappears because PDO SQLite is already native C and PHP still pays row materialization costs.
  • If Rust executes SQLite and returns one packed buffer, native pulls ahead as result size grows.

Current parser benchmark medians over the MySQL server-suite corpus, after excluding 9 known parser gaps shared by PHP and native:

| Workload | PHP JIT | Native JIT | Speedup (JIT) | PHP no-JIT | Native no-JIT | Speedup (no-JIT) |
|---|---|---|---|---|---|---|
| Materialized descendants, public parser API | ~21.2k QPS | ~59.6k QPS | ~2.8x | ~11.7k QPS | ~51.1k QPS | ~4.4x |
| Batch raw-SQL packed-id stats | ~35.5k QPS | ~1.03M QPS | ~29x | ~11.0k QPS | ~1.03M QPS | ~93x |
| Batch raw-SQL packed rows + token-byte checksum | ~37.0k QPS | ~416k QPS | ~11x | ~14.6k QPS | ~431k QPS | ~29x |

Current packed SQLite benchmark, 50,000 timed iterations over a 1,000-row file DB, with SQL_CALC_FOUND_ROWS and result rows returned as one packed binary string:

| LIMIT | PHP/PDO + PHP packing, JIT | Native packed, JIT | Speedup (JIT) | PHP/PDO + PHP packing, no-JIT | Native packed, no-JIT | Speedup (no-JIT) |
|---|---|---|---|---|---|---|
| 10 | 12.8k/s | 15.0k/s | 1.17x | 13.1k/s | 15.6k/s | 1.19x |
| 100 | 10.2k/s | 13.2k/s | 1.30x | 8.0k/s | 13.4k/s | 1.68x |
| 500 | 4.1k/s | 8.7k/s | 2.12x | 2.3k/s | 8.0k/s | 3.39x |

Best-case packed SQLite benchmark, plain SELECT without the extra SQL_CALC_FOUND_ROWS count query:

| Scenario | PHP JIT | Native JIT | Speedup (JIT) | PHP no-JIT | Native no-JIT | Speedup (no-JIT) |
|---|---|---|---|---|---|---|
| 500 rows × 2 cols | 5.9k/s | 16.6k/s | 2.83x | 2.5k/s | 16.9k/s | 6.65x |
| 2000 rows × 2 cols | 1.2k/s | 4.6k/s | 3.92x | 0.7k/s | 4.6k/s | 6.41x |
| 2000 rows × 8 cols | 0.5k/s | 2.2k/s | 4.45x | 0.2k/s | 2.2k/s | 9.65x |

The remaining native cost in the packed result path is mostly SQLite itself: sqlite3_step, per-cell sqlite3_column_* access, and copying/checksumming bytes into the packed buffer. There is no per-row PHP object materialization on this path.

Implementation

Adds native parser APIs for consuming AST data without building PHP objects:

$parser = new WP_MySQL_Parser( $grammar, array() );

list( $processed, $failures, $descendants, $checksum ) =
    $parser->parse_sql_batch_native_descendant_packed_id_stats( $queries );

Adds a generated compiled_packed_id_parser.rs fast path for the packed kind/id stats benchmark. It is generated from the checked-in MySQL grammar.

Adds WP_SQLite_Native_Connection, WP_SQLite_Native_Statement, and WP_SQLite_Native_Packed_Result backed by rusqlite for experimental direct SQLite execution.

Supported native SQLite cases are intentionally narrow:

  • common WordPress and WooCommerce read-query passthroughs
  • simple update passthroughs
  • SQL_CALC_FOUND_ROWS by issuing the result query plus a native count query
  • SELECT FOUND_ROWS() and session SQL mode helpers where needed

The packed result path now avoids two previously visible costs:

  • integer cells use stack-based itoa::Buffer instead of allocating a Rust String
  • takePackedRows() moves the packed buffer into PHP for one-shot consumers instead of cloning it first
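
The two optimizations above can be illustrated with a std-only Rust sketch. It is not the extension's code: the PR uses the `itoa` crate, whereas `format_i64` below is a hand-written stand-in showing the same idea (decimal digits written into a fixed stack buffer, no heap `String`). `PackedResult` and `take_packed_rows` are hypothetical names mirroring `takePackedRows()`, and the sketch assumes integer cells are stored as decimal text.

```rust
/// Write the decimal text of `value` into a caller-provided stack buffer
/// and return the used tail. 20 bytes covers i64::MIN (19 digits + sign).
fn format_i64(value: i64, scratch: &mut [u8; 20]) -> &[u8] {
    let mut pos = scratch.len();
    let negative = value < 0;
    // unsigned_abs avoids overflow when negating i64::MIN.
    let mut n = value.unsigned_abs();
    loop {
        pos -= 1;
        scratch[pos] = b'0' + (n % 10) as u8;
        n /= 10;
        if n == 0 {
            break;
        }
    }
    if negative {
        pos -= 1;
        scratch[pos] = b'-';
    }
    &scratch[pos..]
}

/// Append an integer cell (uint32 LE length + decimal bytes) without
/// allocating an intermediate String.
fn push_int_cell(buf: &mut Vec<u8>, value: i64) {
    let mut scratch = [0u8; 20];
    let digits = format_i64(value, &mut scratch);
    buf.extend_from_slice(&(digits.len() as u32).to_le_bytes());
    buf.extend_from_slice(digits);
}

/// Hypothetical result holder used to show move-instead-of-clone.
struct PackedResult {
    packed: Vec<u8>,
}

impl PackedResult {
    fn new(packed: Vec<u8>) -> Self {
        PackedResult { packed }
    }

    /// Hand the buffer to the caller by moving it out, leaving an empty
    /// Vec behind. A second call returns an empty buffer, which is the
    /// trade-off of a one-shot API compared with cloning.
    fn take_packed_rows(&mut self) -> Vec<u8> {
        std::mem::take(&mut self.packed)
    }
}
```

Moving the buffer avoids copying the whole result once more, which matters most for the large-LIMIT cases where the packed path shows its biggest gains.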

Testing instructions

Build the extension (the linker flags, Homebrew LLVM path, and .dylib output below assume macOS):

cd packages/php-ext-wp-mysql-parser
cargo fmt --check
RUSTFLAGS='-C link-arg=-undefined -C link-arg=dynamic_lookup' \
  PHP_CONFIG="$(command -v php-config)" \
  LIBCLANG_PATH=/opt/homebrew/opt/llvm/lib \
  cargo build --release

Run focused checks:

cd ../../
php -l packages/mysql-on-sqlite/tests/tools/run-parser-benchmark.php
php -l packages/mysql-on-sqlite/tests/tools/run-sqlite-execution-benchmark.php

./vendor/bin/phpcs \
  packages/mysql-on-sqlite/tests/tools/run-parser-benchmark.php \
  packages/mysql-on-sqlite/tests/tools/run-sqlite-execution-benchmark.php \
  packages/mysql-on-sqlite/tests/mysql/native/WP_MySQL_Parser_Instanceof_Tests.php

EXT="$PWD/packages/php-ext-wp-mysql-parser/target/release/libwp_mysql_parser.dylib"
php -d extension="$EXT" \
  packages/mysql-on-sqlite/vendor/bin/phpunit \
  -c packages/mysql-on-sqlite/phpunit.xml.dist \
  --filter WP_MySQL_Parser_Instanceof_Tests

Run the full test suite with and without the extension:

EXT="$PWD/packages/php-ext-wp-mysql-parser/target/release/libwp_mysql_parser.dylib"
php -d extension="$EXT" packages/mysql-on-sqlite/vendor/bin/phpunit -c packages/mysql-on-sqlite/phpunit.xml.dist
php packages/mysql-on-sqlite/vendor/bin/phpunit -c packages/mysql-on-sqlite/phpunit.xml.dist

Re-run the packed SQLite benchmark:

EXT="$PWD/packages/php-ext-wp-mysql-parser/target/release/libwp_mysql_parser.dylib"
php -d extension="$EXT" \
  -d opcache.enable_cli=1 \
  -d opcache.jit=tracing \
  -d opcache.jit_buffer_size=128M \
  packages/mysql-on-sqlite/tests/tools/run-sqlite-execution-benchmark.php \
  --json --iterations=50000 --warmup=5000 --rows=1000 \
  --select-limit=500 --workload=packed-select-found-rows

@adamziel changed the title from "Add native AST scalar row benchmarks" to "Add native scalar row parser benchmarks" Jun 7, 2026
@adamziel changed the title from "Add native scalar row parser benchmarks" to "Add native parser scalar and batch stats APIs" Jun 7, 2026
@adamziel changed the title from "Add native parser scalar and batch stats APIs" to "Add native parser scalar, stats, and compiled matcher APIs" Jun 7, 2026
@adamziel changed the title from "Add native parser scalar, stats, and compiled matcher APIs" to "Add native parser scalar APIs, compiled matcher, and benchmarks" Jun 8, 2026