Add native parser scalar APIs, compiled matcher, and benchmarks #427
Draft
adamziel wants to merge 16 commits into
What it does
Adds native parser and SQLite execution paths that make the cost of PHP materialization explicit.
The PR now covers three API shapes:
Example packed execution API:
The packed row encoding is:
- `uint32` little-endian byte length + raw bytes
- `NULL`: length `0xffffffff`

Rationale
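To make the packed cell encoding listed under "What it does" concrete, here is a minimal Rust sketch of it. It is not the PR's code: the names `encode_cell`, `decode_cells`, and `NULL_MARKER` are hypothetical, and since the description does not say how rows and columns are framed, the sketch treats the buffer as a flat sequence of cells. It also assumes a real cell never reaches `0xffffffff` bytes, so a length cannot collide with the NULL marker.

```rust
use std::convert::{TryFrom, TryInto};

/// Length value that stands for SQL NULL in the packed encoding.
const NULL_MARKER: u32 = 0xffff_ffff;

/// Append one cell: uint32 little-endian byte length, then the raw bytes.
/// `None` is written as the NULL marker with no payload.
/// (Hypothetical helper, not taken from the PR.)
fn encode_cell(buf: &mut Vec<u8>, cell: Option<&[u8]>) {
    match cell {
        Some(bytes) => {
            // Assumption: cells are always shorter than 0xffffffff bytes.
            let len = u32::try_from(bytes.len()).expect("cell too large");
            assert!(len != NULL_MARKER, "length collides with NULL marker");
            buf.extend_from_slice(&len.to_le_bytes());
            buf.extend_from_slice(bytes);
        }
        None => buf.extend_from_slice(&NULL_MARKER.to_le_bytes()),
    }
}

/// Decode every cell in `buf`. Returns `None` if the buffer is truncated.
/// (Hypothetical helper, not taken from the PR.)
fn decode_cells(buf: &[u8]) -> Option<Vec<Option<Vec<u8>>>> {
    let mut cells = Vec::new();
    let mut pos = 0usize;
    while pos < buf.len() {
        // Read the 4-byte little-endian length header.
        let header: [u8; 4] = buf.get(pos..pos + 4)?.try_into().ok()?;
        pos += 4;
        let len = u32::from_le_bytes(header);
        if len == NULL_MARKER {
            cells.push(None);
            continue;
        }
        // Copy out exactly `len` payload bytes, failing on short input.
        let end = pos.checked_add(len as usize)?;
        cells.push(Some(buf.get(pos..end)?.to_vec()));
        pos = end;
    }
    Some(cells)
}
```

Because each cell is length-prefixed, a consumer can walk the buffer with one pass and no per-cell object on the producing side, and an empty string (length `0`) stays distinct from `NULL`.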
The first native parser benchmark was misleading because it did not force AST nodes to be materialized. This PR separates the real questions:
Current parser benchmark medians over the MySQL server-suite corpus, after excluding 9 known parser gaps shared by PHP and native:
Current packed SQLite benchmark, 50,000 timed iterations over a 1,000-row file DB, with `SQL_CALC_FOUND_ROWS` and result rows returned as one packed binary string:
Best-case packed SQLite benchmark, plain `SELECT` without the extra `SQL_CALC_FOUND_ROWS` count query:
The remaining native cost in the packed result path is mostly SQLite itself: `sqlite3_step`, per-cell `sqlite3_column_*` access, and copying/checksumming bytes into the packed buffer. There is no per-row PHP object materialization on this path.

Implementation
Adds native parser APIs for consuming AST data without building PHP objects:
Adds a generated `compiled_packed_id_parser.rs` fast path for the packed `kind`/`id` stats benchmark. It is generated from the checked-in MySQL grammar.
Adds `WP_SQLite_Native_Connection`, `WP_SQLite_Native_Statement`, and `WP_SQLite_Native_Packed_Result` backed by `rusqlite` for experimental direct SQLite execution.
Supported native SQLite cases are intentionally narrow:
- `SQL_CALC_FOUND_ROWS` by issuing the result query plus a native count query
- `SELECT FOUND_ROWS()` and session SQL mode helpers where needed
The packed result path now avoids two previously visible costs:
- `itoa::Buffer` instead of allocating a Rust `String`
- `takePackedRows()` moves the packed buffer into PHP for one-shot consumers instead of cloning it first

Testing instructions
Build the extension:
Run focused checks:
Run the full test suite with and without the extension:
Re-run the packed SQLite benchmark: