fix: run ANALYZE at startup and use real relpages for stats by veksen · Pull Request #100 · Query-Doctor/analyzer

veksen · 2026-03-27T10:36:35Z

Summary

- Replace fromAssumption(reltuples=10000, relpages=1) with a fromStatisticsExport mode that reads real relpages from pg_class. PostgreSQL's planner does not take a table's page count from pg_class.relpages; it reads the actual number of disk pages via RelationGetNumberOfBlocks(), then estimates tuples = actual_pages × reltuples ÷ relpages. With relpages=1 this inflated estimates by up to 74x (740,000 instead of 10,000).
- When no statisticsPath is provided (the CI default), run ANALYZE first so that pg_class.relpages and pg_statistic reflect the current data deterministically. ANALYZE is skipped entirely when users provide their own stats export.
- Extracted buildStatsFromDatabase, with 9 integration tests proving the planner estimates exactly 10,000 rows regardless of actual data (1, 10K, or 50K rows seeded).
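
The arithmetic behind the 74x figure can be sketched in a few lines. This is an illustration of the formula quoted in the summary, not PostgreSQL's actual code: the function name `estimated_tuples` is made up for the example, and the 74-page table size is inferred from the reported numbers (740,000 ÷ 10,000), not measured.

```python
def estimated_tuples(actual_pages: int, reltuples: float, relpages: int) -> int:
    """Scale the stored tuple density by the number of pages actually on disk.

    Mirrors the formula from the summary:
        tuples = actual_pages * reltuples / relpages
    """
    if relpages <= 0:
        raise ValueError("relpages must be positive for the density formula")
    density = reltuples / relpages          # tuples per page, from the stats
    return round(density * actual_pages)    # scaled to the real page count


# Assumption: a 10K-row table that occupies 74 pages on disk.
ACTUAL_PAGES = 74

# Old behaviour: fromAssumption(reltuples=10000, relpages=1)
inflated = estimated_tuples(ACTUAL_PAGES, reltuples=10_000, relpages=1)

# New behaviour: reltuples=10000 paired with the real relpages
corrected = estimated_tuples(ACTUAL_PAGES, reltuples=10_000, relpages=ACTUAL_PAGES)

print(inflated, corrected)
```

When relpages equals the real page count, the page terms cancel and the estimate collapses to reltuples, which is why the estimate no longer depends on how many rows are seeded.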

Test plan

9 integration tests against local PostgreSQL covering:
- Planner estimates 10,000 rows with 1 / 10K / 50K rows seeded
- Bug reproduction: fromAssumption(relpages=1) produces a 740,000-row estimate
- relpages clamped to ≥1 for empty tables
- Indexes grouped by parent table
- columns: null preserves ANALYZE's pg_statistic entries
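
Two of the behaviours listed above, clamping relpages to at least 1 and grouping indexes under their parent table, can be sketched as follows. The row shape `(relname, parent_table, relpages)`, the function name `group_stats`, and the example relation names are all hypothetical; the real buildStatsFromDatabase may be structured quite differently.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical row shape: (relname, parent table name or None for a table, relpages)
Row = Tuple[str, Optional[str], int]


def group_stats(rows: Iterable[Row]) -> Dict[str, dict]:
    """Group index rows under their parent table and clamp relpages to >= 1."""
    rows = list(rows)
    tables: Dict[str, dict] = {}

    # First pass: tables (rows without a parent).
    for relname, parent, relpages in rows:
        if parent is None:
            tables[relname] = {"relpages": max(1, relpages), "indexes": {}}

    # Second pass: attach each index to its parent table, if that table is known.
    for relname, parent, relpages in rows:
        if parent is not None and parent in tables:
            tables[parent]["indexes"][relname] = max(1, relpages)

    return tables


example = group_stats([
    ("events", None, 74),
    ("events_pkey", "events", 30),
    ("empty_table", None, 0),           # empty table: relpages is 0 in pg_class
    ("empty_table_pkey", "empty_table", 0),
])
print(example)
```

Clamping keeps the reltuples ÷ relpages density well defined for empty tables, where relpages would otherwise be zero.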

🤖 Generated with Claude Code

github-actions

Query Doctor Analysis


32 queries analyzed

2 pre-existing issues

SELECT "guests"."id", "guests"."session_id", "guests"."username", "guests"."avatar_path", "guests"."color", "guests"."side", "guests"."audio_recording_path", "guests"."audio_recording_public", "gue...
index assets(event_id, uploader_id, inserted_at desc)
cost 15,922 → 1,639 (90% reduction)
SELECT * FROM guest_ip_addresses WHERE ip_address = '127.0.0.1';
index guest_ip_addresses(ip_address)
cost 126 → 8 (94% reduction)

PostgreSQL's planner ignores pg_class.relpages for tables with data: it reads actual disk pages via RelationGetNumberOfBlocks(). The old fromAssumption(reltuples=10000, relpages=1) caused the planner to estimate tuples as actual_pages × 10000 / 1, inflating row estimates by up to 74x (e.g. 740,000 instead of 10,000 for a 10K-row table).

Fix: run ANALYZE before reading statistics to populate pg_statistic deterministically, then build a fromStatisticsExport mode that pairs reltuples=10,000 with the real relpages from pg_class. This makes the planner formula produce exactly 10,000 regardless of actual data.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Move the ANALYZE call inside the else branch so it only runs when buildStatsFromDatabase needs accurate pg_class.relpages. When users provide their own stats export, ANALYZE is skipped entirely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
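
The branching this commit describes can be sketched as below. It is a minimal illustration under stated assumptions: `load_stats`, `read_export`, and `run_sql` are hypothetical names, and the database and file access are passed in as callables so that the control flow can be exercised without a live PostgreSQL instance.

```python
from typing import Callable, Optional


def load_stats(
    statistics_path: Optional[str],
    read_export: Callable[[str], dict],           # hypothetical: loads a user-supplied export
    run_sql: Callable[[str], None],               # hypothetical: executes a SQL statement
    build_stats_from_database: Callable[[], dict],
) -> dict:
    """Run ANALYZE only when statistics are built from the live database."""
    if statistics_path is not None:
        # User-supplied export: use it as-is and never touch the database stats.
        return read_export(statistics_path)
    else:
        # No export (the CI default): refresh pg_class.relpages and pg_statistic
        # first, then read them back.
        run_sql("ANALYZE")
        return build_stats_from_database()


# Usage with stub callables that record what was executed.
executed = []
from_db = load_stats(None, lambda p: {"source": p}, executed.append, lambda: {"source": "db"})
from_file = load_stats("stats.json", lambda p: {"source": p}, executed.append, lambda: {"source": "db"})
print(from_db, from_file, executed)
```

Keeping ANALYZE inside the else branch means a user-provided export is never preceded by a database-wide statistics refresh.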

Temporary commit to verify actual relpages and estimated rows values in CI. Will be reverted after capturing the numbers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions bot reviewed Mar 27, 2026


veksen force-pushed the veksen/analyze-at-startup branch 2 times, most recently from 16df9bc to ae07ed2 Compare March 27, 2026 10:47

veksen and others added 2 commits March 27, 2026 15:03

veksen force-pushed the veksen/analyze-at-startup branch from ae07ed2 to dd00b83 Compare March 27, 2026 11:03

veksen requested review from ChrisHarris2012 and Xetera March 27, 2026 11:08

veksen force-pushed the veksen/analyze-at-startup branch 2 times, most recently from a07c52d to 73875de Compare March 27, 2026 13:40

chore: add logging to tests to capture actual pg_class values

f19f9c8


veksen force-pushed the veksen/analyze-at-startup branch from 73875de to f19f9c8 Compare March 27, 2026 13:41

veksen wants to merge 3 commits into main from veksen/analyze-at-startup
