perf(docs): pre-render API reference HTML in Python #1391
Merged
hkad98 merged 5 commits into gooddata:master on Mar 6, 2026
…hortcodes

Hugo was spending ~8 minutes processing ~4,140 API ref shortcodes per version (O(n²) regex in api-ref-link-all-partial). This moves HTML rendering into python_ref_builder.py using Jinja2 templates and a pre-compiled LinkResolver with O(1) dict lookups, eliminating all shortcode processing time.

Key changes:
- Two-pass generation: pass 1 builds links dict, pass 2 renders HTML
- Jinja2 templates replicate the exact output of Hugo shortcodes/partials
- Markdown templates now receive pre-rendered CONTENT instead of shortcodes
- jinja2 added to script-requirements.txt
- CI workflow: use master's scripts for all versions, hash-based cache key
- Dockerfile: fix COPY paths, persist Go modules in image layer, remove unnecessary data.json/links.json copies to versioned_docs
- Restore api-reference/_index.md and pandas/_index.md for Hugo navigation

Result: ~3x faster Hugo build (57s → 19s per version), API ref template time eliminated entirely.

jira: trivial
risk: high
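The two-pass idea described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in python_ref_builder.py: the class body, `build_links`, `render_all`, the URL scheme, and the sample objects are all hypothetical, and plain string formatting stands in for the Jinja2 templates so the sketch needs only the standard library. The point it shows is that each identifier is found once by a pre-compiled regex and resolved with a single dict lookup, instead of scanning the text once per known name.

```python
import html
import re

# Pre-compiled once; matches Python-style identifiers in a signature string.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


class LinkResolver:
    """Hypothetical resolver: name -> URL via one dict lookup per identifier."""

    def __init__(self, links: dict[str, str]) -> None:
        self._links = links

    def linkify(self, text: str) -> str:
        """Escape text and wrap every known identifier in an anchor tag."""
        parts = []
        pos = 0
        for match in IDENTIFIER.finditer(text):
            # Escape the non-identifier text between the previous match and this one.
            parts.append(html.escape(text[pos:match.start()]))
            name = match.group(0)
            url = self._links.get(name)  # O(1) average-case lookup
            if url is None:
                parts.append(name)
            else:
                parts.append(f'<a href="{html.escape(url)}">{name}</a>')
            pos = match.end()
        parts.append(html.escape(text[pos:]))
        return "".join(parts)


def build_links(objects: dict[str, dict], base_url: str) -> dict[str, str]:
    """Pass 1: map every documented name to its page URL (assumed URL scheme)."""
    return {name: f"{base_url}/{name.lower()}/" for name in objects}


def render_all(objects: dict[str, dict], resolver: LinkResolver) -> dict[str, str]:
    """Pass 2: render each object's signature to HTML using the finished dict."""
    return {
        name: f"<code>{resolver.linkify(obj['signature'])}</code>"
        for name, obj in objects.items()
    }


# Hypothetical input, for illustration only.
objects = {
    "CatalogWorkspace": {"signature": "CatalogWorkspace(id: str)"},
    "load": {"signature": "load(ws: CatalogWorkspace) -> List[str]"},
}
links = build_links(objects, "/api")
pages = render_all(objects, LinkResolver(links))
```

Two passes are needed because a signature may reference an object that is defined later in the input; the links dict has to be complete before any page is rendered.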
Add 33 tests covering LinkResolver, template rendering, context builders, and file structure creation.

Fix pre-existing type annotation issues (RefHolder.packages, parse_toml return type) and remove dead links.json generation that is no longer consumed after the pre-render change.

Add Makefile test-docs-scripts target and CI docs-scripts-tests job with scripts/docs/** path filter.

JIRA: trivial
risk: nonprod
Fix two bugs found during review:
(1) generate.sh crashes on missing links.json after the pre-render change removed its generation; add an if-guard around the sed/mv.
(2) The V2 cache key included test files via the scripts/docs/** glob, causing unnecessary cache busts; narrow it to scripts/docs/*.py + templates/**.

Also add scripts/docs/README.md documenting how the three documentation workflows operate.

JIRA: trivial
risk: high
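To illustrate why narrowing the glob avoids unnecessary cache busts, here is a small Python sketch of a content hash over only the builder scripts and templates. It is an assumption-laden illustration, not the workflow's actual mechanism: `docs_cache_key` is a hypothetical helper, and it assumes the templates directory sits under scripts/docs/ and that tests live in a subdirectory the patterns do not reach.

```python
import hashlib
from pathlib import Path


def docs_cache_key(root: Path) -> str:
    """Hash top-level builder scripts and templates under scripts/docs/.

    Files outside these two patterns (for example a tests/ subdirectory)
    do not contribute, so editing them leaves the key unchanged.
    """
    docs = root / "scripts" / "docs"
    candidates = [*docs.glob("*.py"), *docs.glob("templates/**/*")]
    files = sorted(p for p in candidates if p.is_file())

    digest = hashlib.sha256()
    for path in files:
        # Include the relative path so renames change the key as well.
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()
```

With a key built this way, a change to a template or to a builder script invalidates the cache, while a change that touches only test files reuses it.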
78c5811 to 6784c8a
uv sync --locked does not install pytest since it lives in the test dependency group. Changed to --group test to make pytest available.

JIRA: trivial
risk: nonprod
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##           master    #1391      +/-   ##
==========================================
- Coverage   77.53%   77.53%   -0.01%
==========================================
  Files         225      225
  Lines       14615    14614       -1
==========================================
- Hits        11332    11331       -1
  Misses       3283     3283
Mara3l reviewed on Mar 6, 2026
python_ref_builder imports toml which is not a transitive dependency of any workspace package. Adding it to the test group so the docs-scripts-tests CI job can find it.

JIRA: trivial
risk: nonprod
099652f to dd6fc46
Mara3l approved these changes on Mar 6, 2026
Summary
- Move HTML rendering into python_ref_builder.py using Jinja2 templates and a pre-compiled LinkResolver with O(1) dict lookups, eliminating the O(n²) shortcode processing that took ~8 minutes per build
- Fix generate.sh crash on missing links.json, narrow V2 cache key to exclude test files, fix Dockerfile COPY paths and type annotations
- Add scripts/docs/README.md

Changes

Core optimization (python_ref_builder.py):
- Markdown templates receive pre-rendered CONTENT instead of {{< api-ref >}} shortcodes
- jinja2 added to script-requirements.txt

Jinja2 templates (scripts/docs/templates/):
- object_partial.html.j2, class.html.j2, function.html.j2, module.html.j2 replicate exact Hugo shortcode output

CI/Dockerfile:
- Cache key narrowed to scripts/docs/*.py + templates/**
- Remove unnecessary links.json copy
- generate.sh: guard links.json sed with if [ -f ... ]
- Add docs-scripts-tests job to pre-merge pipeline, with scripts/docs/** in path filter
- Add make test-docs-scripts target

Result: Hugo build drops from ~405s to ~100s (API ref template time eliminated entirely)