perf(docs): pre-render API reference HTML in Python #1391

Merged
hkad98 merged 5 commits into gooddata:master from hkad98:jkd/docs-prerender-api-ref
Mar 6, 2026

hkad98 commented Mar 5, 2026

Summary

  • Move API reference HTML rendering from Hugo shortcodes into python_ref_builder.py using Jinja2 templates and a pre-compiled LinkResolver with O(1) dict lookups, eliminating the O(n²) shortcode processing that took ~8 minutes per build
  • Add 33 unit tests for the ref builder covering link resolution, template rendering, context builders, and file structure creation
  • Fix generate.sh crash on missing links.json, narrow V2 cache key to exclude test files, fix Dockerfile COPY paths and type annotations
  • Document the three documentation deployment workflows in scripts/docs/README.md
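The lookup strategy in the first bullet can be illustrated roughly as follows. This is a minimal sketch, not the code from python_ref_builder.py: the class body, the `linkify` method name, the token regex, and the example URLs are all assumptions made for illustration. The idea shown is that a single regex compiled once scans the text, and each candidate name is resolved with one dict lookup instead of trying every known name against the text.

```python
import re
from html import escape


class LinkResolver:
    """Hypothetical sketch: one regex pass plus dict lookups per token."""

    # Compiled once; matches dotted identifiers such as "pkg.module.Class".
    _TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)*")

    def __init__(self, links: dict[str, str]) -> None:
        # Maps a known API name to its page URL (assumed shape).
        self._links = links

    def _replace(self, match: re.Match) -> str:
        name = match.group(0)
        url = self._links.get(name)  # average-case O(1) lookup
        if url is None:
            return name  # unknown names are left untouched
        return f'<a href="{escape(url, quote=True)}">{name}</a>'

    def linkify(self, text: str) -> str:
        """Return plain text with every known name wrapped in a link."""
        return self._TOKEN.sub(self._replace, text)


resolver = LinkResolver({"Foo": "/api/Foo/", "pkg.Bar": "/api/pkg.Bar/"})
linked = resolver.linkify("returns Foo or pkg.Bar")
```

With this shape, the cost per page grows with the length of the text rather than with the number of known names, which is the property the summary attributes to the new resolver.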

Changes

Core optimization (python_ref_builder.py):

  • Two-pass generation: pass 1 builds a links dict, pass 2 renders HTML via Jinja2 templates
  • Markdown templates now receive pre-rendered CONTENT instead of {{< api-ref >}} shortcodes
  • jinja2 added to script-requirements.txt
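The two-pass flow above might look roughly like this. It is a simplified sketch under stated assumptions: the object names, URL scheme, page template, and all function names are hypothetical, and plain string formatting stands in for the Jinja2 templates the PR actually uses. It shows why the links dict has to be complete before rendering starts: an object may reference another object that is processed later.

```python
from html import escape

# Hypothetical input: fully qualified name -> docstring summary.
API_OBJECTS = {
    "sdk.Client": "Entry point. See sdk.Workspace.",
    "sdk.Workspace": "A workspace handle.",
}

# Hypothetical Markdown page template with a CONTENT placeholder.
PAGE_TEMPLATE = "---\ntitle: {title}\n---\n\nCONTENT\n"


def build_links(objects):
    """Pass 1: map every known name to its page URL (assumed URL scheme)."""
    return {name: "/api-reference/" + name.lower() + "/" for name in objects}


def render_html(name, doc, links):
    """Pass 2: render one object's HTML, linking names collected in pass 1."""
    words = []
    for word in doc.split():
        bare = word.rstrip(".,")  # ignore trailing punctuation when matching
        if bare in links and bare != name:
            tail = word[len(bare):]
            words.append(f'<a href="{links[bare]}">{escape(bare)}</a>{escape(tail)}')
        else:
            words.append(escape(word))
    return f"<h1>{escape(name)}</h1><p>{' '.join(words)}</p>"


def generate(objects):
    """Run both passes and substitute the rendered HTML for CONTENT."""
    links = build_links(objects)
    pages = {}
    for name, doc in objects.items():
        html = render_html(name, doc, links)
        pages[name] = PAGE_TEMPLATE.format(title=name).replace("CONTENT", html)
    return pages


pages = generate(API_OBJECTS)
```

Here `sdk.Client` links to `sdk.Workspace` even though the latter is rendered afterwards, because its URL was already recorded in pass 1.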

Jinja2 templates (scripts/docs/templates/):

  • object_partial.html.j2, class.html.j2, function.html.j2, module.html.j2 replicate exact Hugo shortcode output

CI/Dockerfile:

  • V2 workflow: use master's scripts for all versions, hash-based cache key scoped to *.py + templates/**
  • Dockerfile: fix COPY paths for SDK packages, persist Go modules in image layer, remove dead links.json copy
  • generate.sh: guard links.json sed with if [ -f ... ]
  • Add docs-scripts-tests job to pre-merge pipeline, scripts/docs/** in path filter
  • Add make test-docs-scripts target
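The cache-key scoping can be illustrated in Python. The workflow's actual cache-key expression is not shown in this PR description, so this is only a sketch of the idea: the function name `docs_cache_key` is hypothetical, and it assumes test files live in a subdirectory of scripts/docs (so the top-level `*.py` glob does not match them). Only top-level scripts and template files contribute to the hash.

```python
import hashlib
from pathlib import Path


def docs_cache_key(docs_dir: Path) -> str:
    """Hash only top-level *.py files and everything under templates/."""
    files = sorted(docs_dir.glob("*.py")) + sorted(
        p for p in (docs_dir / "templates").rglob("*") if p.is_file()
    )
    digest = hashlib.sha256()
    for path in files:
        # Include the relative path so renames also change the key.
        digest.update(path.relative_to(docs_dir).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()
```

Under these assumptions, editing a file in a tests subdirectory leaves the key unchanged, while editing a builder script or a template produces a new key and invalidates the cached output.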

Result: Hugo build drops from ~405s to ~100s (API ref template time eliminated entirely)

hkad98 requested review from jaceksan, lupko and pcerny as code owners March 5, 2026 14:02
hkad98 added 3 commits March 5, 2026 15:05
…hortcodes

Hugo was spending ~8 minutes processing ~4,140 API ref shortcodes per version
(O(n²) regex in api-ref-link-all-partial). This moves HTML rendering into
python_ref_builder.py using Jinja2 templates and a pre-compiled LinkResolver
with O(1) dict lookups, eliminating all shortcode processing time.

Key changes:
- Two-pass generation: pass 1 builds links dict, pass 2 renders HTML
- Jinja2 templates replicate the exact output of Hugo shortcodes/partials
- Markdown templates now receive pre-rendered CONTENT instead of shortcodes
- jinja2 added to script-requirements.txt
- CI workflow: use master's scripts for all versions, hash-based cache key
- Dockerfile: fix COPY paths, persist Go modules in image layer, remove
  unnecessary data.json/links.json copies to versioned_docs
- Restore api-reference/_index.md and pandas/_index.md for Hugo navigation

Result: ~3x faster Hugo build (57s → 19s per version), API ref template
time eliminated entirely.

jira: trivial
risk: high
Add 33 tests covering LinkResolver, template rendering, context builders, and file structure creation. Fix pre-existing type annotation issues (RefHolder.packages, parse_toml return type) and remove dead links.json generation that is no longer consumed after the pre-render change. Add Makefile test-docs-scripts target and CI docs-scripts-tests job with scripts/docs/** path filter.

JIRA: trivial
risk: nonprod
Fix two bugs found during review:
(1) generate.sh crashes on missing links.json after the pre-render change removed its generation. Add an if-guard around the sed/mv.
(2) The V2 cache key included test files via the scripts/docs/** glob, causing unnecessary cache busts. Narrow it to scripts/docs/*.py + templates/**.
Also add scripts/docs/README.md documenting how the three documentation workflows operate.

JIRA: trivial
risk: high
hkad98 force-pushed the jkd/docs-prerender-api-ref branch from 78c5811 to 6784c8a on March 5, 2026 14:06
uv sync --locked does not install pytest since it lives in the test dependency group. Changed to --group test to make pytest available.

JIRA: trivial
risk: nonprod
codecov bot commented Mar 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.53%. Comparing base (9659a3e) to head (dd6fc46).
⚠️ Report is 4 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1391      +/-   ##
==========================================
- Coverage   77.53%   77.53%   -0.01%     
==========================================
  Files         225      225              
  Lines       14615    14614       -1     
==========================================
- Hits        11332    11331       -1     
  Misses       3283     3283              

python_ref_builder imports toml which is not a transitive dependency of any workspace package. Adding it to the test group so the docs-scripts-tests CI job can find it.

JIRA: trivial
risk: nonprod
hkad98 force-pushed the jkd/docs-prerender-api-ref branch from 099652f to dd6fc46 on March 6, 2026 11:50
hkad98 requested a review from Mara3l March 6, 2026 13:41
hkad98 merged commit 57d47dd into gooddata:master on Mar 6, 2026
13 checks passed