feat: add robots.txt allowing search engines and AI crawlers by TaprootFreak · Pull Request #134 · DFXswiss/docs

TaprootFreak · 2026-06-04T17:23:18Z

Add repo-controlled robots.txt that allows search engines and AI crawlers

This adds a version-controlled robots.txt for the public DFX documentation
(docs.dfx.swiss) that explicitly allows both search engines and AI agents
to crawl, index, and learn from the content.

What

New file: src/.vuepress/public/robots.txt
VuePress copies everything in .vuepress/public/ verbatim to the published
site root, so the file is served at https://docs.dfx.swiss/robots.txt.

The file grants all three content signals (search=yes, ai-input=yes,
ai-train=yes) and lists the major AI crawlers individually (ClaudeBot, GPTBot,
Google-Extended, CCBot, Bytespider, Amazonbot, Applebot-Extended,
meta-externalagent) in addition to the wildcard group, because some bots honor
only their own named record.
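
For reviewers who want to see the shape of such a file, here is a minimal sketch that builds a wildcard group plus one named group per crawler and checks it with Python's standard urllib.robotparser. The exact file contents in this PR are not reproduced here; the Content-Signal line syntax, the group layout, and the helper names build_robots_txt and is_allowed are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Crawler names as listed in the PR description.
AI_CRAWLERS = [
    "ClaudeBot", "GPTBot", "Google-Extended", "CCBot",
    "Bytespider", "Amazonbot", "Applebot-Extended", "meta-externalagent",
]


def build_robots_txt() -> str:
    """Assemble a hypothetical robots.txt: one wildcard group, then named groups."""
    groups = [
        # Assumed syntax for the content signals; unknown lines are ignored
        # by urllib.robotparser, so this does not affect the check below.
        "User-agent: *\n"
        "Content-Signal: search=yes, ai-input=yes, ai-train=yes\n"
        "Allow: /"
    ]
    groups += [f"User-agent: {name}\nAllow: /" for name in AI_CRAWLERS]
    return "\n\n".join(groups) + "\n"


def is_allowed(user_agent: str, url: str = "https://docs.dfx.swiss/") -> bool:
    """Return True if the sketched policy lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(build_robots_txt().splitlines())
    return parser.can_fetch(user_agent, url)
```

Calling is_allowed("GPTBot") matches the named record, while an unlisted agent falls through to the wildcard group.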

Why

The documentation is public and we want it discoverable by search and usable as
input for AI agents / RAG and training. Keeping the policy in the repo makes it
authoritative and reviewable.

No Sitemap line

The site does not serve or generate a sitemap: GET /sitemap.xml returns a
soft-404 (text/html, the SPA 404.html), and the VuePress build output
contains no sitemap.xml. A Sitemap: directive was therefore intentionally
omitted.
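
The soft-404 check described above can be sketched as a small classifier over the response status, Content-Type header, and body. This is an illustrative helper (looks_like_sitemap is a hypothetical name, and the accepted media types are an assumption), not the check that was actually run for this PR.

```python
def looks_like_sitemap(status: int, content_type: str, body: bytes) -> bool:
    """Return True only if the response plausibly is an XML sitemap.

    A soft-404 (an HTML error page served for /sitemap.xml) fails the
    media-type test even when the status code is 200.
    """
    if status != 200:
        return False
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type not in {"application/xml", "text/xml"}:
        return False
    # A sitemap's root element is either <urlset> or <sitemapindex>.
    head = body[:2048].lower()
    return b"<urlset" in head or b"<sitemapindex" in head
```

The three inputs can come from any HTTP client; a text/html response such as the SPA 404.html is rejected regardless of status.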

Verification

Built locally with npm run build; confirmed robots.txt lands at the
published root (dist/robots.txt) with identical content.
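
A byte-for-byte comparison like the one described can be scripted after npm run build. The sketch below uses the standard filecmp module; the helper name same_file_content is hypothetical, and the paths passed to it are whatever the local build layout uses.

```python
import filecmp
from pathlib import Path


def same_file_content(source: Path, built: Path) -> bool:
    """Return True if both files exist and their bytes are identical."""
    if not (source.is_file() and built.is_file()):
        return False
    # shallow=False forces a content comparison instead of trusting os.stat().
    return filecmp.cmp(source, built, shallow=False)
```

For example, same_file_content(Path("src/.vuepress/public/robots.txt"), Path("dist/robots.txt")) should hold after a successful build.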

Note on activation

This file becomes the effective crawl policy only once the site's hosting-level
"Manage robots.txt / Block AI bots" toggle is disabled at the CDN, so that the
repo-served file is what visitors and crawlers receive.
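
Once the toggle is off, one way to confirm activation is to compare the text served at /robots.txt with the file in the repo. The sketch below is a hypothetical helper (compare_served and its three result labels are assumptions); it only classifies two strings and leaves fetching the live file to the caller.

```python
def _normalize(text: str) -> str:
    """Ignore line-ending differences and surrounding whitespace."""
    return text.replace("\r\n", "\n").strip()


def compare_served(repo_text: str, served_text: str) -> str:
    """Classify how the served robots.txt relates to the repo file.

    "identical": the repo file is served as-is.
    "wrapped":   the repo file is present but extra rules were added around it.
    "different": the served policy does not contain the repo file.
    """
    repo = _normalize(repo_text)
    served = _normalize(served_text)
    if served == repo:
        return "identical"
    if repo and repo in served:
        return "wrapped"
    return "different"
```

Only an "identical" result means the repo file alone is the effective crawl policy.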

Add a version-controlled robots.txt served from the VuePress public dir (copied to the published site root) that explicitly allows search engines and AI agents to crawl, index, and learn from the public documentation. It grants all content signals (search, ai-input, ai-train) and lists the major AI crawlers individually in addition to the wildcard group, since some honor only their own named record.

TaprootFreak wants to merge 1 commit into develop from feat/robots-allow-ai-crawlers
