robots.txt present at root

A non-empty /robots.txt is reachable at the site root, giving the merchant a control surface for every crawler. It is the one file where a merchant can declare both per-crawler rules and a Sitemap location for AI agents.

What this check looks for

Per RFC 9309 §2.3, compliant crawlers look for rules at `/robots.txt` at the site root. When the file is unavailable (a 4xx response such as 404), well-behaved crawlers default to permissive behavior; server errors (5xx) are a different case, where the RFC tells crawlers to assume complete disallow. So a missing robots.txt does not block agents, but it does mean the merchant has no per-UA control surface: no way to allow OAI-SearchBot explicitly, declare a Sitemap, or block a misbehaving crawler. v2 reflects this by dropping severity from CRITICAL to HIGH: a missing robots.txt costs control, not visibility.
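The outcomes described above can be sketched as a small classifier. This is an illustrative sketch, not the audit's actual implementation: the names `RobotsStatus`, `classify_robots_response`, and `check_robots` are hypothetical, and the mapping of status codes follows the access results in RFC 9309 (4xx treated as unavailable, 5xx and network failures as unreachable).

```python
import urllib.error
import urllib.request
from enum import Enum


class RobotsStatus(Enum):
    PRESENT = "present"          # 2xx with a non-empty body: the check passes
    EMPTY = "empty"              # 2xx but blank body: no rules are declared
    UNAVAILABLE = "unavailable"  # 4xx: crawlers may treat the site as unrestricted
    UNREACHABLE = "unreachable"  # 5xx or network failure: crawlers assume disallow


def classify_robots_response(status_code: int, body: str) -> RobotsStatus:
    """Map an HTTP result for /robots.txt to a status (hypothetical helper)."""
    if 200 <= status_code < 300:
        return RobotsStatus.PRESENT if body.strip() else RobotsStatus.EMPTY
    # Redirects are assumed to have been followed already; an unresolved 3xx
    # is grouped with 4xx here as "unavailable".
    if 300 <= status_code < 500:
        return RobotsStatus.UNAVAILABLE
    return RobotsStatus.UNREACHABLE


def check_robots(origin: str, timeout: float = 10.0) -> RobotsStatus:
    """Fetch <origin>/robots.txt and classify the result."""
    url = origin.rstrip("/") + "/robots.txt"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return classify_robots_response(resp.status, body)
    except urllib.error.HTTPError as err:  # 4xx and 5xx responses land here
        return classify_robots_response(err.code, "")
    except OSError:  # DNS failure, refused connection, timeout
        return RobotsStatus.UNREACHABLE
```

Calling `check_robots("https://yourdomain.com")` would return `RobotsStatus.PRESENT` only when the file exists and has content, which is the condition this check describes.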

Which AI surfaces it affects

Google AI Mode (UCP): 70
ChatGPT (ACP): 60
Perplexity: 60
Microsoft Copilot: 60
Meta AI: 50

Weighted against the live specs — ACP 2026-04-17, UCP 2026-04-08.

How to fix it

Publish a non-empty robots.txt at the site root

Shopify

A few minutes

Shopify serves a default robots.txt automatically. If yours is unreachable, confirm your store is not in password-protected (preview) mode — preview stores intentionally hide robots.txt.
To customize, create `templates/robots.txt.liquid` via Online Store → Themes → Edit code → Add a new template → robots.txt.
Start from Shopify's default template, then add a `Sitemap:` line and any per-UA rules you need.
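Before publishing per-UA rules, it can help to evaluate them locally. The following is a minimal sketch using Python's standard `urllib.robotparser`. The rules are illustrative rather than Shopify's defaults, `BadBot` is a hypothetical crawler name, and Python's parser may resolve overlapping rules differently from a given crawler, so treat the result as a sanity check only.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules: one explicit allow, one block, and a wildcard fallback.
ROBOTS_TXT = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: BadBot
Disallow: /

User-agent: *
Allow: /

Sitemap: https://yourdomain.com/sitemap.xml
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
product_url = "https://yourdomain.com/products/widget"

for agent in ("OAI-SearchBot", "BadBot", "SomeOtherBot"):
    verdict = "allowed" if parser.can_fetch(agent, product_url) else "blocked"
    print(f"{agent}: {verdict}")
```

Crawlers without a group of their own fall through to the `User-agent: *` group, which is why the baseline group is worth keeping even when specific agents are named.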


BigCommerce

A few minutes

BigCommerce admin → Storefront → SEO → robots.txt.
Confirm the file is enabled and contains at least a `User-agent: *` block with `Allow: /`.
Add a `Sitemap: https://yourdomain.com/xmlsitemap.php` line.


WooCommerce

A few minutes

Install Yoast SEO or All in One SEO; both publish a robots.txt automatically.
Yoast: SEO → Tools → File editor → robots.txt.
All in One SEO: All in One SEO → Tools → Robots.txt Editor.
Confirm the published file includes a `User-agent: *` group and a `Sitemap:` line.


Custom / headless

A few minutes

Serve a static file at `/robots.txt` with `Content-Type: text/plain` returning HTTP 200.
Include a baseline `User-agent: *` group with `Allow: /` and a `Sitemap:` line.

User-agent: *
Allow: /

Sitemap: https://yourdomain.com/sitemap.xml
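A quick way to confirm the baseline is in place is to scan the file for the two things this fix asks for. The sketch below is an assumption-laden illustration, not the audit's own logic: `audit_robots_txt` is a hypothetical helper, and it only checks for a `User-agent: *` line and at least one `Sitemap:` line.

```python
def audit_robots_txt(text: str) -> dict:
    """Report whether a wildcard group and any Sitemap lines are present."""
    has_wildcard_group = False
    sitemaps = []
    for raw_line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        # Split on the first colon only, so URLs keep their "https:" prefix.
        field, value = line.split(":", 1)
        field = field.strip().lower()
        value = value.strip()
        if field == "user-agent" and value == "*":
            has_wildcard_group = True
        elif field == "sitemap" and value:
            sitemaps.append(value)
    return {"wildcard_group": has_wildcard_group, "sitemaps": sitemaps}


BASELINE = """\
User-agent: *
Allow: /

Sitemap: https://yourdomain.com/sitemap.xml
"""

report = audit_robots_txt(BASELINE)
print(report)
```

Field names are compared case-insensitively here, so `sitemap:` and `Sitemap:` are treated the same.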

The spec it's pinned to

RFC 9309 §2.3: Access method (robots.txt)
RFC 9309 (Robots Exclusion Protocol) §2.3 defines how crawlers fetch robots.txt from the site root. If robots.txt is unavailable (a 4xx response), crawlers may default to permissive behavior, so the file is a control surface for the merchant, not a hard gate.

RFC 9309: Robots Exclusion Protocol

Does your store pass this check?

Run the full audit — 82 checks across five AI shopping surfaces. Most tools only check whether you get mentioned; we check whether an agent can buy from you.
