Question 1

Why does AI crawler indexability matter?

Accepted Answer

If your robots.txt blocks GPTBot, PerplexityBot, or ClaudeBot, compliant crawlers from those engines will not fetch your pages, so your content is unlikely to be cited in their AI answers. Google-Extended controls whether Google may use your content to train Gemini models and to ground Gemini responses; it does not affect Google Search, where AI Overviews follow the normal Googlebot rules. Blocking these bots is a common and easily overlooked cause of zero AI visibility.
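
As a rough illustration of the robots.txt side of this, the sketch below uses Python's standard `urllib.robotparser` to ask whether each crawler may fetch a page. The robots.txt content, the `example.com` URL, and the helper name `robots_verdicts` are made up for the example; real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks two AI crawlers, allows everyone else.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

AI_CRAWLERS = ["GPTBot", "PerplexityBot", "ClaudeBot", "Google-Extended"]


def robots_verdicts(robots_txt: str, url: str, agents: list[str]) -> dict[str, bool]:
    """Return {agent: True if robots.txt allows that agent to fetch url}."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


verdicts = robots_verdicts(SAMPLE_ROBOTS_TXT, "https://example.com/pricing", AI_CRAWLERS)
for agent, allowed in verdicts.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked by robots.txt'}")
```

With this sample file, GPTBot and PerplexityBot are reported as blocked, while ClaudeBot and Google-Extended fall through to the wildcard group and are allowed.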

Question 2

What's the difference between robots.txt and a live block?

Accepted Answer

robots.txt is a polite request: well-behaved bots respect it, but nothing enforces it. A live block (a 401, 403, or 429 response, or a Cloudflare / WAF challenge) rejects the bot at the server or CDN edge regardless of what robots.txt says. This tool checks both: the robots.txt rule for each user agent and the live HTTP response returned when we identify as that bot.
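
A minimal sketch of how the two checks could be combined, assuming the live check is a plain HTTP request sent with the bot's name as the `User-Agent` header. This is not the tool's actual implementation: the function names and the verdict labels are hypothetical, and a challenge page served with a 200 status would not be detected by a status-code check like this one.

```python
import urllib.error
import urllib.request

# Status codes the answer above treats as a live block.
BLOCKING_STATUSES = {401, 403, 429}


def fetch_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Request url while identifying as user_agent and return the HTTP status."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the status code is still the signal we want.
        return error.code


def combined_verdict(robots_allowed: bool, live_status: int) -> str:
    """Combine the robots.txt rule and the live response into one label."""
    if live_status in BLOCKING_STATUSES:
        return "live block"        # rejected regardless of robots.txt
    if not robots_allowed:
        return "robots.txt block"  # reachable, but politely asked to stay out
    if 200 <= live_status < 300:
        return "reachable"
    return "inconclusive"          # redirects, 5xx, and anything else
```

For example, `combined_verdict(True, fetch_status("https://example.com/", "GPTBot"))` would report a live block if the server answers that user agent with a 403, even though robots.txt allows it.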

Question 3

Should I allow all AI crawlers?

Accepted Answer

For most brands that want AI visibility, yes. The exception is training-only crawlers (GPTBot, ClaudeBot, Google-Extended, CCBot): block them if you have a strategic reason to withhold your content from LLM training while still being cited through live fetches. Live-fetch bots (ChatGPT-User, Perplexity-User, Claude-Web, OAI-SearchBot) are the ones that produce citations, so those should almost always be allowed.
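
One way to express that split is a robots.txt with one group per agent. The sketch below generates such a file and checks it with `urllib.robotparser`. The two agent lists are copied from the answer above and may not match each vendor's current documentation; `build_robots_txt` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# Groupings taken from the answer above; vendors may rename or add agents.
TRAINING_CRAWLERS = ["GPTBot", "ClaudeBot", "Google-Extended", "CCBot"]
LIVE_FETCH_BOTS = ["ChatGPT-User", "Perplexity-User", "Claude-Web", "OAI-SearchBot"]


def build_robots_txt(block: list[str], allow: list[str]) -> str:
    """Build a robots.txt that disallows `block`, allows `allow` and everyone else."""
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in block]
    groups += [f"User-agent: {agent}\nAllow: /" for agent in allow]
    groups.append("User-agent: *\nAllow: /")
    return "\n\n".join(groups) + "\n"


robots_txt = build_robots_txt(TRAINING_CRAWLERS, LIVE_FETCH_BOTS)

# Sanity-check the generated file with the standard-library parser.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
for agent in TRAINING_CRAWLERS + LIVE_FETCH_BOTS:
    state = "allowed" if parser.can_fetch(agent, "https://example.com/") else "blocked"
    print(f"{agent}: {state}")
```

The explicit `Allow` groups for the live-fetch bots are redundant with the wildcard group, but they make the intent visible to whoever edits the file next.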

Question 4

The tool shows 'blocked' but I never added a rule. Why?

Accepted Answer

Common causes: (1) a Cloudflare 'Block AI Scrapers' rule set at the edge, (2) a WAF challenge that returns 403 to unknown user agents, (3) a Vercel / Netlify header rule, or (4) a security plugin on WordPress. Check your CDN or firewall before editing robots.txt.
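
To narrow down which of these applies, it can help to compare the response a browser-like user agent gets with the response the bot's user agent gets. The heuristic below is only a sketch under that assumption: `likely_cause` and its messages are hypothetical, and the `Server: cloudflare` check is a hint, not proof of where the block is configured.

```python
BLOCKING_STATUSES = {401, 403, 429}


def likely_cause(browser_status: int, bot_status: int, server_header: str = "") -> str:
    """Guess where a block comes from by comparing two responses for the same URL.

    browser_status: status returned to a normal browser User-Agent.
    bot_status:     status returned when identifying as the AI crawler.
    server_header:  value of the Server response header, if any.
    """
    if bot_status not in BLOCKING_STATUSES:
        return "no live block: check the robots.txt rule instead"
    if browser_status in BLOCKING_STATUSES:
        # Everyone is rejected, so the rule is not specific to AI crawlers.
        return "blanket block: check authentication, rate limits, or IP rules"
    if "cloudflare" in server_header.lower():
        return "user-agent block, served via Cloudflare: check AI bot and WAF settings first"
    return "user-agent block: check CDN header rules, WAF, or security plugins"


print(likely_cause(browser_status=200, bot_status=403, server_header="cloudflare"))
```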

Is Your Site Blocking ChatGPT, Perplexity & Claude?
