Glossary
AI Crawler Access
Whether your robots.txt allows AI engines' crawler user-agents (GPTBot, ClaudeBot, PerplexityBot, Google-Extended) to fetch your content.
AI engines use distinct user-agents for their training and live-retrieval crawlers, so each one can be allowed or blocked separately. If your robots.txt disallows a crawler that honors it, that engine will not fetch your pages, and content it cannot fetch is unlikely to be cited, even if it is the best answer to a user's question.
Which AI crawler user-agents should you allow?
At minimum: GPTBot (OpenAI training), ChatGPT-User (OpenAI live retrieval), OAI-SearchBot (ChatGPT Search), ClaudeBot and anthropic-ai (Anthropic), PerplexityBot and Perplexity-User (Perplexity), Google-Extended (Google's robots.txt token for Gemini), Applebot-Extended (Apple), Amazonbot (Amazon), and Meta-ExternalAgent (Meta).
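As a rough illustration, the list above can be turned into explicit robots.txt groups. The sketch below is only one way to do it: the names AI_USER_AGENTS and build_allow_rules are hypothetical, and it assumes you want to allow each crawler across the whole site ("Allow: /") rather than on selected paths.

```python
# Hypothetical helper: builds one explicit robots.txt group per AI user-agent.
# The agent list is copied from the glossary entry above; the site-wide
# "Allow: /" rule is an assumption, so narrow it if only some paths should be open.
AI_USER_AGENTS = [
    "GPTBot",
    "ChatGPT-User",
    "OAI-SearchBot",
    "ClaudeBot",
    "anthropic-ai",
    "PerplexityBot",
    "Perplexity-User",
    "Google-Extended",
    "Applebot-Extended",
    "Amazonbot",
    "Meta-ExternalAgent",
]


def build_allow_rules(agents):
    """Return robots.txt text with one 'User-agent / Allow: /' group per agent."""
    groups = [f"User-agent: {agent}\nAllow: /" for agent in agents]
    # Groups are separated by a blank line; the file ends with a newline.
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    print(build_allow_rules(AI_USER_AGENTS))
```

A crawler that finds a group naming its own user-agent follows that group instead of the wildcard (User-agent: *) group, so explicit groups like these keep a broad wildcard Disallow from applying to the listed crawlers.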
How do you confirm crawler access?
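Inspect your /robots.txt and look for Disallow rules, both in groups that name a specific user-agent and in the wildcard (User-agent: *) group, which applies to any crawler that has no group of its own. Cite Hustle's audit feature also tests every relevant user-agent and reports which are allowed or blocked.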
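The manual inspection can also be approximated in a few lines of Python with the standard library's urllib.robotparser. This is a minimal sketch, not a description of how Cite Hustle's audit works: check_access and SAMPLE_ROBOTS_TXT are hypothetical names, and the sample rules are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def check_access(robots_txt, agents, path="/"):
    """Map each user-agent to True (allowed) or False (blocked) for one path."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in agents}


# Made-up robots.txt for illustration: GPTBot is blocked site-wide,
# every other crawler falls through to the wildcard group.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    for agent, allowed in check_access(
        SAMPLE_ROBOTS_TXT, ["GPTBot", "ClaudeBot", "PerplexityBot"]
    ).items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

To test a live site instead, RobotFileParser also offers set_url() and read() to download the file first. Treat the result as an approximation: Python's parser applies the first matching rule in a group, which can differ from how a given crawler resolves overlapping Allow and Disallow rules.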
Part of the Cite Hustle GEO glossary — definitions for generative engine optimization and AI search.