CITEHUSTLE

Glossary

AI Crawler Access

Whether your robots.txt allows AI engines' crawler user-agents (GPTBot, ClaudeBot, PerplexityBot, Google-Extended) to fetch your content.

AI engines use distinct user-agents for their training crawlers and their live-retrieval crawlers. If your robots.txt disallows an engine's retrieval crawler, that engine cannot fetch your page to cite it, even if the page is the best answer to a user's question.

Which AI crawler user-agents should you allow?

At minimum: GPTBot (OpenAI training), ChatGPT-User (OpenAI live retrieval), OAI-SearchBot (ChatGPT Search), ClaudeBot and anthropic-ai (Anthropic), PerplexityBot and Perplexity-User (Perplexity), Google-Extended (Gemini), Applebot-Extended, Amazonbot, and Meta-ExternalAgent.
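As a rough illustration, the list above can be turned into explicit robots.txt groups. The sketch below is a minimal Python example, not a Cite Hustle tool: the function name `build_allow_rules` is hypothetical, and it assumes you want each listed user-agent to have access to the whole site (`Allow: /`).

```python
# User-agent tokens copied from the list above.
AI_USER_AGENTS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot",
    "ClaudeBot", "anthropic-ai",
    "PerplexityBot", "Perplexity-User",
    "Google-Extended", "Applebot-Extended",
    "Amazonbot", "Meta-ExternalAgent",
]


def build_allow_rules(user_agents):
    """Return robots.txt text with one allow-all group per user-agent.

    Hypothetical helper for illustration; it assumes full-site access
    is what you want for every listed crawler.
    """
    groups = [f"User-agent: {ua}\nAllow: /" for ua in user_agents]
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    # Print the groups so they can be pasted into an existing robots.txt.
    print(build_allow_rules(AI_USER_AGENTS))
```

Explicit groups like these are mainly useful when a wildcard group (`User-agent: *`) is restrictive, because a crawler that finds a group naming it follows that group instead of the wildcard one.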

How do you confirm crawler access?

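Open /robots.txt on your domain and check each User-agent group for Disallow rules, including the wildcard group (User-agent: *), which applies to any crawler that has no group of its own. Cite Hustle's audit feature also tests every relevant user-agent and reports which are allowed or blocked.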
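The manual inspection described above can also be scripted. The following is a minimal sketch using Python's standard-library `urllib.robotparser`. It is not how Cite Hustle's audit works. The helper name `check_ai_access`, the sample rules, and the `https://example.com/` URL are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens from the glossary entry above.
AI_USER_AGENTS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot",
    "ClaudeBot", "anthropic-ai",
    "PerplexityBot", "Perplexity-User",
    "Google-Extended", "Applebot-Extended",
    "Amazonbot", "Meta-ExternalAgent",
]

# Sample robots.txt content (an assumption, not a real site's file):
# GPTBot is blocked, everything else falls through to the wildcard group.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def check_ai_access(robots_txt, url="https://example.com/",
                    user_agents=AI_USER_AGENTS):
    """Return {user_agent: True/False} for whether each may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {ua: parser.can_fetch(ua, url) for ua in user_agents}


if __name__ == "__main__":
    for ua, allowed in check_ai_access(SAMPLE_ROBOTS_TXT).items():
        print(f"{ua}: {'allowed' if allowed else 'blocked'}")
```

To check a live site instead of a pasted string, `RobotFileParser` also offers `set_url()` and `read()` to download the file first. Note that the standard-library parser's matching rules may differ in edge cases from what each crawler actually implements, so treat the result as a first pass.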

Part of the Cite Hustle GEO glossary — definitions for generative engine optimization and AI search.