ByteDance · Training crawler
Bytespider
Bytespider is ByteDance's web crawler, used to gather data for its AI products such as the Doubao assistant. Its reputation is contested: multiple reports through 2024–2025 described Bytespider crawling aggressively and disregarding robots.txt directives. Because its compliance has been inconsistent, treat a robots.txt block as a first step, check your server logs to confirm it is being honored, and add a server or WAF rule if you need guaranteed enforcement.
- User-agent token: Bytespider
- Operator: ByteDance
- Feeds: ByteDance / Doubao AI training
- robots.txt: Unverified
Historically reported to crawl aggressively and disregard robots.txt; behavior has been inconsistent, so verify with server logs.
How to control Bytespider in robots.txt
Edit the robots.txt file at the root of your domain (for example https://example.com/robots.txt), add one of the groups below, then save and re-deploy.
Remember: a crawler follows only the most specific group that matches it, so a named User-agent: Bytespider group replaces your global User-agent: * rules rather than adding to them. Repeat any private Disallow paths inside it.
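The override behavior described above can be checked locally before you deploy. The following is a minimal sketch using Python's standard-library `urllib.robotparser`; the robots.txt text, the `is_allowed` helper, and the `SomeOtherBot` agent name are illustrative assumptions, not part of any ByteDance tooling.

```python
from urllib import robotparser

# Hypothetical robots.txt: the "*" group protects /admin/, but the
# Bytespider group does not repeat that rule.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

User-agent: Bytespider
Disallow: /cart/
"""


def is_allowed(agent: str, path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # example.com is a placeholder host; only the path matters here.
    return parser.can_fetch(agent, "https://example.com" + path)


if __name__ == "__main__":
    for agent in ("Bytespider", "SomeOtherBot"):
        for path in ("/admin/", "/cart/"):
            print(agent, path, is_allowed(agent, path))
```

In this sketch Bytespider is permitted to fetch /admin/ because its named group never mentions that path, while any other bot falls back to the `*` group and is refused. Python's parser may resolve overlapping Allow and Disallow rules differently from a given crawler, so treat this as a check on group selection only.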
Allow Bytespider (recommended for AI visibility)
# Welcome Bytespider, but keep private areas blocked.
# A named user-agent group overrides "User-agent: *", so repeat
# your own private Disallow rules inside this group.
User-agent: Bytespider
Allow: /
Disallow: /admin/
Disallow: /account/
Disallow: /cart/
Disallow: /checkout/
Block Bytespider
# Block Bytespider from the entire site.
User-agent: Bytespider
Disallow: /
FAQ
Does Bytespider obey robots.txt?
Compliance is unverified. ByteDance has indicated that Bytespider should respect robots.txt, but independent reports have documented it ignoring those rules, so don't rely on robots.txt alone; verify in your logs.
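One way to verify in your logs is to count requests whose user agent contains the Bytespider token and whose path falls under a prefix you disallowed. The sketch below assumes your server writes the common "combined" access-log format; the `disallowed_hits` helper, the sample lines, the IP addresses, and the user-agent strings are all made up for illustration.

```python
import re
from collections import Counter

# Pulls the request path and user agent out of a combined-format log line.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def disallowed_hits(lines, disallowed_prefixes, token="bytespider"):
    """Count requests from `token` to paths under any disallowed prefix."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or token not in match["agent"].lower():
            continue
        path = match["path"]
        if any(path.startswith(prefix) for prefix in disallowed_prefixes):
            hits[path] += 1
    return hits


# Illustrative log lines only; these are not real traffic.
SAMPLE = [
    '203.0.113.7 - - [01/Jan/2025:00:00:01 +0000] "GET /admin/login HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (compatible; Bytespider; example)"',
    '203.0.113.7 - - [01/Jan/2025:00:00:02 +0000] "GET /blog/post HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0 (compatible; Bytespider; example)"',
    '198.51.100.9 - - [01/Jan/2025:00:00:03 +0000] "GET /admin/login HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

if __name__ == "__main__":
    for path, count in disallowed_hits(SAMPLE, ["/admin/"]).items():
        print(path, count)
```

To run it against real data, pass an open log file in place of `SAMPLE`. Any hits dated after your robots.txt change went live suggest the rules are not being honored, though a user-agent string can be spoofed, so a match does not prove the request came from ByteDance.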
How do I reliably block Bytespider?
Add a robots.txt disallow as the baseline, then enforce it at the server or WAF level (by user agent or IP) if you need a hard guarantee, since its robots.txt compliance has been unreliable.
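If your site runs behind a Python WSGI server, server-level enforcement by user agent could look roughly like the sketch below. The `block_user_agents` wrapper, `demo_app`, and `status_for` are hypothetical names introduced for illustration; a WAF or reverse-proxy rule would achieve the same effect earlier in the request path, and IP-based blocking is not covered here.

```python
def block_user_agents(app, blocked_tokens=("bytespider",)):
    """Wrap a WSGI app; answer 403 when the User-Agent contains a blocked token."""
    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in agent for token in blocked_tokens):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain; charset=utf-8"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Not a blocked agent: hand the request to the wrapped app unchanged.
        return app(environ, start_response)
    return middleware


def demo_app(environ, start_response):
    """Hypothetical stand-in for the real site."""
    body = b"Hello\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


application = block_user_agents(demo_app)


def status_for(user_agent):
    """Call the wrapped app directly and return the status line it sends."""
    captured = []
    application({"HTTP_USER_AGENT": user_agent},
                lambda status, headers: captured.append(status))
    return captured[0]


if __name__ == "__main__":
    print(status_for("Mozilla/5.0 (compatible; Bytespider; example)"))
    print(status_for("Mozilla/5.0 (X11; Linux x86_64)"))
```

Matching on the user-agent string only stops requests that identify themselves as Bytespider, so pair it with log checks if you need stronger assurance.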
Part of the Cite Hustle AI crawler directory. For the full framework on AI search visibility, read the GEO methodology.