ByteDance · Training crawler

Bytespider

Bytespider is ByteDance's web crawler, used to gather data for its AI products such as the Doubao assistant. It has a contested reputation: multiple reports through 2024–2025 described Bytespider crawling aggressively and disregarding robots.txt directives. Because its compliance has been inconsistent, treat a robots.txt block as a first step and confirm enforcement with server logs or a WAF rule if you need to guarantee it.

User-agent token: Bytespider
Operator: ByteDance
Feeds: ByteDance / Doubao AI training
robots.txt: Unverified

Historically reported to crawl aggressively and disregard robots.txt; behavior has been inconsistent, so verify with server logs.

How to control Bytespider in robots.txt

Edit the robots.txt file at the root of your domain (for example https://example.com/robots.txt), add one of the groups below, then save and re-deploy. A crawler follows only the most specific group that matches it, so a named User-agent: Bytespider group replaces your global User-agent: * rules rather than adding to them. Repeat any private Disallow paths inside the named group.
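To see why the private paths need repeating, here is a minimal Python sketch using the standard-library urllib.robotparser. The robots.txt content, the OtherBot name, and the is_allowed helper are illustrative assumptions, not rules from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the global group hides /admin/, but the named
# Bytespider group does not repeat that rule.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

User-agent: Bytespider
Disallow: /cart/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Bytespider", "OtherBot"):
        for path in ("/admin/", "/cart/"):
            url = "https://example.com" + path
            print(agent, path, is_allowed(ROBOTS_TXT, agent, url))
```

In this sketch Bytespider is permitted to fetch /admin/ because only its own group applies to it, while OtherBot falls back to the global group and is kept out.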

Allow Bytespider (recommended for AI visibility)

# Welcome Bytespider, but keep private areas blocked.
# A named user-agent group overrides "User-agent: *", so repeat
# your own private Disallow rules inside this group.
User-agent: Bytespider
Allow: /
Disallow: /admin/
Disallow: /account/
Disallow: /cart/
Disallow: /checkout/

Block Bytespider

# Block Bytespider from the entire site.
User-agent: Bytespider
Disallow: /

FAQ

Does Bytespider obey robots.txt?

Compliance is unverified. ByteDance has indicated that Bytespider should respect robots.txt, but independent reports have documented it ignoring those rules. Don't rely on robots.txt alone; check your server logs for Bytespider requests to disallowed paths.
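One way to check the logs is sketched below in Python. It assumes the common "combined" access-log format; the bytespider_hits function, the regular expression, and the sample log lines are illustrative assumptions, so adjust them to your server's format.

```python
import re
from collections import Counter

# Assumes the "combined" access-log format, where the request line and the
# user agent are quoted fields. Other log formats need a different pattern.
LOG_LINE = re.compile(
    r'"(?:[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def bytespider_hits(lines, disallowed_prefixes):
    """Count Bytespider requests per path that fall under a disallowed prefix."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or "bytespider" not in match["agent"].lower():
            continue
        path = match["path"]
        if any(path.startswith(prefix) for prefix in disallowed_prefixes):
            hits[path] += 1
    return hits


if __name__ == "__main__":
    # Made-up sample lines; in practice, iterate over your access log file.
    sample = [
        '203.0.113.5 - - [01/Jan/2025:00:00:00 +0000] "GET /admin/login HTTP/1.1" '
        '200 512 "-" "ExampleAgent (compatible; Bytespider)"',
        '203.0.113.9 - - [01/Jan/2025:00:00:01 +0000] "GET /admin/ HTTP/1.1" '
        '200 512 "-" "Mozilla/5.0"',
    ]
    for path, count in bytespider_hits(sample, ["/admin/"]).items():
        print(path, count)
```

Any hits recorded after your robots.txt block went live suggest the rule is not being honored. The check matches on the claimed user agent only, so it cannot tell a genuine Bytespider request from a client that merely uses the name.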

How do I reliably block Bytespider?

Add a robots.txt disallow as the baseline, then enforce it at the server or WAF level (by user agent or IP) if you need a hard guarantee, since its robots.txt compliance has been unreliable.
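For sites served by a Python application, server-level enforcement by user agent could look like the minimal WSGI middleware sketched below. The BlockUserAgents class, the demo_app placeholder, and the 403 response are assumptions for illustration; a WAF or web-server rule is an equivalent place to apply the same check.

```python
BLOCKED_TOKENS = ("bytespider",)


class BlockUserAgents:
    """WSGI middleware that answers 403 when the User-Agent contains a blocked token."""

    def __init__(self, app, blocked_tokens=BLOCKED_TOKENS):
        self.app = app
        self.blocked_tokens = tuple(token.lower() for token in blocked_tokens)

    def __call__(self, environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in agent for token in self.blocked_tokens):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain; charset=utf-8"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Not a blocked agent: hand the request to the wrapped application.
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Placeholder application standing in for your real site."""
    body = b"Hello\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


# Wrap the application; point your WSGI server at `application`.
application = BlockUserAgents(demo_app)
```

Matching on the user agent only stops clients that identify themselves honestly, which is why the IP-based option matters when you need a hard guarantee.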

Part of the Cite Hustle AI crawler directory. For the full framework on AI search visibility, read the GEO methodology.