Robots.txt Generator & Validator
Generate and validate robots.txt files. Block AI crawlers, control search engine access, and protect your content — all for free.
Block AI Training Crawlers
Protect your content from being used to train AI models. Block these crawlers to opt out of AI training datasets.
GPTBot
OpenAI
Used by OpenAI to crawl content for training ChatGPT models
ChatGPT-User
OpenAI
ChatGPT browsing mode — fetches pages in real time
ClaudeBot
Anthropic
Anthropic crawler for Claude AI training data
CCBot
Common Crawl
Common Crawl dataset used by many AI companies for training
Google-Extended
Google
Controls content used to train Google Gemini AI models
FacebookBot
Meta
Meta AI crawler for training Llama and other models
Bytespider
ByteDance
ByteDance/TikTok crawler for AI training data
Applebot-Extended
Apple
Controls content used to train Apple Intelligence features
PerplexityBot
Perplexity
Perplexity AI search and answer engine crawler
Amazonbot
Amazon
Amazon crawler for Alexa AI and product training
cohere-ai
Cohere
Cohere AI model training data crawler
Live Preview
```
# robots.txt generated by Hand On Web
# https://www.handonweb.com/tools/robots-txt-generator
# 2026-02-24

User-agent: *
Allow: /
```
How to Use This Robots.txt Generator
- Choose which crawlers to block. Start with the AI Crawlers section to block bots like GPTBot and ClaudeBot from training on your content. Then review the Search Engine Crawlers to ensure Google, Bing, and others can access your site.
- Add custom rules. Use the Custom Rules section to block specific paths like /admin/, /api/, or /private/ from all bots or specific user-agents.
- Add your sitemap URL. Enter the full URL of your XML sitemap to help search engines discover your pages more efficiently.
- Copy or download. Use the live preview on the right to review your robots.txt in real time. When you're happy, click Copy or Download and upload the file to your website's root directory.
- Validate. Switch to the Validator tab to paste and check any existing robots.txt for errors and warnings.
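Putting the steps above together, a generated file might look something like this (the blocked paths and sitemap URL are placeholders, not a recommendation for your site):

```
# Block AI training crawlers
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

# Custom rules for all other bots
User-agent: *
Disallow: /admin/
Disallow: /api/
Disallow: /private/

Sitemap: https://example.com/sitemap.xml
```

Each User-agent group applies only to the crawler it names; the final `*` group covers every crawler not matched above.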
What is Robots.txt and Why It Matters
The robots.txt file is one of the most fundamental yet overlooked parts of technical SEO. It sits at the root of your website and acts as a gatekeeper, telling web crawlers — including Google, Bing, and AI bots — which parts of your site they can and cannot access.
A properly configured robots.txt helps you manage your crawl budget — the number of pages search engines will crawl on your site within a given timeframe. By blocking access to low-value pages like admin panels, search result pages, and staging environments, you ensure crawlers spend their time on the content that matters most for your rankings.
Without a robots.txt, crawlers will attempt to access every URL they discover, which can lead to wasted crawl budget, duplicate content issues, and even sensitive pages being exposed. For any serious website — whether a small business site or a large e-commerce platform — having a well-maintained robots.txt is essential.
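To see how a crawler actually interprets these rules, you can test a robots.txt against specific URLs with Python's standard-library parser. The rules and URLs below are illustrative only:

```python
from urllib import robotparser

# A small robots.txt that blocks one AI crawler entirely and keeps
# all other bots out of a low-value admin area (illustrative rules).
rules = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

parser = robotparser.RobotFileParser()
parser.parse(rules.splitlines())

# GPTBot is blocked everywhere; other agents only from /admin/.
print(parser.can_fetch("GPTBot", "https://example.com/article"))     # False
print(parser.can_fetch("Googlebot", "https://example.com/article"))  # True
print(parser.can_fetch("Googlebot", "https://example.com/admin/x"))  # False
```

This is the same matching logic search engines apply: a crawler uses the most specific User-agent group that names it, and falls back to the `*` group otherwise.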
Should You Block AI Crawlers?
The rise of large language models like ChatGPT, Claude, and Gemini has created a new category of web crawlers specifically designed to collect training data. Unlike traditional search engine crawlers that index your pages to show them in search results, AI crawlers harvest your content to train machine learning models — often without direct attribution or compensation.
Many website owners and publishers are choosing to block AI crawlers to protect their intellectual property. The New York Times, Reddit, and thousands of other publishers now block GPTBot and similar bots. If your website contains original content, research, or creative work, blocking AI training crawlers is a reasonable step to protect your investment.
However, it's worth noting that blocking ChatGPT-User (the browsing agent) will prevent ChatGPT from fetching your pages during live conversations, which could mean missed referral traffic. Similarly, blocking Google-Extended only affects Gemini AI training — it does not impact your Google Search rankings. Our generator makes it easy to selectively block the crawlers you want while keeping others active.
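A selective policy like the one described above might look like this (illustrative only): training crawlers are blocked, while Googlebot and ChatGPT-User are left untouched and so remain allowed.

```
# Block AI training data collection
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

# Blocks Gemini training only; Google Search is unaffected
User-agent: Google-Extended
Disallow: /

# No group names Googlebot or ChatGPT-User, so they fall back
# to this default and referral traffic is preserved.
User-agent: *
Allow: /
```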
Frequently Asked Questions
Need Technical SEO Help?
Our SEO experts can audit your robots.txt, fix crawl issues, and optimise your site for search engines and AI visibility.