Robots.txt Generator

Build a valid robots.txt using visual controls. User-agent presets, Disallow/Allow lists, Crawl-delay and Sitemap support. Live preview + validation.

Last updated June 2025 | 4 min read | Works in browser | Privacy first

Pick a preset, tweak the rules, ship a valid robots.txt.

Guide

What robots.txt actually does

robots.txt is a plain-text file at the root of your host (e.g. https://example.com/robots.txt) that tells crawlers which paths they may or may not fetch. It is a request, not a security fence. Anything sensitive belongs behind authentication, not a Disallow rule.

A valid robots.txt is organized into groups. Each group starts with one or more User-agent: lines and is followed by Allow: / Disallow: rules that apply to that agent.
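To make the group structure concrete, here is a minimal Python sketch that splits a robots.txt file into groups. The names Group and parse_groups are hypothetical helpers made up for this illustration, not part of the tool or of any library, and the parser deliberately ignores everything except User-agent, Allow and Disallow lines.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    agents: list[str] = field(default_factory=list)
    rules: list[tuple[str, str]] = field(default_factory=list)  # (directive, path)


def parse_groups(text: str) -> list[Group]:
    """Split robots.txt text into groups of User-agent lines plus their rules."""
    groups: list[Group] = []
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        name, value = (part.strip() for part in line.split(":", 1))
        name = name.lower()  # directive names are case-insensitive
        if name == "user-agent":
            # A User-agent line that follows rules starts a new group;
            # consecutive User-agent lines share one group.
            if current is None or current.rules:
                current = Group()
                groups.append(current)
            current.agents.append(value)
        elif name in ("allow", "disallow") and current is not None:
            current.rules.append((name, value))
    return groups


sample = """\
User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""

for group in parse_groups(sample):
    print(group.agents, group.rules)
```

The sample yields two groups: one shared by GPTBot and ClaudeBot, and one for every other bot.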

Where the tool runs

Everything happens in your browser. Presets, rules and previews never touch a server.

Anatomy of a robots.txt

# Comment
User-agent: *
Disallow: /admin
Allow: /admin/public
Crawl-delay: 10

Sitemap: https://example.com/sitemap.xml
  • User-agent: * targets every bot.
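  • Disallow: /admin blocks every path that starts with /admin, which includes /admin/users but also /administrator. Use /admin/ to target only the directory.
  • Allow: /admin/public reopens a sub-path. When several rules match, the longest (most specific) one wins and Allow wins a tie; this is how Google and RFC 9309 resolve conflicts.
  • Crawl-delay: 10 is a non-standard directive. Bing honors it, Google ignores it, and support among other crawlers varies.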
  • Sitemap: lines are absolute URLs and can appear anywhere in the file.
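The interaction between Disallow: /admin and Allow: /admin/public can be sketched in a few lines of Python. This is an illustration of longest-match evaluation for plain prefix rules only (no wildcards); is_allowed is a hypothetical helper, and real crawlers may differ in details.

```python
def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Longest-match evaluation for plain prefix rules (no wildcards)."""
    best_len = -1
    allowed = True  # no matching rule means the path may be fetched
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # an empty pattern matches nothing
        longer = len(prefix) > best_len
        tie_goes_to_allow = len(prefix) == best_len and directive == "allow"
        if longer or tie_goes_to_allow:
            best_len = len(prefix)
            allowed = directive == "allow"
    return allowed


rules = [("disallow", "/admin"), ("allow", "/admin/public")]
print(is_allowed(rules, "/admin/settings"))     # False
print(is_allowed(rules, "/admin/public/help"))  # True
print(is_allowed(rules, "/administrator"))      # False
print(is_allowed(rules, "/blog"))               # True
```

Note that /administrator is blocked too: rules are prefix matches, not directory matches.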

The five presets — when to use each

Preset               | When to use
Allow everything     | Default for public sites.
Block everything     | Staging and preview environments only.
WordPress defaults   | Hides /wp-admin and /wp-includes while keeping AJAX open.
Shopify defaults     | Blocks /admin, /cart, /checkout and duplicate-content tag URLs.
Block common AI bots | Blocks GPTBot, ClaudeBot, PerplexityBot, CCBot and Google-Extended.

Blocking AI crawlers in 2025

The major AI companies publish user-agent tokens for their crawlers and state that they honor robots.txt. Use a dedicated group for each agent:

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /

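Google-Extended is a control token rather than a separate crawler: it tells Google whether content it has crawled may be used for its generative AI models. Blocking it does not affect Search, because indexing is handled by Googlebot, which follows its own group.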
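If you maintain the list of blocked agents somewhere else (a config file, for instance), the groups above can be generated rather than typed by hand. A minimal sketch follows; build_blocklist and AI_AGENTS are hypothetical names chosen for this example, and the agent list simply mirrors the tokens named in the preset.

```python
AI_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "CCBot", "Google-Extended"]


def build_blocklist(agents: list[str]) -> str:
    """Return robots.txt text with one Disallow-all group per agent,
    followed by a catch-all group that allows everything."""
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in agents]
    groups.append("User-agent: *\nAllow: /")
    return "\n\n".join(groups) + "\n"


print(build_blocklist(AI_AGENTS))
```

Keeping one group per agent makes it easy to lift the block for a single crawler later without touching the others.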

Five mistakes we see the most

  1. Disallowing your own assets. Blocking /wp-content/ or /static/ prevents Google from rendering your pages properly.
  2. A single trailing /*. Disallow: /* matches every path, so it blocks the entire site exactly like Disallow: /.
  3. Using robots.txt for privacy. Search engines respect it; scrapers do not. Anything sensitive belongs behind auth.
  4. Forgetting the sitemap line. Always list your sitemap here. Crawlers read robots.txt first, so they find the sitemap even if you never submitted or linked its URL.
  5. Multiple contradictory groups. Merge duplicate User-agent blocks into one. A crawler follows only the most specific group that names it, so a bot with its own group ignores the * group, and not every parser combines repeated groups for the same agent.
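Mistake 5 is easier to see in code. The sketch below shows one way a crawler could choose its rules: groups that name the crawler are merged, and the * groups apply only when nothing names it. rules_for is a hypothetical helper, and matching on the exact token is a simplification of how real crawlers compare their product token.

```python
def rules_for(groups: list[tuple[list[str], list[str]]], crawler: str) -> list[str]:
    """Pick the rule lines a crawler would follow.

    groups is a list of (agent tokens, rule lines). Groups naming the crawler
    are merged; the "*" groups are used only when no group names it.
    """
    token = crawler.lower()  # agent matching is case-insensitive
    named: list[str] = []
    fallback: list[str] = []
    matched = False
    for agents, rules in groups:
        lowered = [agent.lower() for agent in agents]
        if token in lowered:
            matched = True
            named.extend(rules)
        elif "*" in lowered:
            fallback.extend(rules)
    return named if matched else fallback


groups = [
    (["*"], ["Disallow: /private"]),
    (["GPTBot"], ["Disallow: /drafts"]),
    (["gptbot"], ["Allow: /press"]),
]
print(rules_for(groups, "GPTBot"))   # ['Disallow: /drafts', 'Allow: /press']
print(rules_for(groups, "Bingbot"))  # ['Disallow: /private']
```

In this example GPTBot never sees Disallow: /private, because it has its own group. Anything that should apply to every bot has to be repeated in each named group.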

Common patterns

Allow only the homepage

User-agent: *
Allow: /$
Disallow: /

Block a single query parameter

User-agent: *
# The second rule covers utm_source when it is not the first parameter.
Disallow: /*?utm_source=
Disallow: /*&utm_source=

Different rules for image bots

User-agent: Googlebot-Image
Allow: /images/
Disallow: /

User-agent: *
Allow: /
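The * and $ characters in the patterns above follow simple rules: * matches any run of characters, and a trailing $ anchors the end of the URL. One way to test a pattern before shipping it is to translate it into a regular expression. This is a sketch under those two assumptions; pattern_to_regex and matches are hypothetical helpers, not functions from any robots.txt library.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    "*" matches any run of characters; a trailing "$" anchors the end.
    Everything else is literal, and matching starts at the path's first character.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def matches(pattern: str, path: str) -> bool:
    # re.match anchors at the start, mirroring robots.txt prefix matching.
    return pattern_to_regex(pattern).match(path) is not None


print(matches("/$", "/"))                                          # True
print(matches("/$", "/about"))                                     # False
print(matches("/*?utm_source=", "/page?utm_source=news"))          # True
print(matches("/*?utm_source=", "/page?id=1&utm_source=news"))     # False
print(matches("/*&utm_source=", "/page?id=1&utm_source=news"))     # True
```

The fourth line shows a common gap: /*?utm_source= only matches when utm_source is the first query parameter, so a second pattern with & is needed to catch it elsewhere in the query string.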

After you copy the file

  1. Upload it to the root of your host (Cloudflare, Vercel, Netlify — anywhere).
  2. Check it with the robots.txt report in Google Search Console, which replaced the retired robots.txt Tester.
  3. Resubmit your sitemap in Search Console and use URL Inspection on any newly blocked pages to confirm the new rules are applied.

Robots.txt is not a security control

Anything you want private must be behind authentication. Disallow: /secret tells honest bots to skip that URL — but the URL itself is public.

FAQ

Where do I put robots.txt? At the root of the host, for example https://example.com/robots.txt. A file in a subfolder is ignored, and subdomains do not inherit the parent's file, so every host needs its own.
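Because the file is always tied to a single host, you can compute which robots.txt governs any page URL. A small sketch using Python's standard urllib.parse module; robots_url is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL that governs page_url (same scheme, host, port)."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://example.com/blog/post?id=7"))
# https://example.com/robots.txt
print(robots_url("https://shop.example.com/cart"))
# https://shop.example.com/robots.txt
```

The two results differ, which is the point: shop.example.com is a separate host and needs its own file.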

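Does Google respect Crawl-delay? No. Googlebot ignores the directive and sets its own crawl rate from how your server responds. The crawl-rate limiter in Search Console was retired in January 2024; to slow Googlebot temporarily, Google recommends returning 500, 503 or 429 responses.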

Can I have multiple sitemap lines? Yes. Each Sitemap: line is treated independently and can point to a sitemap index or individual sitemap.

What about case sensitivity? Rules are case-sensitive for paths but case-insensitive for User-agent and directive names.

Steps

How to use

  1. Pick a preset or start from scratch.
  2. Add User-agent groups with Disallow / Allow paths.
  3. Add sitemap URLs and copy or download robots.txt.

Why you’ll love it

Benefits

Free forever

No trials, no paywalls, no ads inside the tool.

Zero friction

No sign up, no email, no cookies you didn’t ask for.

Fast by design

Interactions render in under 200ms on modern devices.

In practice

Examples

  • Block every AI crawler while keeping Google.
  • A WordPress-safe default that hides /wp-admin.

Tips

Pro tips

  • Robots.txt is a suggestion — anything sensitive belongs behind auth, not a disallow rule.
  • You can add several Sitemap: lines, one absolute URL per line.

Watch out

Common mistakes to avoid

  • Blocking your own CSS/JS folders (Google cannot render the page properly, which can hurt indexing).
  • Disallowing /* (blocks everything, silently).

Made with care by ToolMint