Robots.txt Generator
Generate a technically accurate, SEO-optimized robots.txt file for any website — platform-aware, real-time preview, completely free.
Quick Overview – Everything You Need to Know About Robots.txt
A robots.txt file is a simple text file that lives at the root of your website – for example, https://yoursite.com/robots.txt. It tells Google and other search engine bots which pages to crawl and which pages to skip.
In this complete guide, you will learn:
- ✅ What a robots.txt file is and why your website needs one
- ✅ How to use this free generator – step by step
- ✅ Where and how to upload the file (WordPress, Shopify, cPanel)
- ✅ 6 common mistakes that hurt your SEO – and how to avoid them
- ✅ Robots.txt vs Noindex – which one to use and when
- ✅ Platform-specific rules for WordPress, WooCommerce, Shopify, and Elementor
- ✅ How to manage AI bots like GPTBot and ClaudeBot
- ✅ Full FAQ and syntax guide
✍️ Deep Rahul | SEO Professional, 8+ Years of Experience | Last Updated: April 2026
What Is a Robots.txt File?
A robots.txt file is a simple text file that sits at the root of your website. For example: https://yoursite.com/robots.txt. It tells search engine crawlers – also called bots or spiders – which pages they are allowed to visit and which ones to skip.
Think of it like a sign on your front door. When Google’s crawler arrives at your website, the very first thing it reads is your robots.txt file. It follows the instructions inside before doing anything else on your site.
The file follows a standard called the Robots Exclusion Protocol – a format that all major search engines understand. Here is a simple example of what a robots.txt file looks like:
User-agent: *
Allow: /
Disallow: /wp-admin/
Sitemap: https://yoursite.com/sitemap.xml
This tells all bots (the * in User-agent: * means every crawler) to crawl the whole site but stay away from the admin area. It also points them to your sitemap so they can find all your pages quickly.
Simple – but getting it wrong can seriously hurt your SEO.
Why Does Your Website Need a Robots.txt File?
After working with hundreds of websites over 8 years, I have seen two things happen again and again. Either a site has no robots.txt file at all, or it has one with serious mistakes. Both cause real problems. Here is why every website needs a properly written robots.txt file:
1. It Controls Your Crawl Budget
Google does not crawl every page of your website every single day. It has a limited budget of time and resources for each site. If crawlers are free to wander into useless pages – login pages, admin panels, endless filter URLs – they waste that budget and miss your important content.
2. It Stops Duplicate Content Problems
Many websites create duplicate pages automatically. E-commerce sites do this with product filter pages. WordPress sites do it with tag pages, category archives, and search result pages. If crawlers index all of these, Google gets confused about which page is the real one. A good robots.txt keeps those junk pages out of the index.
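For example, a typical WordPress rule set for keeping these auto-generated pages out of the crawl looks like this (illustrative – adjust the paths to match your own site):

```text
User-agent: *
# Internal search result pages
Disallow: /*?s=
# Comment-reply duplicate URLs
Disallow: /*?replytocom=
# WooCommerce sorting and filter parameters
Disallow: /*?orderby=
Disallow: /*?filter_
```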
3. It Protects Private Areas of Your Website
You do not want Google showing your admin dashboard, your staging environment, or your internal search results to the public. A robots.txt file blocks these pages from being crawled in the first place.
4. It Helps Your New Content Get Indexed Faster
When crawlers are not wasting time on irrelevant pages, they spend more time on your real content. This means your new blog posts and product pages get indexed faster – and start ranking sooner.
5. It Helps You Manage AI Bots
In 2025 and 2026, a new challenge has appeared – AI training bots. Tools like GPTBot from OpenAI and Google’s own AI crawlers now visit websites to collect data for training their AI models. If you do not want your content used for AI training, you can block these bots right inside your robots.txt file. This generator has a dedicated section for managing AI bots.
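Blocking AI training crawlers uses ordinary User-agent groups. Here is a sketch using the bot names these vendors have published (GPTBot, ClaudeBot, Google-Extended, CCBot – check each vendor's documentation, as the list changes over time):

```text
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /
```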
How to Use This Free Robots.txt Generator
I designed this tool to be simple for beginners but powerful enough for experienced SEOs. Here is exactly how it works, step by step.
Step 1: Enter Your Website URL
Type your website address – for example, https://yoursite.com. The tool will automatically fill in your sitemap URL so you do not have to do it manually.
Step 2: Choose Your Sitemap Format
Pick from sitemap.xml (the standard format used by most sites), sitemap_index.xml (used by Yoast SEO and large sites with many sitemaps), or enter a custom sitemap URL if yours is different.
Step 3: Choose Your Site Type and Platform
Select what kind of website you have – blog, e-commerce store, news site, portfolio, and more. Then pick your CMS or platform. WordPress, WooCommerce, Shopify, Magento, and 20+ other platforms are supported. The tool then automatically suggests the right rules for your specific setup.
For example: if you choose WordPress + Elementor, the tool will automatically make sure your Elementor CSS files are not blocked – a very common mistake that breaks how Google renders your pages.
Step 4: Review Allow and Disallow Rules
The tool gives you groups of commonly needed rules. You can pick which ones apply to your site. Every rule comes with a label showing its SEO impact – Recommended, Optional, or Not Recommended – with a clear explanation of why.
Step 5: Manage Bot Access
Choose which crawlers can access your site. This includes search engine bots (Googlebot, Bingbot), social media crawlers (Facebook, LinkedIn), SEO tool bots (Ahrefs, Semrush), and AI training bots (GPTBot, ClaudeBot).
Step 6: Advanced Settings (Optional)
You can set a crawl delay, block UTM parameters from being crawled, and enable a crawl budget optimizer. These settings are optional but very useful for larger websites.
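As a rough sketch, these advanced options translate into directives like the following (Crawl-delay is ignored by Google but honored by Bing and some other crawlers; the UTM patterns are illustrative):

```text
User-agent: *
Crawl-delay: 10
# Keep tracking-parameter URLs out of the crawl
Disallow: /*?utm_
Disallow: /*&utm_
```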
Step 7: Copy Your File
The live preview on the left side updates in real time as you make your choices. When you are happy with the result, click Copy and paste the file directly into your website.
Where to Upload Your Robots.txt File
Once you have generated your robots.txt file, you need to place it in the correct location. This step is very important – if the file is in the wrong place, search engines will not find it.
✅ Correct location: https://yoursite.com/robots.txt
❌ Wrong location: https://yoursite.com/folder/robots.txt
WordPress
Install the free plugin Rank Math or Yoast SEO – both let you edit your robots.txt file directly from the WordPress dashboard without needing FTP. In Rank Math, go to Rank Math → General Settings → Edit robots.txt; in Yoast, go to Yoast SEO → Tools → File editor. You can also access your site’s root folder via cPanel File Manager or FTP and upload the file manually.
Shopify
Go to Shopify Admin → Online Store → Themes → Edit code. Open the robots.txt.liquid template – if it does not exist yet, create it via Add a new template → robots – and modify it from there. Note that Shopify generates the default robots.txt automatically and controls some parts of it.
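Shopify’s documented customization pattern keeps the platform’s default rule groups and lets you append your own lines inside robots.txt.liquid. A minimal sketch based on that pattern (the /internal-search/ path is a made-up example):

```text
{% for group in robots.default.groups %}
  {{ group.user_agent }}
  {% for rule in group.rules %}
    {{ rule }}
  {% endfor %}
  {% if group.user_agent.value == '*' %}
    Disallow: /internal-search/
  {% endif %}
  {% if group.sitemap != blank %}
    {{ group.sitemap }}
  {% endif %}
{% endfor %}
```

Check Shopify’s current documentation before using this – the available Liquid objects and the parts Shopify locks down can change.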
General Hosting (cPanel)
Log in to cPanel → File Manager → Navigate to the public_html folder → Upload your robots.txt file there.
After uploading, verify your file is live by visiting https://yoursite.com/robots.txt in your browser. You should see the plain text content of your file.
How to Test Your Robots.txt File After Uploading
Uploading is not the last step. You should always test your robots.txt file to make sure it works correctly and has not accidentally blocked any important pages.
Method 1: Google Search Console
This is the most reliable method. Go to Google Search Console → Settings → robots.txt report. Google shows the fetched file along with any errors or warnings it finds. To check whether a specific URL is blocked, run it through the URL Inspection tool.
Method 2: Direct URL Check
Visit https://yoursite.com/robots.txt in your browser. You should see the plain text content of your file. If you see a 404 error, the file is missing or in the wrong folder.
Method 3: Third-Party Validators
Use a standalone robots.txt validator – the Test button inside this generator links to one. Paste your file and check specific URLs against the rules.
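If you prefer to script your checks, Python’s built-in urllib.robotparser can run URLs against a rule set. One caveat: it implements the original exclusion protocol and evaluates rules in file order, and it does not support Google-style * wildcards – so keep Allow exceptions above the broader Disallow, and verify wildcard rules with Google’s own tooling instead. A minimal sketch:

```python
from urllib import robotparser

# A small rule set: block /wp-admin/ but allow the AJAX endpoint.
rules = """
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

# The Allow exception wins for admin-ajax.php
print(rp.can_fetch("Googlebot", "https://yoursite.com/wp-admin/admin-ajax.php"))
# The rest of /wp-admin/ stays blocked
print(rp.can_fetch("Googlebot", "https://yoursite.com/wp-admin/options.php"))
# Anything not mentioned is crawlable by default
print(rp.can_fetch("Googlebot", "https://yoursite.com/blog/my-post"))
```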
6 Common Robots.txt Mistakes I See All the Time
In 8 years of SEO consulting, I have audited hundreds of robots.txt files. Here are the mistakes I see most often – and what to do instead.
Mistake 1: Blocking CSS and JavaScript Files
This was common advice years ago. People blocked /wp-content/ to save crawl budget. The problem is that Google needs your CSS and JavaScript files to properly render your pages. If it cannot access these files, it will not understand how your pages look – and that can hurt your mobile-friendly score and your rankings. This generator never blocks CSS or JavaScript by default, and it warns you if you try to add these rules.
Mistake 2: Using the Wrong Wildcard Pattern
Many people write Disallow: /?s= to block WordPress search result pages. Without the wildcard, that rule only matches search URLs at the site root, such as /?s=term. The correct rule is Disallow: /*?s= – the * matches any path, so the rule catches the ?s= parameter wherever it appears on your site.
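In Google’s syntax, * matches any sequence of characters and $ anchors the end of a URL. Two illustrative patterns:

```text
User-agent: *
# Matches ?s= anywhere on the site
Disallow: /*?s=
# Matches only URLs that end in .pdf
Disallow: /*.pdf$
```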
Mistake 3: Accidentally Blocking the Entire Website
This happens more often than you would think. A developer sets Disallow: / during the development phase and forgets to remove it before the site goes live. The result: Google cannot crawl anything at all. Always check your live robots.txt file right after a site launch.
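This is the two-line file to watch out for – on a live site it shuts out every compliant crawler:

```text
User-agent: *
Disallow: /
```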
Mistake 4: Not Including Your Sitemap
Your sitemap URL should always be included inside your robots.txt file. This helps crawlers quickly find all your important pages. This generator adds your sitemap link automatically – you do not need to remember it.
Mistake 5: Forgetting About AI Bots
Many website owners do not realize that AI company crawlers are visiting their site and collecting their content for AI model training. If you want to control this, add specific disallow rules for GPTBot, ClaudeBot, Google-Extended, and other AI crawlers. This generator has a dedicated AI Bot Management section to help you do this easily.
Mistake 6: Blocking the /wp-includes/ Folder
Some older SEO guides still recommend blocking this folder. Do not do it. WordPress uses /wp-includes/ for core files that affect how your site is built and displayed. Blocking this folder can confuse Google about your site’s structure and cause rendering issues.
Robots.txt Syntax Guide – What Every Line Means
If you want to understand what each part of a robots.txt file does, here is a simple and clear breakdown of every directive.
User-agent
This line tells crawlers which bot the rules below apply to. Use * to apply rules to all bots. Use a specific name like Googlebot to target only Google’s crawler.
User-agent: Googlebot – applies rules to Google’s crawler only
Allow
This tells a crawler it is allowed to visit a specific page or folder. It is usually used to make an exception inside a broader Disallow rule.
Allow: /wp-admin/admin-ajax.php – allows this one file even if /wp-admin/ is blocked
Disallow
This tells a crawler not to visit a specific page or folder.
Disallow: /*?s= – blocks WordPress search result pages
Disallow: /cart/ – blocks the WooCommerce cart page
Sitemap
This points crawlers to your XML sitemap. You can include multiple Sitemap lines if your site has more than one sitemap.
Sitemap: https://yoursite.com/news-sitemap.xml
Crawl-delay
This tells a crawler to wait a set number of seconds between each page request. It is useful if your server is slow and struggles with too many requests at once. Note: Google ignores this directive, but Bing and other search engines respect it.
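Putting these directives together, here is a small annotated file in the format described above (the URLs are placeholders):

```text
# Rules for every crawler
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Disallow: /*?s=

# Bing and some others honor this; Google ignores it
User-agent: Bingbot
Crawl-delay: 5

# Sitemap lines apply to the whole file, not one group
Sitemap: https://yoursite.com/sitemap.xml
Sitemap: https://yoursite.com/news-sitemap.xml
```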
Platform-Specific Robots.txt Rules
Different platforms have different URLs that should or should not be crawled. Here is what I recommend for the most popular platforms:
| Platform | Block These | Always Keep Open |
|---|---|---|
| WordPress Blog | /wp-admin/, /*?s=, /*?replytocom= | /wp-admin/admin-ajax.php, /wp-content/ |
| WordPress + WooCommerce | /cart/, /checkout/, /my-account/, /*?add-to-cart=, /shop/*?orderby= | Product pages, /wp-content/ |
| Shopify | Session URLs (Shopify handles these automatically) | Product pages, collection pages |
| WordPress + Elementor | Admin areas only | /wp-content/uploads/elementor/, /wp-content/plugins/elementor/ |
| News / Magazine Sites | Admin, author RSS feeds, sorting and filter URLs | All published article URLs |
⚠️ Important for Elementor Users: Never block /wp-content/uploads/elementor/ or /wp-content/plugins/elementor/. Elementor stores your page styles in these folders. Blocking them causes Google to render your pages incorrectly, which can hurt your rankings and mobile-friendly score.
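For example, the WooCommerce row in the table above translates to roughly this file (a sketch – verify the paths against your own store’s URL structure):

```text
User-agent: *
Disallow: /cart/
Disallow: /checkout/
Disallow: /my-account/
Disallow: /*?add-to-cart=
Disallow: /shop/*?orderby=
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/

Sitemap: https://yoursite.com/sitemap_index.xml
```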
Robots.txt vs Noindex – What Is the Difference?
This is one of the most common questions I get from website owners. Both control whether Google shows a page in search results – but they work in very different ways.
| | Robots.txt | Noindex Tag |
|---|---|---|
| How it works | Tells Google not to visit the page at all | Google visits the page, reads the tag, then removes it from search results |
| The catch | Google can still index the page if other sites link to it | Google must be able to crawl the page to read the tag |
| Best used for | Admin areas, staging sites, technical pages | Thank-you pages, thin content, faceted navigation pages |
⚠️ Important rule: Never block a page with robots.txt AND add a noindex tag to it at the same time. If Google cannot crawl the page, it cannot read the noindex tag – so the tag does nothing.
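For reference, a noindex directive lives on the page itself, as a meta tag in the HTML head:

```html
<!-- Place inside the page's <head> -->
<meta name="robots" content="noindex">
```

The same directive can also be sent as an HTTP response header (X-Robots-Tag: noindex), which is useful for non-HTML files such as PDFs.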
Robots.txt and Crawl Budget – What Every Site Owner Needs to Know
Crawl budget is the number of pages Googlebot will crawl on your website within a given time period. Google decides this based on your site’s authority, server speed, and how often your content changes.
For small websites with fewer than 1,000 pages, crawl budget is usually not a concern. Google will crawl everything it finds.
For larger sites – e-commerce stores, news sites, or any site with thousands of pages – crawl budget becomes very important. If Google spends its crawl budget on your filter pages, session URLs, and duplicate content, it does not have time left to crawl your new products or blog posts.
A well-written robots.txt file helps by:
- Blocking low-value pages that waste crawl budget
- Pointing crawlers directly to your sitemap
- Stopping crawlers from following endless URL parameter combinations
- Preventing crawlers from getting stuck in infinite loops – like calendar pages or infinite scroll
I have seen cases where fixing a robots.txt file alone doubled a site’s crawl rate within one month – leading directly to faster indexing of new content and better search rankings.
Frequently Asked Questions About Robots.txt
About This Tool – Why I Built the Robots.txt Generator
My name is Deep Rahul. I am an SEO consultant with over 8 years of hands-on experience helping businesses rank on Google.
I built this generator because I was tired of seeing the same mistakes over and over again. Most free robots.txt tools online generate basic files that are not specific to any platform. They do not warn you about common mistakes. They do not understand the difference between a WordPress blog and a WooCommerce store.
This generator is different. Here is what makes it stand out:
- 🎯 Platform-aware rules – Automatically suggested rules for 20+ platforms including WordPress, Shopify, Magento, and more
- 🤖 Bot management – Control access separately for search engine bots, social crawlers, SEO tools, and AI training bots
- 👁️ Live preview with validation – See your robots.txt update in real time with a health score that checks for errors
- ✅ SEO-accurate rules – No outdated or incorrect advice, unlike many other generators out there
- 🆓 Free forever – No account required, no usage limits, no paywalls
You can verify the accuracy of this tool by testing your generated file in Google Search Console’s robots.txt tester.
