🤖

Robots.txt & Sitemap Validator

Check your robots.txt rules and sitemap structure to ensure search engines can crawl your site correctly.

What this tool checks

robots.txt — fetches your site's robots.txt and parses all user-agent rules, allow/disallow directives, and declared sitemaps.

Sitemap — validates a sitemap XML URL, counts URLs, and lists sub-sitemaps if it's a sitemap index.
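As a rough illustration of the sitemap check described above, the sketch below counts URLs in a sitemap and lists sub-sitemaps when the file is a sitemap index. It is not this tool's actual implementation; the function name `summarize_sitemap` and the sample XML are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def summarize_sitemap(xml_text):
    """Return the sitemap type, an entry count, and any sub-sitemap URLs."""
    root = ET.fromstring(xml_text)
    tag = root.tag.rsplit("}", 1)[-1]  # drop the "{namespace}" prefix
    if tag == "sitemapindex":
        locs = [(el.text or "").strip()
                for el in root.findall("sm:sitemap/sm:loc", NS)]
        return {"type": "index", "count": len(locs), "sub_sitemaps": locs}
    if tag == "urlset":
        locs = root.findall("sm:url/sm:loc", NS)
        return {"type": "urlset", "count": len(locs), "sub_sitemaps": []}
    raise ValueError(f"not a sitemap: unexpected root element <{tag}>")

# Hypothetical sample data, for illustration only.
SAMPLE_INDEX = """\
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-posts.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-pages.xml</loc></sitemap>
</sitemapindex>
"""

SAMPLE_URLSET = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
  <url><loc>https://example.com/contact</loc></url>
</urlset>
"""

index_summary = summarize_sitemap(SAMPLE_INDEX)
urlset_summary = summarize_sitemap(SAMPLE_URLSET)
print(index_summary["type"], index_summary["count"])
print(urlset_summary["type"], urlset_summary["count"])
```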

Robots.txt and XML Sitemap Validator Online

Make sure search engine crawlers such as Googlebot and Bingbot can parse your crawl rules without errors. Check sitemap links, spot rules that block crawling, and view the parsed metadata in a clean layout.

1. Enter Sitemap / Robots Link

Paste the full URL of your `robots.txt` or `sitemap.xml` file. The validator fetches the raw file and parses it.

2. Inspect Index Blocks

See whether each user-agent group follows standard robots.txt syntax, and browse nested sub-sitemaps in a readable list.
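Inspecting user-agent rules comes down to grouping allow/disallow directives under the user-agents they apply to and collecting any declared sitemaps. The following is a minimal sketch of that grouping, not the validator's own parser; `parse_robots` and the sample file are hypothetical, and the sketch ignores less common fields such as `Crawl-delay`.

```python
from collections import defaultdict

def parse_robots(text):
    """Group allow/disallow rules by user-agent and collect Sitemap lines."""
    groups = defaultdict(list)  # user-agent -> [(directive, path), ...]
    sitemaps = []
    current_agents = []         # agents that the following rules apply to
    last_was_agent = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            # Consecutive User-agent lines share one group of rules.
            if not last_was_agent:
                current_agents = []
            current_agents.append(value)
            groups[value]  # register the agent even if it has no rules
            last_was_agent = True
        elif field in ("allow", "disallow"):
            for agent in current_agents:
                groups[agent].append((field, value))
            last_was_agent = False
        elif field == "sitemap":
            sitemaps.append(value)
    return dict(groups), sitemaps

# Hypothetical robots.txt content, for illustration only.
SAMPLE = """\
User-agent: *
Disallow: /admin   # keep crawlers out of the dashboard
Allow: /admin/public

User-agent: GPTBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
"""

groups, sitemaps = parse_robots(SAMPLE)
for agent, rules in groups.items():
    print(agent, rules)
print(sitemaps)
```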

Why Use robots.txt and Sitemap Files?

Robots.txt: Tells crawlers which sections of your site not to crawl (like `/admin` dashboards or `/tmp` directories), which helps conserve crawl budget.
XML Sitemaps: Act as a roadmap listing the pages you want indexed, helping Google and other search engines discover new URLs quickly.
User-Agents: Let you set rules for specific bots (e.g. blocking AI crawlers like GPTBot).
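To make the points above concrete, here is a small sketch using Python's standard `urllib.robotparser` module to test whether a given bot may fetch a path. The rules, the `example.com` host, and the helper `allowed` are assumptions for illustration; real crawlers may interpret edge cases differently from this module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block GPTBot everywhere, and keep all other
# crawlers out of /admin and /tmp.
RULES = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin
Disallow: /tmp
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

def allowed(agent, path):
    """Return True if `agent` may fetch `path` under RULES."""
    return parser.can_fetch(agent, "https://example.com" + path)

print(allowed("GPTBot", "/blog/post-1"))      # False: GPTBot is blocked site-wide
print(allowed("Googlebot", "/blog/post-1"))   # True: no rule matches this path
print(allowed("Googlebot", "/admin/users"))   # False: falls under Disallow: /admin
```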

Frequently Asked Questions

Where should my robots.txt file be located?

It must be placed at the root of the host it applies to (e.g. `domain.com/robots.txt`). Crawlers only request the file from that location, so a copy stored in a subfolder is ignored. Each subdomain needs its own robots.txt.
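Because the file always lives at the root of the host, its location can be derived from any page URL on that host. A minimal sketch, assuming a hypothetical helper named `robots_url`:

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Build the robots.txt URL for the host that serves `page_url`."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace the path and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("https://shop.example.com/products/item?id=3"))
# https://shop.example.com/robots.txt
```

Note that `shop.example.com` and `example.com` resolve to different robots.txt URLs, which is why each subdomain needs its own file.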

What is a Sitemap Index?

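A sitemap file that lists other sitemaps instead of pages. A single sitemap may contain at most 50,000 URLs and be no larger than 50 MB uncompressed, so larger sites split their URLs across several sitemaps and reference them all from one index.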
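The splitting can be sketched as follows: divide the URL list into chunks of at most 50,000, write each chunk to its own sitemap, and list those files in an index. The helper names and the `sitemap-N.xml` file naming are assumptions for illustration, and the sketch does not check the 50 MB size limit.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # per-file limit from the sitemaps.org protocol

def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into lists of at most `size` entries."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]

def build_sitemap_index(sitemap_urls):
    """Render a sitemap index document that references each sitemap URL."""
    entries = "".join(
        f"  <sitemap><loc>{escape(url)}</loc></sitemap>\n"
        for url in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}"
        "</sitemapindex>\n"
    )

# Hypothetical site with 120,000 pages.
urls = [f"https://example.com/page/{i}" for i in range(120_000)]
chunks = chunk_urls(urls)
sitemap_files = [
    f"https://example.com/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)
]
index_xml = build_sitemap_index(sitemap_files)
print(len(chunks), [len(c) for c in chunks])
```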