Robots.txt
Robots.txt is a text file at a website's root that tells search engine crawlers which pages or sections they may access and which are off-limits.
Robots.txt provides crawler directives using a simple syntax of User-agent, Allow, and Disallow rules. It is the first file search engine bots check when visiting a domain, making it a critical control point for managing crawl behavior and budget.
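For illustration, a minimal robots.txt might look like the sketch below; the user agent, paths, and sitemap URL are placeholders rather than recommendations for any particular site.

```
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/
Disallow: /staging/

Sitemap: https://example.com/sitemap.xml
```

Each User-agent group scopes the rules that follow it to the named crawler (or * for all crawlers), and major search engines generally apply the most specific matching Allow or Disallow rule to a given URL.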
Common use cases include blocking admin pages, staging environments, duplicate content paths, and resource-heavy pages that consume crawl budget without providing SEO value. However, robots.txt blocks crawling, not indexing: if other pages link to a disallowed URL, it may still appear in search results.
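As a rough sketch of how such rules can be checked (not a description of GenGrowth's implementation), Python's standard urllib.robotparser can test whether a given URL would be crawlable under a rule set; the rules, user agent, and URLs below are hypothetical.

```python
from urllib import robotparser

# Hypothetical rules for illustration only; a real audit would typically
# fetch the live file from the site's /robots.txt instead.
rules = """\
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/
Disallow: /staging/
"""

parser = robotparser.RobotFileParser()
parser.parse(rules.splitlines())

# Check a few representative paths against the rules.
for path in ("/wp-admin/", "/wp-admin/admin-ajax.php", "/blog/robots-txt-guide"):
    url = "https://example.com" + path
    verdict = "crawlable" if parser.can_fetch("Googlebot", url) else "blocked"
    print(f"{path}: {verdict}")
```

Parsing the rules from a string keeps the check offline; calling set_url() and read() on the parser would fetch and evaluate a site's live robots.txt instead.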
How GenGrowth Helps
GenGrowth audits robots.txt configurations to ensure valuable content is not accidentally blocked and crawl budget is spent efficiently. The platform generates optimized robots.txt rules based on site structure and content priorities.
Related Terms
Canonical URL is the preferred version of a web page specified via a link element that tells search engines which URL to index when duplicate or similar content exists.
XML Sitemap is a file that lists all important URLs on a website to help search engines discover and crawl content efficiently.