What is a robots.txt file?

The robots.txt file is a simple text file that sits in the root directory of your website. It gives instructions to search engine crawlers about which pages to crawl on your website. Valid instructions are based on the robots exclusion standard, discussed later in this article, and are given through the User-Agent and Disallow directives.

Together, User-Agent and Disallow tell search engine crawlers which URLs they are prevented from crawling on your website. A robots.txt file that contains only User-Agent: * Disallow: / is perfectly valid; in this case, the instruction given to crawlers is to prevent the entire site from being crawled.
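Laid out as an actual file, that minimal "block everything" robots.txt looks like this:

```
User-Agent: *
Disallow: /
```

The * wildcard matches every crawler, and Disallow: / matches every URL path on the site.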

How to use robots.txt

Search engine crawlers will check your robots.txt file before crawling the URLs on your website. If there are particular pages or sections of your site you don’t want to be crawled, such as pages that add no value in search engine results, robots.txt can be used to disallow them from the crawl.
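For example, a robots.txt that keeps crawlers out of a few low-value sections might look like this (the paths are hypothetical, chosen purely for illustration):

```
User-Agent: *
Disallow: /admin/
Disallow: /cart/
Disallow: /search
```

Each Disallow rule matches any URL whose path starts with the given value, so Disallow: /search also covers URLs like /search?q=shoes.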

The most useful reason to include and maintain a robots.txt file is to optimise your crawl budget. Crawl budget describes how much time and how many resources a search engine crawler will spend on your site. The problem you are trying to address is crawlers wasting that budget on pointless or unwanted pages of your website.

Robots.txt is a vital component of your SEO efforts. If it is configured poorly, those efforts can suffer real damage, so make sure you have not mistakenly blocked crawlers from the most important parts of your website.
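One quick way to check this is Python’s built-in urllib.robotparser module, which evaluates robots.txt rules much as a crawler would. A minimal sketch, assuming a hypothetical robots.txt and example.com URLs chosen for illustration:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, assumed for this example
robots_txt = """User-Agent: *
Disallow: /admin/
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Important pages should remain crawlable...
for url in ["https://example.com/", "https://example.com/products/widget"]:
    print(url, "->", "allowed" if parser.can_fetch("*", url) else "BLOCKED")

# ...while the low-value sections stay disallowed
print(parser.can_fetch("*", "https://example.com/admin/login"))  # False
```

Running a loop like this over a list of your most important URLs is a cheap safeguard against accidentally disallowing the pages you most want indexed.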

