The Robots Exclusion Protocol (REP), better known as robots.txt, is a small text file for restricting bots from a website or from certain pages on it. Using a robots.txt file with Disallow directives, we can keep search engine crawlers away from a whole site or from specific folders and files.
The robots.txt file is a plain text file placed at the root of your web server that tells web crawlers such as Googlebot whether or not they may access a file.
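As a quick, hedged illustration of how a crawler consults this file, the sketch below uses Python's standard-library urllib.robotparser; www.example.com and the page path are placeholders, not a real site.

from urllib.robotparser import RobotFileParser

# Point the parser at the site's robots.txt file (placeholder domain).
rp = RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")
rp.read()  # downloads and parses the file

# True if a crawler identifying itself as Googlebot may fetch this URL.
print(rp.can_fetch("Googlebot", "https://www.example.com/some-page"))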
Examples of robots.txt
Robots.txt file URL: https://www.example.com/robots.txt
Blocking all web crawlers from all content
User-agent: *
Disallow: /
Using this syntax in the robots.txt file would tell all web crawlers not to crawl any pages of the website, including the homepage.
Allowing all web crawlers access to all content
User-agent: *
Disallow:
Using this syntax in the robots.txt file tells web crawlers to crawl all pages of the website, including the homepage.
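To make the contrast between these two rule sets concrete, here is a minimal sketch using Python's urllib.robotparser with the exact rules shown above; the test URL is a placeholder.

from urllib.robotparser import RobotFileParser

block_all = ["User-agent: *", "Disallow: /"]
allow_all = ["User-agent: *", "Disallow:"]

for label, rules in (("block all", block_all), ("allow all", allow_all)):
    rp = RobotFileParser()
    rp.parse(rules)
    # can_fetch() reports whether the given user agent may crawl the URL.
    print(label, rp.can_fetch("*", "https://www.example.com/"))

# Prints "block all False" and then "allow all True".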
Blocking a specific web crawler from a specific folder
User-agent: Googlebot
Disallow: /example-subfolder/
This syntax tells only Google’s crawler (Googlebot) not to crawl any URLs whose path begins with /example-subfolder/.
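Again as a sketch, urllib.robotparser can confirm that the rule applies only to Googlebot; /example-subfolder/ is the placeholder path from the example, and Bingbot stands in for any other crawler.

from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse(["User-agent: Googlebot", "Disallow: /example-subfolder/"])

url = "https://www.example.com/example-subfolder/page.html"
print(rp.can_fetch("Googlebot", url))  # False: Googlebot is blocked here
print(rp.can_fetch("Bingbot", url))    # True: no rule targets other crawlers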
Role of the robots.txt file in SEO
- Improper use of the robots.txt file may lead to a drop in your rankings, for example by accidentally blocking pages you want indexed.
- The robots.txt file can control which parts of your site search engine spiders see and how they interact with your web pages.
- Google's documentation on crawling and indexing refers to this file in several places.
- Googlebot checks and obeys this file before crawling, so it plays an important role in how a web page is processed by search engines.
Common mistakes while adding a robots.txt file
- Assuming a crawler with its own user-agent block also obeys the generic * block; a crawler follows only the most specific matching group, so Disallow directives must be repeated there.
- Using one robots.txt file for different subdomains; each subdomain needs its own file at its own root.
- Listing secure or private directories; robots.txt is publicly readable, so this advertises them to attackers.
- Blocking relevant pages that you actually want crawled and indexed.
- Adding a relative path to the sitemap; the Sitemap directive should contain an absolute URL.
- Omitting the slash in a Disallow field (e.g. Disallow: folder instead of Disallow: /folder/).
- Forgetting about case sensitivity; paths in robots.txt are case-sensitive, as the sketch below demonstrates.
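A few of these mistakes can be checked programmatically. The sketch below is illustrative only: the rules in the string are placeholders, and site_maps() requires Python 3.8 or later.

from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Disallow: /private/
Sitemap: https://www.example.com/sitemap.xml
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Paths are case-sensitive: only the lowercase folder is blocked.
print(rp.can_fetch("*", "https://www.example.com/private/page"))  # False
print(rp.can_fetch("*", "https://www.example.com/Private/page"))  # True

# The Sitemap directive carries an absolute URL, never a relative path.
print(rp.site_maps())  # ['https://www.example.com/sitemap.xml']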