The Robots Exclusion Protocol (REP), better known as robots.txt, is a small text file for restricting bots from a website or from certain pages on it. Using a robots.txt file with Disallow directives, we can keep search engine crawlers away from a whole site or from specific folders and files.
The robots.txt file is a plain text file placed at the root of your web server that tells web crawlers such as Googlebot whether or not they may access a file.
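As a quick, hedged illustration of how a crawler consults this file, the sketch below uses Python's standard-library urllib.robotparser; www.example.com and the page path are placeholders, not a real site.

from urllib.robotparser import RobotFileParser

# Point the parser at the site's robots.txt file (placeholder domain).
rp = RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")
rp.read()  # downloads and parses the file

# True if a crawler identifying itself as Googlebot may fetch this URL.
print(rp.can_fetch("Googlebot", "https://www.example.com/some-page"))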
Examples of robots.txt
Robots.txt file URL: https://www.example.com/robots.txt
Blocking all web crawlers from all content
User-agent: *
Disallow: /
Using this syntax in the robots.txt file would tell all web crawlers not to crawl any pages of the website, including the homepage.
Allowing all web crawlers access to all content
User-agent: *
Disallow:
Using this syntax in the robots.txt file tells web crawlers to crawl all pages of the website, including the homepage.
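To make the contrast between these two rule sets concrete, here is a minimal sketch using Python's urllib.robotparser with the exact rules shown above; the test URL is a placeholder.

from urllib.robotparser import RobotFileParser

block_all = ["User-agent: *", "Disallow: /"]
allow_all = ["User-agent: *", "Disallow:"]

for label, rules in (("block all", block_all), ("allow all", allow_all)):
    rp = RobotFileParser()
    rp.parse(rules)
    # can_fetch() reports whether the given user agent may crawl the URL.
    print(label, rp.can_fetch("*", "https://www.example.com/"))

# Prints "block all False" and then "allow all True".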
Blocking a specific web crawler from a specific folder
User-agent: Googlebot
Disallow: /example-subfolder/
This syntax tells only Google’s crawler (Googlebot) not to crawl any URLs whose path begins with /example-subfolder/.
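Again as a sketch, urllib.robotparser can confirm that the rule applies only to Googlebot; /example-subfolder/ is the placeholder path from the example, and Bingbot stands in for any other crawler.

from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse(["User-agent: Googlebot", "Disallow: /example-subfolder/"])

url = "https://www.example.com/example-subfolder/page.html"
print(rp.can_fetch("Googlebot", url))  # False: Googlebot is blocked here
print(rp.can_fetch("Bingbot", url))    # True: no rule targets other crawlers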
Role of the robots.txt file in SEO
- Improper use of the robots.txt file may lead to a drop in your rankings, for example by accidentally blocking pages you want indexed.
- The robots.txt file can control which parts of your site search engine spiders see and how they interact with your web pages.
- Google's documentation on crawling and indexing refers to this file in several places.
- Googlebot checks and obeys this file before crawling, so it plays an important role in how a web page is processed by search engines.
Common mistakes while adding a robots.txt file
- Assuming a crawler with its own user-agent block also obeys the generic * block; a crawler follows only the most specific matching group, so Disallow directives must be repeated there.
- Using one robots.txt file for different subdomains; each subdomain needs its own file at its own root.
- Listing secure or private directories; robots.txt is publicly readable, so this advertises them to attackers.
- Blocking relevant pages that you actually want crawled and indexed.
- Adding a relative path to the sitemap; the Sitemap directive should contain an absolute URL.
- Omitting the slash in a Disallow field (e.g. Disallow: folder instead of Disallow: /folder/).
- Forgetting about case sensitivity; paths in robots.txt are case-sensitive, as the sketch below demonstrates.
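A few of these mistakes can be checked programmatically. The sketch below is illustrative only: the rules in the string are placeholders, and site_maps() requires Python 3.8 or later.

from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Disallow: /private/
Sitemap: https://www.example.com/sitemap.xml
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Paths are case-sensitive: only the lowercase folder is blocked.
print(rp.can_fetch("*", "https://www.example.com/private/page"))  # False
print(rp.can_fetch("*", "https://www.example.com/Private/page"))  # True

# The Sitemap directive carries an absolute URL, never a relative path.
print(rp.site_maps())  # ['https://www.example.com/sitemap.xml']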