robots.txt: contents of the file
Aug 21

The robots.txt file is made up of User-agent: instructions, Disallow: instructions, comments and blank lines.

Instructions intended for all robots

The Disallow: instructions that follow the line User-agent: * apply to all robots, except those for which specific instructions are given. The file can contain only one User-agent: * line. The Disallow: instructions ask the robots not to visit addresses starting with the text given.




Examples

They take the following form:
User-agent: *
Disallow: /en-test/version_12/
Disallow: /brouillons/un-fichier.php
In this example, we ask all robots not to visit addresses starting with /en-test/version_12/ or /brouillons/un-fichier.php. Addresses such as /en-test/version_12/index.html or /en-test/version_12/ma_photo.jpg should therefore not be visited. The same applies to addresses such as /brouillons/un-fichier.php or /brouillons/un-fichier.php?pays=maroc&ville=agadir.
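
The prefix rule described above can be sketched in a few lines of Python. This is only an illustration of "starts with" matching, not how any particular robot is implemented; the names DISALLOWED_PREFIXES and is_disallowed are made up for this example, and /en-test/version_11/index.html is an extra address added for contrast.

```python
# Hypothetical helper illustrating prefix matching; not part of any library.
DISALLOWED_PREFIXES = [
    "/en-test/version_12/",
    "/brouillons/un-fichier.php",
]


def is_disallowed(address, prefixes):
    """Return True when the address starts with one of the disallowed prefixes."""
    return any(address.startswith(prefix) for prefix in prefixes)


if __name__ == "__main__":
    addresses = [
        "/en-test/version_12/index.html",
        "/en-test/version_12/ma_photo.jpg",
        "/brouillons/un-fichier.php?pays=maroc&ville=agadir",
        "/en-test/version_11/index.html",  # assumed address, not covered by any rule
    ]
    for address in addresses:
        verdict = "blocked" if is_disallowed(address, DISALLOWED_PREFIXES) else "allowed"
        print(address, "->", verdict)
```

The query string needs no special handling here: the address still starts with /brouillons/un-fichier.php, so the rule applies.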

To prohibit all access to the site, use this robots.txt:

User-agent: *
Disallow: /
And one last example:
User-agent: *
Disallow: /faq-robots
This instruction covers any address that starts with /faq-robots. It blocks access to the /faq-robots.html file as well as to the contents of a /faq-robots/ directory.
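
One way to check these two files is with urllib.robotparser from Python's standard library. The sketch below is a quick test under a few assumptions: "AnyBot" is a made-up robot name, parse_rules is a helper written for this example, and /faq.html is an extra address added for contrast. Real robots may interpret rules somewhat differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def parse_rules(text):
    """Build a parser from robots.txt content given as a string (helper for this example)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# The /faq-robots example: blocks every address starting with that text.
faq_rules = parse_rules("User-agent: *\nDisallow: /faq-robots\n")

# The "no access at all" example.
site_closed = parse_rules("User-agent: *\nDisallow: /\n")

if __name__ == "__main__":
    # "AnyBot" is a hypothetical robot name; can_fetch returns True when access is permitted.
    for address in ["/faq-robots.html", "/faq-robots/index.html", "/faq.html"]:
        print(address, "->", faq_rules.can_fetch("AnyBot", address))
    print("/index.html ->", site_closed.can_fetch("AnyBot", "/index.html"))
```

Because the rule has no trailing slash, it matches both the file /faq-robots.html and everything under /faq-robots/.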

Instructions intended for a particular robot

The Disallow: instructions that follow the line User-agent: name-of-robot apply only to the robot named. The Disallow: instructions ask this robot not to visit addresses starting with the text given.

Examples

Standard instructions intended for Googlebot (Google) take the following form:
User-agent: Googlebot
Disallow: /copies/
Disallow: /chuuuuut.html
Instructions intended for both Slurp (Yahoo! Search) and msnbot (MSN Search) can be written like this:
User-agent: Slurp
User-agent: msnbot
Disallow: /fichier/
Disallow: /tutorial.php
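
To see how per-robot groups interact with the User-agent: * group, the sketch below combines the examples above into one file and queries it with Python's urllib.robotparser. The combined file is an assumption made for illustration: the /brouillons/ rule under User-agent: * and the robot name "OtherBot" are not from the examples above. Treat the results as the behaviour of this parser, not a guarantee about every robot.

```python
from urllib.robotparser import RobotFileParser

# Assumed combined file: the "*" group is added here only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /brouillons/

User-agent: Googlebot
Disallow: /copies/
Disallow: /chuuuuut.html

User-agent: Slurp
User-agent: msnbot
Disallow: /fichier/
Disallow: /tutorial.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

if __name__ == "__main__":
    checks = [
        ("Googlebot", "/copies/page.html"),     # blocked by the Googlebot group
        ("Googlebot", "/brouillons/note.html"), # the "*" group does not apply to Googlebot
        ("Slurp", "/fichier/doc.html"),         # blocked by the shared Slurp/msnbot group
        ("msnbot", "/tutorial.php"),            # same group, second robot
        ("OtherBot", "/brouillons/note.html"),  # hypothetical robot, falls back to "*"
    ]
    for robot, address in checks:
        print(robot, address, "->", parser.can_fetch(robot, address))
```

A robot that has its own group follows only that group, which is why Googlebot is not bound by the /brouillons/ rule in this sketch.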

