Aug 21
The robots.txt file is made up of User-agent: instructions, Disallow: instructions, comments, and blank lines.
Instructions intended for all the robots
The Disallow: instructions that follow the line User-agent: * apply to all robots, except those for which specific instructions are given. The file can contain only one User-agent: * line. The Disallow: instructions ask robots not to visit addresses starting with the text given.
Examples
They take the following form:

    User-agent: *
    Disallow: /en-test/version_12/
    Disallow: /brouillons/un-fichier.php

In this example, we ask all robots not to visit addresses starting with /en-test/version_12/ or /brouillons/un-fichier.php. Addresses such as /en-test/version_12/index.html or /en-test/version_12/ma_photo.jpg should therefore not be visited. The same applies to addresses such as /brouillons/un-fichier.php or /brouillons/un-fichier.php?pays=maroc&ville=agadir.
To prohibit all access to the site, use this robots.txt:

    User-agent: *
    Disallow: /

And a last example:

    User-agent: *
    Disallow: /faq-robots

This instruction covers access to any address that starts with /faq-robots. It prohibits access to the /faq-robots.html file as well as to the contents of a /faq-robots/ directory.
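
The prefix rule used in the examples above can be sketched in a few lines of Python. This is only an illustration of "starts with the text given", not how any particular robot is implemented; the function name is_disallowed and the rules list are made up for the example.

```python
def is_disallowed(path, disallow_prefixes):
    """Return True if path starts with one of the Disallow prefixes."""
    # Assumption beyond the text above: an empty prefix is skipped,
    # because every path trivially "starts with" an empty string.
    return any(prefix and path.startswith(prefix) for prefix in disallow_prefixes)


# Prefixes taken from the first example in this section.
rules = ["/en-test/version_12/", "/brouillons/un-fichier.php"]

print(is_disallowed("/en-test/version_12/index.html", rules))                      # True
print(is_disallowed("/brouillons/un-fichier.php?pays=maroc&ville=agadir", rules))  # True
print(is_disallowed("/en-test/version_13/index.html", rules))                      # False

# "Disallow: /faq-robots" covers both the file and the directory.
print(is_disallowed("/faq-robots.html", ["/faq-robots"]))        # True
print(is_disallowed("/faq-robots/page.html", ["/faq-robots"]))   # True
```

Because every address starts with /, the single prefix "/" matches any path, which is why Disallow: / closes the whole site.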
Instructions intended for a particular robot
The Disallow: instructions that follow the line User-agent: name-of-robot apply only to the robot named. They ask this robot not to visit addresses starting with the text given.
Examples
Standard instructions intended for Googlebot (Google) take the following form:

    User-agent: Googlebot
    Disallow: /copies/
    Disallow: /chuuuuut.html

Instructions intended for the two robots Slurp (Yahoo! Search) and msnbot (MSN Search) can be written like this:

    User-agent: Slurp
    User-agent: msnbot
    Disallow: /fichier/
    Disallow: /tutorial.php
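
To check how such per-robot rules are read, one option is Python's standard urllib.robotparser module. The sketch below combines the two examples of this section into one file; the ROBOTS_TXT string, the build_parser helper, and the test paths are assumptions made for the illustration, and real robots may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The two example groups from this section, separated by a blank line.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /copies/
Disallow: /chuuuuut.html

User-agent: Slurp
User-agent: msnbot
Disallow: /fichier/
Disallow: /tutorial.php
"""


def build_parser(text):
    """Parse robots.txt content held in a string (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Each robot is only bound by the group that names it.
print(parser.can_fetch("Googlebot", "/copies/page.html"))   # False
print(parser.can_fetch("Googlebot", "/fichier/page.html"))  # True
print(parser.can_fetch("Slurp", "/fichier/page.html"))      # False
print(parser.can_fetch("msnbot", "/tutorial.php"))          # False
```

Googlebot may fetch /fichier/page.html here because that prefix appears only in the group addressed to Slurp and msnbot.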