Reply to uses of robots.txt — HTML — IT news, forums, messages

Posted by Math on 10/06/07 15:36

Hi,

There is something I really don't understand, and I would like your
advice...

1. Some websites (for instance news.google.fr) provide a
syndication feed (like http://news.google.fr/nwshp?topic=po&output=atom).

2. These websites have a robots.txt file that prevents some robots
(identified by their user-agent) from indexing certain paths.
For example, http://news.google.fr/robots.txt contains (excerpt):
User-agent: *
Disallow: /nwshp

3. I've developed a syndication aggregator, and I would like to
respect these robots.txt files. But as far as I can see and understand, my
user-agent isn't authorized to access /nwshp?topic=po&output=atom
because of this robots.txt...
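
To make point 3 concrete, here is a minimal sketch in Python of how the excerpt above applies to the feed URL, using the standard library's urllib.robotparser. It parses only the two lines quoted in this post (the real robots.txt contains more rules), and the user-agent name "ExampleAggregator/1.0" and the helper is_allowed are hypothetical, chosen just for illustration.

```python
from urllib.robotparser import RobotFileParser

# Only the excerpt quoted in the post; the live file has more rules.
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /nwshp
"""

FEED_URL = "http://news.google.fr/nwshp?topic=po&output=atom"


def is_allowed(user_agent: str, url: str, robots_text: str = ROBOTS_EXCERPT) -> bool:
    """Return True if robots_text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleAggregator/1.0" is a made-up user-agent for this sketch.
    print(is_allowed("ExampleAggregator/1.0", FEED_URL))
```

With this excerpt, the "User-agent: *" record applies to any agent that has no record of its own, and "Disallow: /nwshp" matches by path prefix, so the feed URL is reported as disallowed even though it carries a query string.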

So, is this normal? Are robots.txt files only meant for indexing robots?
To sum up, should my syndication aggregator respect these files or
not?

Thanks.
