Re: Stopping robots searching particular page — HTML

Posted by Nikita the Spider on 09/13/07 15:27

In article <fc8hq8.2g0.1@dylanparry.com>,
Dylan Parry <usenet@dylanparry.com> wrote:

> Jukka K. Korpela wrote:
>
> >> 1. Someone posts the URL to a newsgroup.
> >> 2. You forget to turn off the webserver's AutoIndex or similar, so the
> >> spider can just navigate its way to the URL by going through
> >> auto-generated directory indexes.
> >>
> > 3. The page _was_ linked to from another page.
> >
> > 4. An indexing robot generates URLs automatically, more or less at random,
> > and tries them. It might for example try servers known to exist and append
> > to the server name some strings that are known to be common for web pages,
> > like /help.htm, /news.html....
>
> 5. Someone visits your page[1] and has the Google Toolbar (or a similar
> tool) installed, which reports back to Google about the sites they
> visit, thus allowing Google to add the site to its index.

6. Someone sends the URL in an email via a mail service (like GMail)
that's also related to a search engine.
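
Point 4 above is easy to picture in code. What follows is only a rough sketch of the idea, not how any real robot works: the host names, the path list and the candidate_urls function are all made up for illustration, and nothing is actually fetched.

```python
from itertools import product

# Hypothetical inputs: hosts the robot already knows to exist, and path
# strings that are common for web pages. Both lists are illustrative only.
KNOWN_HOSTS = ["www.example.com", "www.example.org"]
COMMON_PATHS = ["/help.htm", "/news.html"]


def candidate_urls(hosts, paths, scheme="http"):
    """Return every host/path combination as a URL a robot could try."""
    return [f"{scheme}://{host}{path}" for host, path in product(hosts, paths)]


if __name__ == "__main__":
    # Print the guesses; a real robot would request each one instead.
    for url in candidate_urls(KNOWN_HOSTS, COMMON_PATHS):
        print(url)
```

So a page with a guessable name can be found even if nothing links to it.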

--
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more
