Re: Stopping robots searching particular page

Posted by Nikita the Spider on 09/13/07 15:27

In article <fc8hq8.2g0.1@dylanparry.com>,
Dylan Parry <usenet@dylanparry.com> wrote:

> Jukka K. Korpela wrote:
>
> >> 1. Someone posts the URL to a newsgroup.
> >> 2. You forget to turn off the webserver's AutoIndex or similar, so the
> >> spider can just navigate its way to the URL by going through
> >> auto-generated directory indexes.
> >>
> > 3. The page _was_ linked to from another page.
> >
> > 4. An indexing robot generates URLs automatically, more or less at random,
> > and tries them. It might for example try servers known to exist and append
> > to the server name some strings that are known to be common for web pages,
> > like /help.htm, /news.html....
>
> 5. Someone visits your page[1] and has the Google Toolbar (or another
> similar tool) installed, which reports back to Google about the sites
> they are visiting, thus allowing Google to add the site to its index.

6. Someone sends the URL in an email via a mail service (like Gmail)
that's run by a company that also operates a search engine.
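
Point 4 in the quoted list is easy to picture in code. Here is a minimal sketch of how a robot might build guessed URLs, assuming made-up names (`candidate_urls`, `COMMON_PATHS`) and example.com as a placeholder host. It only builds the URL strings and fetches nothing; a real robot's guessing strategy is not known to me.

```python
from urllib.parse import urljoin

# Paths a robot might guess. The first two come from the quoted post;
# the list itself is a hypothetical stand-in for whatever a real robot uses.
COMMON_PATHS = ["/help.htm", "/news.html"]


def candidate_urls(servers, paths=COMMON_PATHS):
    """Pair every known server name with every commonly used path."""
    urls = []
    for server in servers:
        base = "http://" + server
        for path in paths:
            # urljoin replaces the (empty) path of base with the guessed path
            urls.append(urljoin(base, path))
    return urls


if __name__ == "__main__":
    for url in candidate_urls(["example.com"]):
        print(url)
```

So a page that nobody links to can still be found if its name is guessable.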

--
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more

 
