Posted by Ken on 11/22/05 23:58
Hi Scott -
On Tue, 22 Nov 2005 13:24:25 -0600, Scott <golden@uslink.net> wrote:
>Steve Pugh wrote:
>>
>> Scott wrote:
>> > > Luigi Donatello Asero wrote:
>> > > > "Scott" <golden@uslink.net> skrev i meddelandet
>> > > > news:437FD314.421C1B77@uslink.net...
>> > > > >
>> > > > > Is there a tag that I can put on a page that will prevent search
>> > > > > engines from indexing the page?
>> > > >
>> > > > As far as I know you could insert the address of the page into a file
>> > > > called robots.txt and indicate which search engine you do not want to
>> > > > index it.
>> >
>> > OK, I figured out what to write in robots.txt. What I'm wondering is exactly
>> > where to place that file on the host server.
>>
>> At the root of your site.
>>
>> If a spider wants to visit http://www.example.com/foo/bar/page.html
>> then it will look for http://www.example.com/foo/bar/robots.txt,
>> http://www.example.com/foo/robots.txt and
>> http://www.example.com/robots.txt and apply all the rules it finds.
>> From your point of view having a single robots.txt in your root folder
>> makes for easy maintenance.
>>
>> Steve
>
>Steve,
>
>So, you're saying I can just upload the robots.txt file to the same place I
>upload all my website files? In my case, my web account on the server is
>"public_html". And I should configure robots.txt to exclude the one
>particular url that I wish not to be indexed?
In the example that Steve gave, according to the standards the robot
would look ONLY for:
http://www.example.com/robots.txt
I don't recall that I have ever seen a robot look for robots.txt other
than in the host root; certainly not in the last several years.
See http://www.robotstxt.org/wc/exclusion.html for the standard. If you
don't have access to the host root, you can try using the "ROBOTS" META
tag within the individual page(s).
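
If it helps to see this concretely, here is a rough Python sketch of both points: where a compliant robot looks for robots.txt, and how a single Disallow rule would be applied. The robots.txt content and the robots_url helper are made up for illustration, and the page path is just the one from Steve's example, not your real URL.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url(page_url):
    """Return the host-root robots.txt URL for a given page URL (hypothetical helper)."""
    parts = urlsplit(page_url)
    # Only the scheme and host are kept; the path is always /robots.txt.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Assumed robots.txt content: exclude one page for all robots.
ROBOTS_TXT = """\
User-agent: *
Disallow: /foo/bar/page.html
"""

# Parse the rules from the string instead of fetching them over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

page = "http://www.example.com/foo/bar/page.html"
other = "http://www.example.com/foo/index.html"

print(robots_url(page))             # the one place a robot should look
print(parser.can_fetch("*", page))  # the excluded page
print(parser.can_fetch("*", other)) # a page not listed in robots.txt
```

For the META tag route, the usual form is <meta name="robots" content="noindex"> placed in the head of the page you want left out.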
--
Ken
http://www.ke9nr.net/