Posted by Ben C on 09/12/07 09:55
On 2007-09-11, Jukka K. Korpela <jkorpela@cs.tut.fi> wrote:
> Scripsit Tina Peters:
>
>> If it's not linked to any other webpage, in any way, it shouldn't be
>> spidered.
>
> Yet it may be spidered. Actually, it would be an interesting exercise in a
> course on web issues to ask the students to list 10 possible situations
> where the page might be spidered.
>
> And to make the task a little more difficult, let's exclude the perhaps most
> obvious scenario: someone who knows the page address submits it to a search
> engine via its "Add URL" form.
1. Someone posts the URL to a newsgroup.
2. You forget to turn off the webserver's AutoIndex or similar, so the
spider can just navigate its way to the URL through the auto-generated
directory indexes.
What are the other 8?
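
To make scenario 2 concrete, here is a rough sketch of how a spider could pull page URLs out of an auto-generated directory index. The sample listing, the example.com address, and the names LinkCollector and links_from_index are all made up for illustration; real listings vary by server, and a real crawler would fetch the page over HTTP rather than read a string.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, much as a crawler would."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_from_index(html, base_url):
    """Return absolute URLs for the links found in a directory listing."""
    collector = LinkCollector()
    collector.feed(html)
    urls = []
    for href in collector.hrefs:
        # Skip column-sorting links such as "?C=N;O=D" that some listings add.
        if href.startswith("?"):
            continue
        urls.append(urljoin(base_url, href))
    return urls


# Hypothetical listing, roughly what an auto-generated index might look like.
SAMPLE_INDEX = """
<html><body><h1>Index of /private</h1>
<a href="?C=N;O=D">Name</a>
<a href="/">Parent Directory</a>
<a href="secret.html">secret.html</a>
<a href="drafts/">drafts/</a>
</body></html>
"""

found = links_from_index(SAMPLE_INDEX, "http://example.com/private/")
for url in found:
    print(url)
```

Nothing links to secret.html from an ordinary page, yet it turns up as soon as the directory listing itself is reachable. On Apache, "Options -Indexes" turns these listings off; on nginx the corresponding directive is "autoindex off".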