Posted by "Richard Lynch" on 10/20/50 11:19
On Wed, June 22, 2005 3:57 pm, Brian Dunning said:
> I'm using the following code in an effort to identify bots:
>
> $client = $_SERVER['HTTP_USER_AGENT'];
> if (!strpos($client, 'ooglebot') && !strpos($client, 'ahoo') &&
>     !strpos($client, 'lurp') && !strpos($client, 'msnbot'))
> {
> (Stuff that I do if it's not a bot)
> }
>
> But it doesn't seem to be catching a lot of bot action. Anyone have a
> better list of user agents? (I left off the first letter of some to
> avoid case conflicts.)
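One thing worth flagging in the quoted code first: strpos() returns the
integer offset 0 when the needle sits at the very start of the haystack,
and !0 is true, so a UA like "msnbot/1.0 (...)" sails right through the
check. Comparing against false explicitly avoids that, and stripos()
makes the dropped-first-letter trick unnecessary. A minimal corrected
sketch:

<?php
// stripos() is case-insensitive; comparing !== false catches a match
// even at offset 0 (e.g. a UA that *starts* with "msnbot").
$client = $_SERVER['HTTP_USER_AGENT'];
$bots = array('googlebot', 'yahoo', 'slurp', 'msnbot');
$is_bot = false;
foreach ($bots as $bot) {
    if (stripos($client, $bot) !== false) {
        $is_bot = true;
        break;
    }
}
if (!$is_bot) {
    // (Stuff that I do if it's not a bot)
}
?>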
Check your logfiles and/or web stats.
The most common bots should be pretty apparent.
Here's a hack that might be useful to you:
1. Change .htaccess thusly:
<Files robots.txt>
ForceType application/x-httpd-php
</Files>
2. Edit robots.txt:
<?php
error_log("robot_detected: $_SERVER[HTTP_USER_AGENT]");
?>
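One wrinkle: once robots.txt is served as PHP, whatever the script prints
is what the bots receive. If that block is all the file contains, it
prints nothing, which reads as an allow-everything robots.txt. If you
have real directives, echo them after logging (the /private/ path below
is just a placeholder):

<?php
error_log("robot_detected: $_SERVER[HTTP_USER_AGENT]");
// An empty response means "allow all"; keep serving real directives:
echo "User-agent: *\n";
echo "Disallow: /private/\n";
?>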
Since only legitimate robots read robots.txt, that should quickly generate
a list of legitimate bots visiting your site.
You could even insert the value into a database with a unique key on it,
ignoring the duplicate-key errors, and then you'd have the data already
filtered down to uniques. It'd be a bit slower than error_log, I should
think... Maybe.
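Here's a minimal sketch of that database variant, assuming MySQL and a
hypothetical bot_agents table with a UNIQUE key on the agent column
(connection credentials are placeholders); INSERT IGNORE quietly skips
the duplicate-key errors:

<?php
// CREATE TABLE bot_agents (agent VARCHAR(255) NOT NULL, UNIQUE KEY (agent));
$pdo = new PDO('mysql:host=localhost;dbname=example', 'user', 'password');
$stmt = $pdo->prepare('INSERT IGNORE INTO bot_agents (agent) VALUES (?)');
// INSERT IGNORE drops rows that would violate the UNIQUE key, so the
// table ends up with exactly one row per distinct user agent.
$stmt->execute(array($_SERVER['HTTP_USER_AGENT']));
?>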
'Course, it won't help at all with the idiot illegitimate bots...
And this could be a bit too much for a real busy site...
Though you'd hope that the good bots (which read robots.txt) aren't
pounding you THAT hard...
--
Like Music?
http://l-i-e.com/artists.htm