Posted by Jason Barnett on 10/12/82 11:14
Brian Dunning wrote:
> I have a MySQL database with about a million records. I'd like to use
> the SQL command "order by RAND()" but my ISP won't let me: whenever the
> server gets spidered, Google overloads their MySQL server because of
> all the overhead of that command. I can't just cloak the spiders
> because I need them to find the random pages.
>
> So...what I've been doing for small sets of records is to use PHP to
> generate a bunch of random record IDs, then I construct a long SQL
> statement to find all the matching records. This works, but if I want
> to generate a big index page to list a hundred or a thousand records,
> it could get pretty clunky.
>
> Anyone have any better suggestions? :)
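For what it's worth, the random-ID trick you describe would look roughly like
this sketch (the table name, column name and ID range are just placeholders,
not anything from your schema):

<?php
$maxId   = 1000000;  // assumption: IDs run roughly contiguously up to ~1 million
$howMany = 100;      // how many random records to show

$ids = array();
while (count($ids) < $howMany) {
    $ids[mt_rand(1, $maxId)] = true;  // using array keys keeps the IDs unique
}
$sql = 'SELECT * FROM records WHERE id IN ('
     . implode(',', array_keys($ids)) . ')';
// run $sql through your usual mysql_query() / DB abstraction layer
?>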
Not sure that the following would work (since I've never done it :) but perhaps
you could create a static page specifically for the web spiders. The
basic plan is this:
- Create a web page with the record set (as you're already doing it)
- Save this web page to your server's cache (either roll your own caching
or use something like PEAR's Cache_Lite)
- Use an 8-day expiration on this file
- Then once a week you update this cache file from a cron job
- Now the key part: alter your robots.txt file so that the spiders are kept
away from the dynamically created page and instead crawl the cached / static
page (rough sketch after this list).
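The caching part could look something like this. It's only a rough sketch,
assuming PEAR's Cache_Lite is installed; the cache directory, the cache ID and
the build_random_index() function are placeholders you'd swap for your own
code:

<?php
require_once 'Cache/Lite.php';

// 8-day lifetime, matching the plan above
$options = array(
    'cacheDir' => '/path/to/cache/',
    'lifeTime' => 8 * 24 * 3600
);
$cache = new Cache_Lite($options);

if ($html = $cache->get('spider_index')) {
    // cache hit: spiders get the static copy, no ORDER BY RAND() involved
    echo $html;
} else {
    // cache miss (safety net; normally the weekly cron job rebuilds this
    // before it expires)
    $html = build_random_index();   // placeholder: your existing page-building code
    $cache->save($html, 'spider_index');
    echo $html;
}
?>

And the robots.txt change would be along these lines (the URL is just an
example), while the cached / static page stays crawlable and is linked from
somewhere the spiders will find it:

User-agent: *
Disallow: /random.php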
Oh, and if this actually works for you... it would be nice to get some
feedback.