Posted by Richard Lynch on 10/16/67 11:14
On Tue, April 26, 2005 7:55 am, Brian Dunning said:
> I have a MySQL database with about a million records. I'd like to use
> the SQL command "order by RAND()" but my ISP won't let me: whenever
> the server gets spidered, Google overloads their MySQL server because
> of all the overhead of that command. I can't just cloak the spiders
> because I need them to find the random pages.
>
> So...what I've been doing for small sets of records is to use PHP to
> generate a bunch of random record ID's, then I construct a long SQL
> statement to find all the matching records. This works, but if I want
> to generate a big index page to list a hundred or a thousand records,
> it could get pretty clunky.
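(For reference, the workaround described above boils down to something like
this -- a rough sketch, assuming a hypothetical `records` table with an
integer `id` primary key and the mysqli extension; all names are placeholders.)

<?php
// Rough sketch of the "random IDs in PHP" workaround described above.
// Assumes a hypothetical `records` table with an integer `id` primary key.
$mysqli = new mysqli('localhost', 'user', 'pass', 'mydb');

$howMany = 100;
$maxId   = (int) $mysqli->query('SELECT MAX(id) FROM records')->fetch_row()[0];

// Pick random IDs in PHP (using them as array keys de-dupes them)...
$ids = [];
while (count($ids) < $howMany) {
    $ids[mt_rand(1, $maxId)] = true;
}

// ...then fetch them all in one indexed query instead of ORDER BY RAND().
// Note: IDs that were deleted simply won't match, so you can get back
// fewer than $howMany rows -- part of what makes this clunky.
$idList = implode(',', array_keys($ids));
$result = $mysqli->query("SELECT id, title FROM records WHERE id IN ($idList)");
while ($row = $result->fetch_assoc()) {
    echo $row['id'], ': ', htmlspecialchars($row['title']), "\n";
}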
Google doesn't need to index random stuff.
In fact, you don't *WANT* Google to index random stuff, because then what
a searcher gets is something entirely different from what they were
actually looking for.
So, first off, block the random page in your robots.txt file with a
Disallow rule, so the spiders quit hammering it.
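Something like this, assuming the random page lives at a hypothetical
/random.php (adjust the path to match your setup):

User-agent: *
Disallow: /random.php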
Then, make a nice little template that takes an ID in the URL (using
$_SERVER['PATH_INFO']) and spits out the page in a NON-random fashion.
Quick lookup. Happy ISP.
Put a link on each page to the next ID in the database.
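A bare-bones sketch of that template, assuming the same hypothetical
`records` table (with `id`, `title`, and `body` columns) and mysqli; every
name here is a placeholder:

<?php
// Requested as /record.php/123  -->  $_SERVER['PATH_INFO'] is "/123"
$mysqli = new mysqli('localhost', 'user', 'pass', 'mydb');

$id = (int) trim($_SERVER['PATH_INFO'] ?? '', '/');

// Plain primary-key lookup -- cheap for MySQL, unlike ORDER BY RAND().
$stmt = $mysqli->prepare('SELECT id, title, body FROM records WHERE id = ?');
$stmt->bind_param('i', $id);
$stmt->execute();
$row = $stmt->get_result()->fetch_assoc();

if (!$row) {
    http_response_code(404);
    exit('No such record.');
}

echo '<h1>', htmlspecialchars($row['title']), '</h1>';
echo '<p>', htmlspecialchars($row['body']), '</p>';

// Link each page to the next ID so a spider can walk the whole table.
$next = $mysqli->query('SELECT MIN(id) FROM records WHERE id > ' . $id)->fetch_row()[0];
if ($next !== null) {
    echo '<p><a href="/record.php/', (int) $next, '">Next record</a></p>';
}

Each request is one indexed lookup, so the database barely notices, no
matter how hard the spiders crawl.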
Then point Google at the first ID (submit that URL to Google, or link to
it from a page that's already indexed).
You don't need links scattered all over your site to those record pages:
Google will follow the next-ID links (eventually) and get them all.
You might get better Google rankings from having the records listed in
category groupings. Keep in mind that whatever double-secret algorithm
Google is using, its *PURPOSE* is to help people find stuff.
If you group related things on one page, you are helping people find more
stuff.
Google *must* like that better than isolated, uncategorized information.
--
Like Music?
http://l-i-e.com/artists.htm