|
Posted by Gordon Burditt on 04/29/06 02:26
>>> The first uses MySQL to pick a random row in a single statement:
>>> <?php
>>> $sql = "SELECT FROM myTable ORDER BY RAND() LIMIT 1";
>>> $query = mysql_query($sql,$conn);
>>> ?>
>>>
>>>
>>> The second does the same thing, but uses 2 queries to do it:
>>> <?php
>>> $sqlA = "SELECT COUNT(id) FROM myTable";
>>> $queryA = mysql_query($sqlA,$conn);
>>> $num = mysql_result($queryA,0,0);
>>> $random = rand(1,$num);
>>> $sqlB = "SELECT FROM myTable WHERE id='" . $random . "' LIMIT 1";
>>> ?>
>>>
>>> Basically, which approach would be better to use in a high traffic
>>> environment to retrieve a single random row?
>>
>> The problem here is that the second query is wrong for what you're
>> trying to do. The number of rows in a table doesn't necessarily have
>> anything kind of relationship with the primary keys. Rows could have
>> been deleted or the seed value of the auto-increment column might not
>> have been 1.
>>
>
>What if I change it to say something like 'SELECT COUNT(column1) FROM
>myTable WHERE live='0' LIMIT 1'
Consider seriously what happens when the result of this will *NEVER*
match a value for id. For example, suppose id is always a credit
card number (12-16 digits and unlikely to contain leading zeroes, as
Amex, Mastercard, VISA, and Discover begin with 3, 4, 5, and 6, not
necessarily in that order).
>I understand the primary keys may not always return a reliable result, but
>I'm questioning whether it is better to use one query to return a row vs.
>using 2 queries to return a random row.
Any code can be made infinitely fast and run in zero space if it doesn't
have to return a correct answer. And I consider no banner ad to always
be preferable to a banner ad.
>Right now we are experiencing issues with MySQL being able to keep up with
>all of the traffic. The first SQL statement is used for ad banners, and our
>hoster has mentioned that this is not the most efficient way to do what were
>doing. The only other way I can see to retrieve a random row is to query for
>the # of rows that match the criteria, then use that # to have PHP generate
>a random # between 1 & that #, then re-query MySQL for the row that PHP
>generated. After that I'll need to make sure that the row does actually
>exist in the table. If it doesn't, I have to start again.
If you have 10 rows in your table numbered 77, 83, 84, 85, 89, 92, 93, 94,
95, and 97, you will *NEVER* find a row that exists and it will take
an INFINITE number of queries to generate the page.
>It's just to me, the second approach seems to have a lot more overhead in it
>than the first.
Gordon L. Burditt
Navigation:
[Reply to this message]
|