|  | Posted by Gordon Burditt on 04/29/06 02:26 
>>> The first uses MySQL to pick a random row in a single statement:>>> <?php
 >>> $sql = "SELECT FROM myTable ORDER BY RAND() LIMIT 1";
 >>> $query = mysql_query($sql,$conn);
 >>> ?>
 >>>
 >>>
 >>> The second does the same thing, but uses 2 queries to do it:
 >>> <?php
 >>> $sqlA = "SELECT COUNT(id) FROM myTable";
 >>> $queryA = mysql_query($sqlA,$conn);
 >>> $num = mysql_result($queryA,0,0);
 >>> $random = rand(1,$num);
 >>> $sqlB = "SELECT FROM myTable WHERE id='" . $random . "' LIMIT 1";
 >>> ?>
 >>>
 >>> Basically, which approach would be better to use in a high traffic
 >>> environment to retrieve a single random row?
 >>
 >> The problem here is that the second query is wrong for what you're
 >> trying to do. The number of rows in a table doesn't necessarily have
 >> anything kind of relationship with the primary keys. Rows could have
 >> been deleted or the seed value of the auto-increment column might not
 >> have been 1.
 >>
 >
 >What if I change it to say something like 'SELECT COUNT(column1) FROM
 >myTable WHERE live='0' LIMIT 1'
 
 Consider seriously what happens when the result of this will *NEVER*
 match a value for id.  For example, suppose id is always a credit
 card number (12-16 digits and unlikely to contain leading zeroes, as
 Amex, Mastercard, VISA, and Discover begin with 3, 4, 5, and 6, not
 necessarily in that order).
 
 >I understand the primary keys may not always return a reliable result, but
 >I'm questioning whether it is better to use one query to return a row vs.
 >using 2 queries to return a random row.
 
 Any code can be made infinitely fast and run in zero space if it doesn't
 have to return a correct answer.  And I consider no banner ad to always
 be preferable to a banner ad.
 
 >Right now we are experiencing issues with MySQL being able to keep up with
 >all of the traffic. The first SQL statement is used for ad banners, and our
 >hoster has mentioned that this is not the most efficient way to do what were
 >doing. The only other way I can see to retrieve a random row is to query for
 >the # of rows that match the criteria, then use that # to have PHP generate
 >a random # between 1 & that #, then re-query MySQL for the row that PHP
 >generated. After that I'll need to make sure that the row does actually
 >exist in the table. If it doesn't, I have to start again.
 
 If you have 10 rows in your table numbered 77, 83, 84, 85, 89, 92, 93, 94,
 95, and 97, you will *NEVER* find a row that exists and it will take
 an INFINITE number of queries to generate the page.
 
 >It's just to me, the second approach seems to have a lot more overhead in it
 >than the first.
 
 Gordon L. Burditt
  Navigation: [Reply to this message] |