|
Posted by Erland Sommarskog on 06/07/06 21:29
Alexander Kuznetsov (AK_TIREDOFSPAM@hotmail.COM) writes:
> My server has crashed and I cannot verify it right now, but I have a
> feeling that newID() will be calculated for every row, which may be
> inefficient. I was thinking of a more efficient way to get a random
> row. Suppose there is an identity column, then you could
> do something like this to get a random identity:
>
> select min(id) from table2,
> (select max(id) maxID, min(ID) minID from table2) t
> where table2.id > t.minID + (t.maxID - t.minID)*rand()
>
> What do you think?
That presumes that the values in the id column are decently evenly
distributed. If there are occasional large gaps, the distribution
will be skewed.
Yes, ORDER BY newid() + TOP 1 will read all rows, which for a large
table can be costly.
In SQL 2005 there is TABLESAMPLE, but apparently it's not that random
for smaller tables.
--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Navigation:
[Reply to this message]
|