|
Posted by Erland Sommarskog on 11/30/05 11:01
serge (sergea@nospam.ehmail.com) writes:
> I am reading "SQL Server Query Performance Tuning Distilled",
> on page 104 it talks about one of the index design recommendations
> which is to choose the column that has very high selectivity of values
> instead of a column that has very few selectivity of values.
>
> My question is if I have currently indexes on my tables that have
> 1, 2, 3, 4, ... values only on thousands of rows, are these nonclustered
> indexes pretty much useless indexes that I should get rid of?
>
> And I know that pretty much the number of selectivity values will
> always remain very low.
As always in the database world, it depends. An index on a bit column sound
like a bad idea in general, but consider this query:
SELECT ... FROM tbl WHERE unprocessed = convert(bit, 0)
Typically in such a table, there will be only a small number of unprocessed
rows, so the column is very selective for unprocessed = 0, and you almost
need an index on unprocessed here. (And for the index to be useful, you need
the convert as well, a subtlety with SQL Server data-type precedence.)
It also matters here whether the index is clustered or not. To continue with
the bit column, a non-clustered index on a bit column with a 50/50 split
is useless (almost see below), where as a clustered index actually reduces
the scan to only half of the table. Take this a little further and consider
a column with ten different values with equal distribution. The non-
clustered index is still not much of use, where as a clustered index reduces
the reads for a query like:
SELECT ... FROM tbl WHERE col = 'G' AND ...
to 10% of a full scan.
The reason the non-clustered index is useless, is because the optimizer
will find it more expensive to seek the index and then look up rows from
the data pages.
But all this changes if all you read is columns from the index. Consider
the bit column with a 50/50 split, and assume that you often need to run
SELECT bitcol, COUNT(*) FROM tbl GROUP BY bitcol
The non-clustered index is now a covering index and very useful.
So bottom line is: good indexes are indexes that are used.
--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
[Back to original message]
|