Posted by Alexander Kuznetsov on 04/17/06 18:35
Erland Sommarskog wrote:
> I'm not really sure what you mean with clustering factor.
To make a long story short,
suppose you have a Customer table clustered on phone number, with on
average 20 rows per page. Suppose you want to retrieve customers with
DOB between January 1st and January 15th, which is about 4% of the data.
Because phone number and date of birth are not correlated, qualifying
rows are scattered all over the table: with 20 rows per page and 4%
selectivity, roughly 1 - 0.96^20, or about 56%, of the pages hold at
least one customer with DOB between January 1st and January 15th. So the
Oracle/DB2 optimizer will look up how well the index on DOB is clustered
(poorly: a low cluster ratio in DB2 terms, a high CLUSTERING_FACTOR in
Oracle) and go for a table scan. On the other hand, phone number and
city are strongly correlated. As a result, if 10% of customers live in
the city of Someville, the rows matching the criterion city='SOMEVILLE'
are located on adjacent pages, because they have phone numbers with the
same beginning. The index on city is well clustered (a high cluster
ratio in DB2, a low CLUSTERING_FACTOR in Oracle). The Oracle/DB2
optimizer will choose to access the table via the index on city, which
will be more efficient than a table scan: almost 90% of the data pages
will not be read.
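
To put rough numbers on the example, here is a small Python sketch. It is
only an illustration under my own assumptions, not how either optimizer
is actually implemented: the function names, the 20,000-row table and the
"count page switches while walking the index in key order" measure are all
made up for this post, and the city index is idealized as perfectly
correlated with phone number.

```python
import random

ROWS_PER_PAGE = 20  # figure taken from the example in this post


def fraction_of_pages_hit(selectivity, rows_per_page=ROWS_PER_PAGE):
    """Expected fraction of pages holding at least one matching row,
    assuming matching rows are scattered independently of storage order."""
    return 1.0 - (1.0 - selectivity) ** rows_per_page


def page_switches(pages_in_index_order):
    """Count how often the data page changes while walking index entries
    in key order. Few switches means well clustered, many means scattered.
    (Hypothetical simplified measure, used here only for illustration.)"""
    switches = 0
    previous = None
    for page in pages_in_index_order:
        if page != previous:
            switches += 1
            previous = page
    return switches


def simulate(num_rows=20_000, seed=0):
    """Return (num_pages, switches for a correlated index,
    switches for an uncorrelated index)."""
    rng = random.Random(seed)
    num_pages = num_rows // ROWS_PER_PAGE
    # Rows are stored in phone-number order: row i lives on page i // 20.
    pages = [i // ROWS_PER_PAGE for i in range(num_rows)]
    # City index (idealized): key order follows storage order.
    city_order = pages
    # DOB index: key order is unrelated to storage order.
    dob_order = pages[:]
    rng.shuffle(dob_order)
    return num_pages, page_switches(city_order), page_switches(dob_order)


if __name__ == "__main__":
    print("pages hit by the 4% DOB range:", fraction_of_pages_hit(0.04))
    num_pages, city, dob = simulate()
    print("pages:", num_pages, "city switches:", city, "dob switches:", dob)
```

With these assumptions the DOB range touches a bit more than half of the
pages, while the walk along the city index changes page about once per
page and the walk along the DOB index changes page on nearly every row.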