|
Posted by Erland Sommarskog on 12/05/80 11:29
(gilles27@talk21.com) writes:
> I have heard that, in the case of compound primary keys, it is
> important to select the correct order for the columns within the key.
> For instance, imagine a table called OrderLine which has primary key
> columns as follows-
>
> Ledger
> OrderNumber
> OrderLineNumber
>
> The theory I have heard is that columns with the most distinct values
> should come first. In this case, Ledger is likely to have a maximum of
> 6 distinct values, OrderNumber a maximum of 10 million and OrderLine up
> to 99. Based on this supposition I believe the best order would be-
>
> OrderNumber
> OrderLineNumber
> Ledger
>
> I have performed a few rudimentary tests that appear to bear this out.
> I was wondering if anyone else has tried something similar and if so
> what was the result?
Nah, the order in a primary key should rather reflect the logical
hierarchy. If an OrderLine can be broken into pieces called ledgers
the above would be the right model. I would guess, though, that different
ledgers have different series for order numbers, and thus the ledger is
the main key here.
I don't know the business rules here, but it seems to be that
SELECT * FROM OrderLine WHERE OrderNumber = 1234 is not even
meaninful, because you may get to see data from six different and
unrelated orders.
Now, not all tables with a compound key has a clear hiearchy. Consider
this table:
CREATE TABLE bankholidays(
coucode aba_coucode NOT NULL,
bhodate aba_date NOT NULL,
reguser aba_upduser NOT NULL,
moddate aba_updtime NOT NULL,
CONSTRAINT pk_bho PRIMARY KEY (coucode, bhodate)
)
This table lists days Mon-Fri that are not business days in various
countries. If you want to know in which countries Christmas Eve is
not a business day, you would say
SELECT coucode FROM bankholidays WHERE bhodate = '20041224'
But if you want to compute the settledate for a given trade date
and instrument, the country code is the top-level key. (And since
we mainly use the table for the latter purpose, this is why coucode
comes first.)
Thus, for such a table, it is very common to have a index on the
reverse order of the key, or some reverse order for a key with three
or more columns.
What you should make special consideration, is which index should be
the clustered index. This is necessarily not the primary key. It
could be a different combination of the PK - or on a completely
different column.
Finally, one thing also to keep in mind is how you use the columns.
If you say things like:
SELECT * FROM tbl WHERE a = 1 AND b BETWEEN 12 AND 19
An index on (a,b) is better, even if a only has 10 possible values and
b has 100.
--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinfo/productdoc/2000/books.asp
[Back to original message]
|