|  | Posted by Erland Sommarskog on 06/17/80 11:29 
(gilles27@talk21.com) writes:> I have heard that, in the case of compound primary keys, it is
 > important to select the correct order for the columns within the key.
 > For instance, imagine a table called OrderLine which has primary key
 > columns as follows-
 >
 > Ledger
 > OrderNumber
 > OrderLineNumber
 >
 > The theory I have heard is that columns with the most distinct values
 > should come first. In this case, Ledger is likely to have a maximum of
 > 6 distinct values, OrderNumber a maximum of 10 million and OrderLine up
 > to 99. Based on this supposition I believe the best order would be-
 >
 > OrderNumber
 > OrderLineNumber
 > Ledger
 >
 > I have performed a few rudimentary tests that appear to bear this out.
 > I was wondering if anyone else has tried something similar and if so
 > what was the result?
 
 Nah, the order in a primary key should rather reflect the logical
 hierarchy. If an OrderLine can be broken into pieces called ledgers
 the above would be the right model. I would guess, though, that different
 ledgers have different series for order numbers, and thus the ledger is
 the main key here.
 
 I don't know the business rules here, but it seems to be that
 SELECT * FROM OrderLine WHERE OrderNumber = 1234 is not even
 meaninful, because you may get to see data from six different and
 unrelated orders.
 
 Now, not all tables with a compound key has a clear hiearchy. Consider
 this table:
 
 CREATE TABLE bankholidays(
 coucode   aba_coucode  NOT NULL,
 bhodate   aba_date     NOT NULL,
 reguser   aba_upduser  NOT NULL,
 moddate   aba_updtime  NOT NULL,
 CONSTRAINT pk_bho PRIMARY KEY (coucode, bhodate)
 )
 
 This table lists days Mon-Fri that are not business days in various
 countries. If you want to know in which countries Christmas Eve is
 not a business day, you would say
 
 SELECT coucode FROM bankholidays WHERE bhodate = '20041224'
 
 But if you want to compute the settledate for a given trade date
 and instrument, the country code is the top-level key. (And since
 we mainly use the table for the latter purpose, this is why coucode
 comes first.)
 
 Thus, for such a table, it is very common to have a index on the
 reverse order of the key, or some reverse order for a key with three
 or more columns.
 
 What you should make special consideration, is which index should be
 the clustered index. This is necessarily not the primary key. It
 could be a different combination of the PK - or on a completely
 different column.
 
 Finally, one thing also to keep in mind is how you use the columns.
 If you say things like:
 
 SELECT * FROM tbl WHERE a = 1 AND b BETWEEN 12 AND 19
 
 An index on (a,b) is better, even if a only has 10 possible values and
 b has 100.
 
 --
 Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
 
 Books Online for SQL Server SP3 at
 http://www.microsoft.com/sql/techinfo/productdoc/2000/books.asp
  Navigation: [Reply to this message] |