| 
	
 | 
 Posted by dmarkle on 01/08/07 18:28 
Well, it comes mostly from relational and set theory.  I like to 
explain the need for PK's like this: 
 
If your wife makes you go to the grocery store and she asks you what 
you bought, you'd give a set back as your answer: 
 
"Peas, carrots, potatoes, and Coca-Cola" 
 
You wouldn't say: 
 
"Peas, carrots, potatoes, peas, and Coca-Cola". 
 
The two answers aren't equivalent.  One simply makes sense, and one 
does not.  For example, in set theory, you can express a set of numbers 
as: 
 
{1,2,3,4} 
 
but not: 
 
{1,2,2,3,4}.  The second representation is invalid. 
 
Like giving your grocery list or giving a set of numbers, the engine's 
theoretical foundation is based on relational theory, which is closely 
tied to set theory. 
 
If you want a book with a recommendation, "Inside SQL Server" as well 
as a lot of other DB books out there will tell you the same thing -- 
always have a PK on each table you create. 
 
I'm sorry, I'll admit that I don't have the hard evidence to prove to 
you that not having a PK is going to screw up the optimizer.  I do try 
to stay away from "cargo cult" type "I saw this once..." sort of 
recommendations, but in this case, since not having a PK on each table 
is such a no-no, I'd try this first.  I like #2 best.  If "email" is 
distinct in the table by itself, that's going to be even better... 
Maybe copy your data over to a development database and try these out 
and see how it goes. 
 
Good luck. 
-Dave 
 
Dave wrote: 
> I like your explanation! 
> 
> The table does not currently have a primary key. 
> 
> I basically have options 2 and 4 available; 1 and 3 are out of my 
> control. 
> 
> 
> I will lobby for adding an identity column to this table.   It happens 
> to be very wide with 12 single column indexes so I really don't want 
> to make the clustered super wide. 
> 
> Just for clarities sake, in order for the optimizer to properly 
> understand the tables, do I have to add a PK constraint to the identity 
> column? 
> 
>  >Any table without a 
> > PRIMARY KEY or otherwise unique key is likely going to cause problems 
> > with the optimizer, and it often happens in bizarre ways like this. 
> 
> Do you know of any book or other reference to this (specifically how 
> the optimizer needs a key)?  It would really help me convince everyone 
> to use keys on everything. 
> 
> Thanks! 
> 
> 
> 
> dmarkle wrote: 
> > I suspect the reason (...AND ... IS NOT NULL) is helping you out is 
> > that it's forcing the optimizer to choose a table scan instead if 
> > hitting up that index you made on email. 
> > 
> > Before you go changing minutiae like ANSI_NULLS, do realize that your 
> > tables are, (relationally speaking), nonsense.  Any table without a 
> > PRIMARY KEY or otherwise unique key is likely going to cause problems 
> > with the optimizer, and it often happens in bizarre ways like this. 
> > 
> > You should do the following: 
> > 1) Make the clustered index on "bk" UNIQUE.  If you can't do that... 
> > 
> > 2) Add email to your clustered index as a secondary key and recreate 
> > the clustered index as UNIQUE, or make the combination of the two 
> > columns a PRIMARY KEY.  If you can't do that... 
> > 
> > 3) Analyze your data and fix your process to not allow wholly duplicate 
> > rows, which are nonsensical.  If you can't do THAT (ugh): 
> > 
> > 4) Add an IDENTITY column to your table.  So as not to drastically 
> > increase your index sizes, you could either add this column as a 
> > secondary key after bk and make the clustered index UNIQUE, or you 
> > could create a nonclustered PRIMARY KEY on that identity. 
> > 
> > Let me know how it goes. 
> > -Dave Markle 
> > http://www.markleconsulting.com 
> > 
> > 
> > 
> > 
> > 
> > Erland Sommarskog wrote: 
> > > Dave (daveg.01@gmail.com) writes: 
> > > > Erland that was an excellent post!  However it was not my problem.  The 
> > > > code is run from .NET application and I am testing using SQL Management 
> > > > Studio. 
> > > > 
> > > > I added this to the test script and it did not help. 
> > > > SET ANSI_NULLS  ON 
> > > 
> > > As long it's a loose script, ANSI_BULLS would be on by default. I will 
> > > have to admit that I was clutching a straws. Without access to the 
> > > database seeing it, it is very difficult to analyse the problem accurately. 
> > > It's not alwyas that easy even if you have full acecss. 
> > > 
> > > 
> > > 
> > > -- 
> > > Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se 
> > > 
> > > Books Online for SQL Server 2005 at 
> > > http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx 
> > > Books Online for SQL Server 2000 at 
> > > http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
 
  
Navigation:
[Reply to this message] 
 |