|  | Posted by Gert-Jan Strik on 01/05/07 00:20 
Hi Dave,
 I tried to create a situation similar to yours on SQL Server 2000. What
 I noticed is:
 - Without the extra predicate, the optimizer does not seem smart enough
 to chose SEEK:([b].[email] IsNotNull) instead of a full Index Scan
 - You are correct that the change in access method is not achieved if
 the predicate is added to the WHERE clause. It seems it must be part of
 the join ON clause
 - If either "b.email IS NOT NULL" or "f.email IS NOT NULL" is added, the
 join type remains MERGE JOIN. If both are added to the ON clause I get a
 HASH JOIN
 - I am not sure if the MERGE JOIN is faster or if the HASH JOIN is, but
 in either case you may want to add the two "unnecessary" predicates
 
 There is a thresshold somewhere (depending on your specific situation)
 where merge-joining or hashing the two partially scanned indexes is
 faster. Up to that point, the fastest plan will be to partially scan the
 index of filter and loop join and seek each matching row in base. In all
 cases it will be faster if NULL rows are excluded from the index seeks.
 
 HTH,
 Gert-Jan
 
 
 Dave wrote:
 >
 > Guys I am really stuck on this one.  Any help or suggestions would be
 > appreciated.
 >
 > We have a large table which seemed to just hit some kind of threshold.
 > They query is somewhat responsive when there are NO indexes on the
 > table.  However, when we index email the query takes forever.
 >
 > FACTS
 > - The problem is very "data specific".  I can not recreate the
 > problem using different data.
 > - There is only a problem when I index email on the base table.
 > - The problem goes away when I add "AND b.email IS NOT NULL" to the
 > inner join condition.  It does not help when I add the logic to the
 > "WHERE" clause.
 >
 > DDL
 > CREATE TABLE base (bk char(25), email varchar(100))
 > create clustered index icx on base(bk)
 > create index ix_email on base(email)
 >
 > CREATE TABLE filter (bk char(25), email varchar(100))
 > create clustered index icx on filter (bk)
 > create index ix_email on filter (email)
 >
 > Query
 > SELECT   b.bk, b.email
 > FROM base b WITH(NOLOCK)
 >         INNER JOIN filter f ON f.email = b.email
 >                                 --and f.email is not null
 >
 > Data Profile
 > --35120500, 35120491, 14221553
 > SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
 > FROM base
 >
 > --16796199, 16796192, 14221553
 > SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
 > FROM base
 > WHERE email IS NOT NULL
 >
 > --250552, 250552, 250205
 > SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
 > FROM filter
 >
 > --250208, 250208, 250205
 > SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
 > FROM filter
 > WHERE email IS NOT NULL
 [Back to original message] |