|
Posted by Gert-Jan Strik on 01/05/07 00:20
Hi Dave,
I tried to create a situation similar to yours on SQL Server 2000. What
I noticed is:
- Without the extra predicate, the optimizer does not seem smart enough
to chose SEEK:([b].[email] IsNotNull) instead of a full Index Scan
- You are correct that the change in access method is not achieved if
the predicate is added to the WHERE clause. It seems it must be part of
the join ON clause
- If either "b.email IS NOT NULL" or "f.email IS NOT NULL" is added, the
join type remains MERGE JOIN. If both are added to the ON clause I get a
HASH JOIN
- I am not sure if the MERGE JOIN is faster or if the HASH JOIN is, but
in either case you may want to add the two "unnecessary" predicates
There is a thresshold somewhere (depending on your specific situation)
where merge-joining or hashing the two partially scanned indexes is
faster. Up to that point, the fastest plan will be to partially scan the
index of filter and loop join and seek each matching row in base. In all
cases it will be faster if NULL rows are excluded from the index seeks.
HTH,
Gert-Jan
Dave wrote:
>
> Guys I am really stuck on this one. Any help or suggestions would be
> appreciated.
>
> We have a large table which seemed to just hit some kind of threshold.
> They query is somewhat responsive when there are NO indexes on the
> table. However, when we index email the query takes forever.
>
> FACTS
> - The problem is very "data specific". I can not recreate the
> problem using different data.
> - There is only a problem when I index email on the base table.
> - The problem goes away when I add "AND b.email IS NOT NULL" to the
> inner join condition. It does not help when I add the logic to the
> "WHERE" clause.
>
> DDL
> CREATE TABLE base (bk char(25), email varchar(100))
> create clustered index icx on base(bk)
> create index ix_email on base(email)
>
> CREATE TABLE filter (bk char(25), email varchar(100))
> create clustered index icx on filter (bk)
> create index ix_email on filter (email)
>
> Query
> SELECT b.bk, b.email
> FROM base b WITH(NOLOCK)
> INNER JOIN filter f ON f.email = b.email
> --and f.email is not null
>
> Data Profile
> --35120500, 35120491, 14221553
> SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
> FROM base
>
> --16796199, 16796192, 14221553
> SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
> FROM base
> WHERE email IS NOT NULL
>
> --250552, 250552, 250205
> SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
> FROM filter
>
> --250208, 250208, 250205
> SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
> FROM filter
> WHERE email IS NOT NULL
Navigation:
[Reply to this message]
|