|
Posted by Alex Kuznetsov on 01/12/07 20:50
Emin wrote:
> Dear Experts,
>
> Thanks for the comments so far. To help describe the problem, I
> constructed a stand alone example that illustrates the issue on MS SQL
> Server 2005. If you run the following query, it will take a very long
> time. But if you replace "FULL OUTER" with "INNER" or if you remove the
> WHERE clause, the query runs fine. Any ideas why FULL OUTER JOIN blows
> things up even though there are no NULLs in sight?
>
> Thanks,
> -Emin
>
> ----------------------------------------------------
>
> IF EXISTS (SELECT * FROM sys.objects WHERE object_id =
> OBJECT_ID(N'dbo.temp'))
> DROP TABLE dbo.temp
>
> ;with TallyTableCTE(i)
> as
> (
> select i = {ts '1970-01-01 00:00:00'}
> union all
> select i = i + 1 from TallyTableCTE where i < {ts '2007-01-01
> 00:00:00'}
> )
> ,AnotherTallyTableCTE(j)
> as
> (
> select j = {ts '1970-01-01 00:00:00'}
> union all
> select j = j + 1 from AnotherTallyTableCTE where j < {ts '2007-01-01
> 00:00:00'}
> )
> select *
> into dbo.temp FROM
> TallyTableCTE x
> FULL OUTER JOIN --FULL OUTER JOIN with WHERE clause causes problem
> --INNER JOIN --using INNER JOIN instead of FULL OUTER JOIN removes
> problem
> AnotherTallyTableCTE y
> ON x.i = y.j
> where x.i > {ts '1980-01-01 00:00:00'} --removing WHERE clause removes
> problem
> option (maxrecursion 30000)
>
> IF EXISTS (SELECT * FROM sys.objects WHERE object_id =
> OBJECT_ID(N'dbo.temp'))
> DROP TABLE dbo.temp
>
> --------------------------------
>
> On Jan 12, 11:14 am, "Emin" <emin.shop...@gmail.com> wrote:
> > Dear Experts,
> >
> > I have a fairly simple query in which adding a where clause slows
> > things down by at least a factor of 100. The following is the slow
> > version of the query
> >
> > -------------------------
> > SELECT * FROM
> > ( Select x.event_date From x FULL OUTER JOIN y
> > ON x.event_date = y.event_date
> > ) innerQ
> > WHERE ( innerQ.event_date >= {ts '1980-01-01 00:00:00'} )
> > ------------------------
> >
> > Removing the where clause makes the query run quickly. This seems
> > extremely strange because it seems like SQL Server should simply be
> > able to take the results of innerQ and discard anything with a date
> > that doesn't match. If I instead split the query into two pieces where
> > I create a temp table and put innerQ into that and then do the select *
> > WHERE (...) from the temp table things work fine.
> >
> > Any thoughts on what SQL Server might be doing to make things slow and
> > how I can fix it?
> >
> > Thanks,
> > -Emin
I guess in real life situations the optimizer will estimate selectivity
of x.event_date > @YourParameter based on statistics. For inner joins
the other side does not contribute nulls, so its estimate might be
better, so it goes for a hash join.
Also note that full outer joins are rarely used in production, and as
such they have less chances of geting a good execution plan.
Your particular full outer join query is misleading and I would say it
is a very poor practice.
Because in fact your query is a left outer join, you should either
explicitly say LEFT OUTER JOIN or push your predicate down.
The reason is simple: poorly programmed queries have less chances of
geting a good execution plan.
-----------------------
Alex Kuznetsov
http://sqlserver-tips.blogspot.com/
http://sqlserver-puzzles.blogspot.com/
Navigation:
[Reply to this message]
|