|
Posted by Emin on 01/12/07 20:07
Dear Gert-Jan,
Thanks for the info. In my real-world problem I am doing a join on
views which seemed appropriate to simulate using CTEs in my previous
post. Based on your comments, I tried using OPTION(HASH JOIN) on my
real example and that fixed the problem.
Out of curiousity, how does one go about learning how to understand and
optimize the execution plan? For example, while I noticed that the
execution plans were different in my various cases, I never would have
known to try OPTION(HASH JOIN). Do you (or any other readers out there)
have recommendations on books or other sources for this information?
Thanks again,
-Emin
On Jan 12, 2:59 pm, Gert-Jan Strik <s...@toomuchspamalready.nl> wrote:
> Does your real situation actually resemble this repro?
>
> In the repro, it is the difference between a hash join and loop join.
> The hash join is fast, the loop join is not. It is not suprising that
> the loop join is not very fast, because there is quite a lot of
> calculation needed for the CTE.
>
> If this really is your problem, then you can probably add OPTION(HASH
> JOIN) to eleviate the performance problems.
>
> But if you are not actually joining two CTE's with deep recursion, then
> we probably need a more real life example.
>
> Gert-Jan
>
> Emin wrote:
>
> > Dear Experts,
>
> > Thanks for the comments so far. To help describe the problem, I
> > constructed a stand alone example that illustrates the issue on MS SQL
> > Server 2005. If you run the following query, it will take a very long
> > time. But if you replace "FULL OUTER" with "INNER" or if you remove the
> > WHERE clause, the query runs fine. Any ideas why FULL OUTER JOIN blows
> > things up even though there are no NULLs in sight?
>
> > Thanks,
> > -Emin
>
> > ----------------------------------------------------
>
> > IF EXISTS (SELECT * FROM sys.objects WHERE object_id =
> > OBJECT_ID(N'dbo.temp'))
> > DROP TABLE dbo.temp
>
> > ;with TallyTableCTE(i)
> > as
> > (
> > select i = {ts '1970-01-01 00:00:00'}
> > union all
> > select i = i + 1 from TallyTableCTE where i < {ts '2007-01-01
> > 00:00:00'}
> > )
> > ,AnotherTallyTableCTE(j)
> > as
> > (
> > select j = {ts '1970-01-01 00:00:00'}
> > union all
> > select j = j + 1 from AnotherTallyTableCTE where j < {ts '2007-01-01
> > 00:00:00'}
> > )
> > select *
> > into dbo.temp FROM
> > TallyTableCTE x
> > FULL OUTER JOIN --FULL OUTER JOIN with WHERE clause causes problem
> > --INNER JOIN --using INNER JOIN instead of FULL OUTER JOIN removes
> > problem
> > AnotherTallyTableCTE y
> > ON x.i = y.j
> > where x.i > {ts '1980-01-01 00:00:00'} --removing WHERE clause removes
> > problem
> > option (maxrecursion 30000)
>
> > IF EXISTS (SELECT * FROM sys.objects WHERE object_id =
> > OBJECT_ID(N'dbo.temp'))
> > DROP TABLE dbo.temp
>
> > --------------------------------
>
> > On Jan 12, 11:14 am, "Emin" <emin.shop...@gmail.com> wrote:
> > > Dear Experts,
>
> > > I have a fairly simple query in which adding a where clause slows
> > > things down by at least a factor of 100. The following is the slow
> > > version of the query
>
> > > -------------------------
> > > SELECT * FROM
> > > ( Select x.event_date From x FULL OUTER JOIN y
> > > ON x.event_date = y.event_date
> > > ) innerQ
> > > WHERE ( innerQ.event_date >= {ts '1980-01-01 00:00:00'} )
> > > ------------------------
>
> > > Removing the where clause makes the query run quickly. This seems
> > > extremely strange because it seems like SQL Server should simply be
> > > able to take the results of innerQ and discard anything with a date
> > > that doesn't match. If I instead split the query into two pieces where
> > > I create a temp table and put innerQ into that and then do the select *
> > > WHERE (...) from the temp table things work fine.
>
> > > Any thoughts on what SQL Server might be doing to make things slow and
> > > how I can fix it?
>
> > > Thanks,
> > > -Emin
[Back to original message]
|