You are here: Re: Is ROLLUP really better than doing a nested query when rank is involved? « MsSQL Server « IT news, forums, messages
Re: Is ROLLUP really better than doing a nested query when rank is involved?

Posted by Jason Lepack on 09/04/07 21:09

What does the query plan show for the results of these two queries?
you keep talking about somethat should, in theory be correct, but why
not check if it ACTUALLY does this in reality... That would the
easiest way to compare the queries, because that's where it matters.

On Sep 3, 11:06 pm, csi...@gmail.com wrote:
> Here is a thread I posted elsewhere that I wonder if someone here
> could shed some light on...
>
> The original question I had:
>
> --- BEGIN ---
> Which one is better?
> 1. To perform aggregate queries on data, and use that data to home in
> and perform more detailed analysis? This gives result sets that are
> potentially filled with columns that contain the aggregate data
> repeated across many rows.
> I meant here by homing in: within the same query - using
> nested SELECTS. For example, the inner query performs the aggregate
> analysis, and returns whatever data we are interested in, and then
> the
> outer query performs another SELECT on the same data, resulting in
> the
> aggregate data being distributed across 'detailed results' from the
> outer query.
>
> 2. To perform the aggregate query once, and then hit the database
> again for a detailed analysis, and return the detailed analysis
> results in an order that is easily cross-correlated with the original
> aggregate data? This means running the aggregate query essentially
> twice, just the second time it is hidden as a subquery.
> --- END ---
>
> I realize the question was vague, but a super DBA understood what I
> was asking and suggested I study up on ROLLUP and CUBE (I am not a DBA
> by the way). So I did... and here is what I found:
>
> --- BEGIN ---
> The data I have, I want to generate totals and subtotals, say of
> sales: total sales per store, and subtotal of sales per store per day
> for example.
>
> The problem is that I ultimately only want the top 25 stores in total
> sales BUT with their subtotals as well... So I used ROLLUP to
> generate
> the totals and subtotals... and in the same query I generated a
> ranking of the data... and there I found my problem: the ranking does
> not distinguish between totals and subtotals. So the results look
> like:
>
> store1, realTotal, 1
> store1, subTotal, 2
> store2, realTotal, 3
> store7, realTotal, 4
> store10, realTotal, 5
> store10, subTotal, 6
>
> Clearly, the rows with subtotals should NOT be ranked... but I am
> struggling with SQL to figure out how to get exactly what I want.
>
> Well... I have a solution. But I'm not sure about it. It involves
> something I learned about only recently: a user-defined aggregate
> function that concatenates values from across rows. Using this
> aggregate function, stragg, I can do another pass through the results
> mentioned above while grouping by store, and produce:
>
> store1, 'realTotal subTotal', 1
> store2, 'realTotal subTotal', 3
> store7, 'realTotal subTotal', 4
> store10, 'realTotal subTotal', 5
>
> And another trick: use min( rank ) in the same aggregation, and order
> by it as well.
>
> Together that gives one a very compact result set that has everything
> ranked and grouped perfectly.
>
> ...but... a worry: is this REALLY better than doing a nested query? I
> wonder if the use of the user-defined aggregate function is very
> expensive...
>
> For the benefit of anyone wondering, here's the structure that
> works... in one pass in theory... but again, I am now wondering if it
> really is faster than the nested query equivalent that hits the
> database twice on the table 'stores' (seems to me that the middle
> select below is effectively hitting the table again, just in
> temporary
> form):
>
> select rownum as rank, id, days, totals
> from
> (
> select id, stragg(day) as days, stragg(total) as totals,
> min( rank ) as innerrank
> from
> (
> select id, day, sum(sales) as total, rank() over (order by
> sum(sales) desc) rank
> from stores
> group by id, rollup(day)
> )
> group by id
> order by min(rank)
> )
> where rownum <= 25
>
> --- END ---
>
> Any comments? Is this query really better than the following:
>
> -- get totals by day
> select rank, id, day, total, sum(sales) as dailytotal
> from
> (
> select * from
> (
> -- get grand totals regardless of day
> select id as storeid, sum( sales ) as total,
> rank() over (order by sum(sales) desc) rank
> from stores
> group by id
> )
> where rank <= 25
> ), stores
> -- NOTE join with stores again
> where id = storeid
> -- but does the where condition above clip the query and make it
> efficient?
> group by rank, id, day, total
> order by rank, day

 

Navigation:

[Reply to this message]


Удаленная работа для программистов  •  Как заработать на Google AdSense  •  England, UK  •  статьи на английском  •  PHP MySQL CMS Apache Oscommerce  •  Online Business Knowledge Base  •  DVD MP3 AVI MP4 players codecs conversion help
Home  •  Search  •  Site Map  •  Set as Homepage  •  Add to Favourites

Copyright © 2005-2006 Powered by Custom PHP Programming

Сайт изготовлен в Студии Валентина Петручека
изготовление и поддержка веб-сайтов, разработка программного обеспечения, поисковая оптимизация