Reply to Re: Increasingly greater amounts of redundant aggregate data in queries? — MsSQL Server

Posted by csimam on 09/04/07 02:49

On Sep 4, 1:43 am, csi...@gmail.com wrote:
> The data I have, I want to generate totals and subtotals, say of
> sales: total sales per store, and subtotal of sales per store per day
> for example.
>
> The problem is that I ultimately only want the top 25 stores in total
> sales BUT with their subtotals as well... So I used ROLLUP to generate
> the totals and subtotals... and in the same query I generated a
> ranking of the data... and there I found my problem: the ranking does
> not distinguish between totals and subtotals. So the results look
> like:
>
> store1, realTotal, 1
> store1, subTotal, 2
> store2, realTotal, 3
> store7, realTotal, 4
> store10, subTotal, 5
>
> Clearly, the rows with subtotals should NOT be ranked... but I am
> struggling with SQL to figure out how to get exactly what I want.

Well... I have a solution. But I'm not sure about it. It involves
something I learned about only recently: a user-defined aggregate
function that concatenates values from across rows. Using this
aggregate function, stragg, I can do another pass through the results
mentioned above while grouping by store, and produce:

store1, 'realTotal subTotal', 1
store2, 'realTotal subTotal', 3
store7, 'realTotal subTotal', 4
store10, 'realTotal subTotal', 5

(it was a typo in the original example to list "store10, subTotal, 5"
- the realTotal would have been greater than or equal).

And another trick: use min( rank ) in the same aggregation, and order
by it as well.

Together that gives one a very compact result set that has everything
ranked and grouped perfectly.

....but... a worry: is this REALLY better than doing a nested query? I
wonder if the use of the user-defined aggregate function is very
expensive...

For the benefit of anyone wondering, here's the structure that
works... in one pass in theory... but again, I am now wondering if it
really is faster than the nested query equivalent that hits the
database twice on the table 'stores' (seems to me that the middle
select below is effectively hitting the table again, just in temporary
form):

select rownum as rank, id, days, totals
from
(
select id, stragg(day) as days, stragg(total) as totals,
min( rank ) as innerrank
from
(
select id, day, sum(sales) as total, rank() over (order by
sum(sales) desc) rank
from stores
group by id, rollup(day)
)
group by id
order by min(rank)
)
where rownum <= 25

[Back to original message]

Удаленная работа для программистов • Как заработать на Google AdSense • England, UK • статьи на английском • PHP MySQL CMS Apache Oscommerce • Online Business Knowledge Base • DVD MP3 AVI MP4 players codecs conversion help

Home • Search • Site Map • Set as Homepage • Add to Favourites

Сайт изготовлен в Студии Валентина Петручека —
изготовление и поддержка веб-сайтов, разработка программного обеспечения, поисковая оптимизация