Reply to Re: Increasingly greater amounts of redundant aggregate data in queries? — MsSQL Server

Posted by Erland Sommarskog on 09/04/07 21:49

(csimam@gmail.com) writes:
>> The problem is that I ultimately only want the top 25 stores in total
>> sales BUT with their subtotals as well... So I used ROLLUP to generate
>> the totals and subtotals... and in the same query I generated a
>> ranking of the data... and there I found my problem: the ranking does
>> not distinguish between totals and subtotals. So the results look
>> like:
>>
>> store1, realTotal, 1
>> store1, subTotal, 2
>> store2, realTotal, 3
>> store7, realTotal, 4
>> store10, subTotal, 5
>>
>> Clearly, the rows with subtotals should NOT be ranked... but I am
>> struggling with SQL to figure out how to get exactly what I want.

An interesting challenge to do in one query and only hitting the table
once, and I was not able to think of a solution without using the
imaginary Previous() function that we have discussed internally among
us MVPs.

You can easily do in one query, if you hit the table twice, simply
one query with GROUP BY store and one with GROUP BY store, month.
Rank the first and join.

And it's possible to hit the table once, if you first get the
monthly aggregations into a temp table.

But in a single query? I don't think it's possible is you want the
format:

Store1, Grand Total, 1000000
Store1, Jan 2000, 12000
Store1, Feb 2000, 10000,
Store2, Grand Total, ....

But I will have to see if I get Itzik Ben-Gan to tackle this one.

> Well... I have a solution. But I'm not sure about it. It involves
> something I learned about only recently: a user-defined aggregate
> function that concatenates values from across rows. Using this
> aggregate function, stragg, I can do another pass through the results
> mentioned above while grouping by store, and produce:
>
> store1, 'realTotal subTotal', 1
> store2, 'realTotal subTotal', 3
> store7, 'realTotal subTotal', 4
> store10, 'realTotal subTotal', 5

One problem is that a UDA is limited to 8000 characters, so if there
are many subtotals, you query will bomb.

> ...but... a worry: is this REALLY better than doing a nested query? I
> wonder if the use of the user-defined aggregate function is very
> expensive...

The way to find out is to run both queries a couple of times on an
idle server. Just don't forget about the effects of caching. Either
first run the queries once, so that you know that table is entirely
in cache. If it's too big for that, run DBCC DROPCLEANBUFFERS between
each run, so that both queries run under the same condition.

--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

[Back to original message]

Удаленная работа для программистов • Как заработать на Google AdSense • England, UK • статьи на английском • PHP MySQL CMS Apache Oscommerce • Online Business Knowledge Base • DVD MP3 AVI MP4 players codecs conversion help

Home • Search • Site Map • Set as Homepage • Add to Favourites

Сайт изготовлен в Студии Валентина Петручека —
изготовление и поддержка веб-сайтов, разработка программного обеспечения, поисковая оптимизация