|  | Posted by Erland Sommarskog on 09/04/07 21:49 
(csimam@gmail.com) writes:>> The problem is that I ultimately only want the top 25 stores in total
 >> sales BUT with their subtotals as well... So I used ROLLUP to generate
 >> the totals and subtotals... and in the same query I generated a
 >> ranking of the data... and there I found my problem: the ranking does
 >> not distinguish between totals and subtotals. So the results look
 >> like:
 >>
 >> store1, realTotal, 1
 >> store1, subTotal, 2
 >> store2, realTotal, 3
 >> store7, realTotal, 4
 >> store10, subTotal, 5
 >>
 >> Clearly, the rows with subtotals should NOT be ranked... but I am
 >> struggling with SQL to figure out how to get exactly what I want.
 
 An interesting challenge to do in one query and only hitting the table
 once, and I was not able to think of a solution without using the
 imaginary Previous() function that we have discussed internally among
 us MVPs.
 
 You can easily do in one query, if you hit the table twice, simply
 one query with GROUP BY store and one with GROUP BY store, month.
 Rank the first and join.
 
 And it's possible to hit the table once, if you first get the
 monthly aggregations into a temp table.
 
 But in a single query? I don't think it's possible is you want the
 format:
 
 Store1, Grand Total, 1000000
 Store1, Jan 2000, 12000
 Store1, Feb 2000, 10000,
 Store2, Grand Total, ....
 
 But I will have to see if I get Itzik Ben-Gan to tackle this one.
 
 > Well... I have a solution. But I'm not sure about it. It involves
 > something I learned about only recently: a user-defined aggregate
 > function that concatenates values from across rows. Using this
 > aggregate function, stragg, I can do another pass through the results
 > mentioned above while grouping by store, and produce:
 >
 > store1, 'realTotal subTotal', 1
 > store2, 'realTotal subTotal', 3
 > store7, 'realTotal subTotal', 4
 > store10, 'realTotal subTotal', 5
 
 One problem is that a UDA is limited to 8000 characters, so if there
 are many subtotals, you query will bomb.
 
 > ...but... a worry: is this REALLY better than doing a nested query? I
 > wonder if the use of the user-defined aggregate function is very
 > expensive...
 
 The way to find out is to run both queries a couple of times on an
 idle server. Just don't forget about the effects of caching. Either
 first run the queries once, so that you know that table is entirely
 in cache. If it's too big for that, run DBCC DROPCLEANBUFFERS between
 each run, so that both queries run under the same condition.
 
 
 --
 Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
 
 Books Online for SQL Server 2005 at
 http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
 Books Online for SQL Server 2000 at
 http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
 [Back to original message] |