|
Posted by jim_geissman on 02/24/06 21:29
I have some queries that involve subqueries to select the appropriate
record from a bunch of candidates. For example the following, which
selects the most recent transaction for a given customer:
CREATE TABLE #Customer (CustID int, OtherInfo int)
INSERT #Customer SELECT 1,7
INSERT #Customer SELECT 2,8
INSERT #Customer SELECT 3,9
CREATE TABLE #Event (CustID int, EventID int, EventAmt int, EventDt
smalldatetime)
INSERT #Event SELECT 1,1,1000,'20060224'
INSERT #Event SELECT 2,1,2000,'20060224'
INSERT #Event SELECT 3,2,3000,'20060224'
INSERT #Event SELECT 3,1,5000,'20060225'
SELECT c.CustID,c.OtherInfo ,e.EventAmt,e.EventDt
FROM #Customer c JOIN #Event e ON c.CustID=e.CustID
WHERE EventDt = (SELECT MAX(EventDt) FROM #Event WHERE CustID=c.CustID)
ORDER BY c.CustID
Over millions of customers and events, this takes forever. Creating a
temp table which identifies the appropriate dates, and then joining to
that, speeds things up considerably -- the following two queries
together take a lot less time the the one above.
CREATE TABLE #Temp (CustID int,EventDt smalldatetime)
INSERT #Temp SELECT CustID,MAX(EventDt)FROM #Event GROUP BY CustID
SELECT c.CustID,c.OtherInfo ,e.EventAmt,e.EventDt
FROM #Customer c
JOIN #Temp t ON t.CustID=c.CustID
JOIN #Event e ON c.CustID=e.CustID AND t.EventDt=e.EventDt
This seems pretty simple, and I would assume the query planner should
be able to figure it out without assistance. Is there a better way to
forumulate the original query? BTW, as far as I can tell the indexes
are appropriate for these queries; however the subqueries seem to be
done one at a time.
[Back to original message]
|