| 
 | 
	Query fun and games 
		Date: 02/23/12 
		(SQL Server)    Keywords: xml, sql
 
I've found in general for SQL that there is more than one way to solve (almost) any problem.  I've been playing around with query building today and decided to see how many ways I could solve a problem that recurs fairly frequently in my work, flattening subrecords into a single row.
  This is my current standard solution, using the PIVOT function.  It's quite fast, but limits you to a specific number of subrecords--it can be a high number, but you still have to decide on a maximum.
  WITH cte AS (SELECT Person.contactid AS 'ID' , Person.FullName AS 'Name' 	, 'Activity' = Activity.a422_rel_activityvalueidname 	, 'Row' = ROW_NUMBER() OVER (PARTITION BY Person.contactid, Person.FullName ORDER BY Activity.a422_rel_activityvalueidname) 	FROM Contact AS Person 		INNER JOIN Task AS Activity ON Person.contactid = Activity.regardingobjectid) SELECT ID, Name 	, 'Activity1' = [1], 'Activity2' = [2], 'Activity3' = [3], 'Activity4' = [4], 'Activity5' = [5] FROM cte PIVOT (MAX(cte.Activity) FOR cte.[Row] IN ([1], [2], [3], [4], [5])) AS pvt
 
  This is a new solution I found in surfing some SQL Server blogs, using FOR XML PATH to create a CSV list of values.  It will include an indefinite number of subrecords, but only includes one field from the subrecords.  It's significantly slower than the first example by at least an order of ten.
 SELECT DISTINCT p.contactid AS 'ID' , p.FullName AS 'Name' 	, SUBSTRING((SELECT ', ' + Activity.a422_rel_activityvalueidname  		FROM task AS Activity  		WHERE Activity.regardingobjectid = p.contactid  		FOR XML PATH('')), 2, 4000) AS 'Activities' FROM Contact AS p 	INNER JOIN Task AS t ON p.contactid = t.regardingobjectid ORDER BY p.contactid
 
  This ugly looking creature is what I used to use before PIVOT came along, using many, many multiple self-joins.  I'm pretty sure I had a slightly more elegant (and faster!) version of this, but it's been a long time since I've had to create one of these things (fortunately).  The performance is...not as bad as you might expect.
 SELECT 'ID' = p.contactid, 'Name' = p.fullname 	, 'Activity1' = a1.a422_rel_activityvalueidname 	, 'ActivityDate1' = a1.actualend 	, 'Activity2' = a2.a422_rel_activityvalueidname 	, 'ActivityDate2' = a2.actualend 	, 'Activity3' = a3.a422_rel_activityvalueidname 	, 'ActivityDate3' = a3.actualend 	, 'Activity4' = a4.a422_rel_activityvalueidname 	, 'ActivityDate4' = a4.actualend 	, 'Activity5' = a5.a422_rel_activityvalueidname 	, 'ActivityDate5' = a5.actualend FROM Contact AS p 	INNER JOIN Task AS a1  		ON p.contactid = a1.regardingobjectid 	LEFT JOIN Task AS not1  		ON p.contactid = not1.regardingobjectid  		AND not1.activityid < a1.activityid 	LEFT JOIN Task AS a2 		ON p.contactid = a2.regardingobjectid 		AND a2.activityid > a1.activityid 	LEFT JOIN Task AS not2  		ON p.contactid = not2.regardingobjectid  		AND not2.activityid > a1.activityid 		AND not2.activityid < a2.activityid 	LEFT JOIN Task AS a3 		ON p.contactid = a3.regardingobjectid 		AND a3.activityid > a2.activityid 	LEFT JOIN Task AS not3  		ON p.contactid = not3.regardingobjectid  		AND not3.activityid > a2.activityid 		AND not3.activityid < a3.activityid 	LEFT JOIN Task AS a4 		ON p.contactid = a4.regardingobjectid 		AND a4.activityid > a3.activityid 	LEFT JOIN Task AS not4  		ON p.contactid = not4.regardingobjectid  		AND not4.activityid > a3.activityid 		AND not4.activityid < a4.activityid 	LEFT JOIN Task AS a5 		ON p.contactid = a5.regardingobjectid 		AND a5.activityid > a4.activityid 	LEFT JOIN Task AS not5  		ON p.contactid = not5.regardingobjectid  		AND not5.activityid > a4.activityid 		AND not5.activityid < a5.activityid WHERE not1.regardingobjectid Is Null 	AND not2.regardingobjectid Is Null 	AND not3.regardingobjectid Is Null 	AND not4.regardingobjectid Is Null 	AND not5.regardingobjectid Is Null
 
  Using a recursive CTE almost works, except that for each main record it gives a row with one subrecord, another row with two subrecords, a row with three subrecords, and so on for as many subrecords as are available for that main record.  It seems like there has to be a way around that, so if you have any ideas, let me know.  Performance is not good, not horrible.
 WITH cte AS (SELECT a1.regardingobjectid, a1.activityid 		, 'Activities' = CONVERT(nvarchar(1000), a1.createdon, 113) 	FROM Task AS a1 		INNER JOIN Contact AS p  		ON a1.regardingobjectid = p.contactid 		LEFT JOIN Task AS not1 		ON a1.regardingobjectid = not1.regardingobjectid 		AND a1.activityid > not1.activityid 	WHERE not1.activityid Is Null 	UNION ALL 		SELECT cte.regardingobjectid, a1.activityid 		, 'Activities' = CONVERT(nvarchar(1000), (cte.Activities + N', ' + CONVERT(nvarchar, a1.createdon, 113))) 		FROM cte  			INNER JOIN Task AS a1  			ON cte.regardingobjectid = a1.regardingobjectid 			AND cte.activityid < a1.activityid 		WHERE NOT EXISTS (SELECT *  			FROM Task AS not1 			WHERE cte.regardingobjectid = not1.regardingobjectid 			AND not1.activityid > cte.activityid  			AND not1.activityid < a1.activityid) 	) SELECT 'ID' = p.contactid, 'Name' = p.fullname 	, cte.Activities FROM cte 	INNER JOIN Contact AS p  	ON cte.regardingobjectid = p.contactid ORDER BY p.fullname
 
  Creating a custom aggregate function in CLR is another solution, but playing with that will have to be another day.
Source: https://sqlserver.livejournal.com/76055.html  	
 |