|
Posted by Merennulli on 02/09/07 00:05
To start with, I'll give a simplified overview of my data.
BaseRecord (4mil rows, 25k in each Region)
ID | Name | Region | etc
OtherData (7.5mil rows, 1 or 2 per ID)
ID | Type(1/2) | Data
ProblemTable (4mil rows)
ID | ConcatenatedHistory
The concatenated history field is a nvarchar with up to 20 different
pipe delimited date/code combinations, eg. '01/01/2007X|11/28/2006Q|
11/12/2004Q|'
Using left outer joins (all from base, the rest optional) I've got a
view something like:
View (4mil rows)
ID | Name | Region | etc | Data | Data2 | ConcatenatedHistory
Querying it, it takes about 15-20 seconds to do this:
Select ID, Name, Region, etc, Data, Data2, ConcatenatedHistory
>From View
Where Region = 58
and ConcatenatedHistory like '%11/28/2006%' or ConcatenatedHistory
like '%2007X%' ;
Or to do this:
Select ID, Name, Region, etc, Data, Data2, ConcatenatedHistory
>From View
Where Region = 58;
But this takes over a minute:
Select ID, Name, Region, etc, Data, Data2, ConcatenatedHistory
>From View
Where Region = 58
and ConcatenatedHistory like '%test%' ;
What puzzles me most is that it's taking longer to return nothing.
I've tried normalizing this concatenated field into it's own table, or
into 20 and 40 denormalized fields. The denormalized fields were
nightmarishly long queries for a web interface at 5-6 minutes.
The normalized table should have roughly 25mil records, but cutting it
down to just the most relevant years let me play with it at 9.5mil
records. This shifted the results to where it took 35-40 seconds to do
ANY query against that table.
Select View.ID, View.Name, View.Region, View.etc, View.Data,
View.Data2, History.Date, History.Code
>From View inner join History on History.ID = View.ID
Where View.Region = 58
and History.Date = '11/28/2006' or History.Code = '2007X';
I also tried reducing this table down to a linking table between ID
and Code, and pushing the date off to another table, but that only
made things worse.
~~~
Going back to what worked best (the intial View), the Execution Plan
shows 93% on a Clustered Index Scan on the ID field's index in the
ConcatenatedHistory table for the problem query, but spreads out the
load fairly evenly among indexes on the successful query. I'm trying
to figure out a way to improve performance, and more importantly, make
"0 records found" responses be a bit more forthcoming.
If it's relevant, I'm on SQL Server 2005 Standard, and I've already
taken care of the memory, CPU and drive optimization.
[Back to original message]
|