|
Posted by Steve on 02/09/07 01:27
On Feb 8, 4:05 pm, "Merennulli" <mar...@sdf.lonestar.org> wrote:
> To start with, I'll give a simplified overview of my data.
>
> BaseRecord (4mil rows, 25k in each Region)
> ID | Name | Region | etc
>
> OtherData (7.5mil rows, 1 or 2 per ID)
> ID | Type(1/2) | Data
>
> ProblemTable (4mil rows)
> ID | ConcatenatedHistory
>
> The concatenated history field is a nvarchar with up to 20 different
> pipe delimited date/code combinations, eg. '01/01/2007X|11/28/2006Q|
> 11/12/2004Q|'
>
> Using left outer joins (all from base, the rest optional) I've got a
> view something like:
>
> View (4mil rows)
> ID | Name | Region | etc | Data | Data2 | ConcatenatedHistory
>
> Querying it, it takes about 15-20 seconds to do this:
> Select ID, Name, Region, etc, Data, Data2, ConcatenatedHistory>From View
>
> Where Region = 58
> and ConcatenatedHistory like '%11/28/2006%' or ConcatenatedHistory
> like '%2007X%' ;
>
> Or to do this:
> Select ID, Name, Region, etc, Data, Data2, ConcatenatedHistory>From View
>
> Where Region = 58;
>
> But this takes over a minute:
> Select ID, Name, Region, etc, Data, Data2, ConcatenatedHistory>From View
>
> Where Region = 58
> and ConcatenatedHistory like '%test%' ;
>
> What puzzles me most is that it's taking longer to return nothing.
>
> I've tried normalizing this concatenated field into it's own table, or
> into 20 and 40 denormalized fields. The denormalized fields were
> nightmarishly long queries for a web interface at 5-6 minutes.
>
> The normalized table should have roughly 25mil records, but cutting it
> down to just the most relevant years let me play with it at 9.5mil
> records. This shifted the results to where it took 35-40 seconds to do
> ANY query against that table.
>
> Select View.ID, View.Name, View.Region, View.etc, View.Data,
> View.Data2, History.Date, History.Code>From View inner join History on History.ID = View.ID
>
> Where View.Region = 58
> and History.Date = '11/28/2006' or History.Code = '2007X';
>
> I also tried reducing this table down to a linking table between ID
> and Code, and pushing the date off to another table, but that only
> made things worse.
>
> ~~~
> Going back to what worked best (the intial View), the Execution Plan
> shows 93% on a Clustered Index Scan on the ID field's index in the
> ConcatenatedHistory table for the problem query, but spreads out the
> load fairly evenly among indexes on the successful query. I'm trying
> to figure out a way to improve performance, and more importantly, make
> "0 records found" responses be a bit more forthcoming.
>
> If it's relevant, I'm on SQL Server 2005 Standard, and I've already
> taken care of the memory, CPU and drive optimization.
do you query the concatenated data, "normalized", in a way that you
could use FULL TEXT ?
Navigation:
[Reply to this message]
|