|
Posted by Wes Groleau on 02/02/07 23:09
I was doing update statements in SQL Server 2000.
I have a table with over 16 million rows.
It came from several hundred delimited text files,
and two of the columns are file ID (int) and Line # (int)
Structure is X12 (835). For those unfamiliar with that,
each file has one to many BPR lines; each BPR line has
zero to many CLP lines, each of those has zero to many
SVC lines, each of those has zero to many CAS lines.
Working with this through the Enterprise Manager MMC,
a lot of things I tried got timeouts.
So, I indexed File ID, Line number, and line type, and
created a new table containing only the columns I knew
I would need in the final output--selected fields from
some of the line types mentioned, plus the line numbers
and common file ID for those rows.
I indexed every column in that table that I thought I might
search on.
I loaded it with 31 thousand rows using a select on a
subset of the CAS rows. That took far less than a minute.
I updated each row with the highest BPR line number not higher
than the CASE line number. About a minute. Not bad, with having
the worst case number of comparisons being 16 million times 31 thousand.
Of course, the indexing should help plus it can be narrowed down by
the "same file" and BPR # < CAS # criteria.
But the next update should theoretically be faster: each row now has
a BPR # and a CAS # and I am telling it to find the highest CLP number
BETWEEN those two. So it should have a MUCH smaller set of to search
through. Yet it thinks for about five minutes and then announces a timeout.
Any suggestions?
--
Wes Groleau
Measure with a micrometer, mark with chalk, and cut with an axe.
[Back to original message]
|