Posted by Erland Sommarskog on 10/31/07 23:03
halftime57@gmail.com (halftime57@gmail.com) writes:
> I have a very large DB (>300GB). Each day I receive a file containing
> an average of 2 million individual delete statements. The delete
> statements all delete based on full primary key so only an @@rowcount
> value of 0 or 1 is possible for each statement. The deletes can be
> from any one of several hundred tables in the database.
>
> Right now I'm just cursoring through that collection of delete
> statements and executing them one at a time as dynamic sql (since I
> get the whole SQL statement I don't know of another way to do it). It
> takes 1-2 hours to complete, depending on other traffic on the host.
> Batching them into a single trx gives me better performance, but I
> lose the ability to know which statement within the transaction failed
> if any in the batch do, or which ones come back with a zero row count
> (i.e., the row wasn't there, which is information I need to capture).
> I keep thinking there's an easy, elegant solution, but it's eluding me
> so I thought I'd ask. My undying gratitude to anyone who can help.
As David said, the best option would be to change the source that emits
this beast.
One thing that worries me when you talk about transactions is that some
deletes may be related. Say for instance that there are DELETE statements
to delete an old order and its order lines. What happens if for some reason
only some of the order lines are deleted?
If the process that generates all these statements cannot be changed,
the only way out I can see is to parse them and build statements
that delete many rows at a time. Then again, if you need to capture
that some DELETEs did not hit any rows, you would also have to
rewrite them into SELECTs to see which rows actually are in
the database. Depending on what the file looks like, there could be
a considerable development cost for this.
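Just to illustrate the idea (the names #keys_to_delete, dbo.Orders and
OrderID are made up for the example), assume the keys for one target table
have been parsed out of the file into a staging table. You can then find
the missing rows with one SELECT and remove the rest with one set-based
DELETE:

   -- Hypothetical staging table holding the keys parsed out of the file
   -- for one target table.
   CREATE TABLE #keys_to_delete (OrderID int NOT NULL PRIMARY KEY);

   -- ... load #keys_to_delete from the parsed file (BULK INSERT or similar) ...

   -- Keys that are not in the target table; these are the ones that would
   -- have given @@rowcount = 0 as single-row deletes.
   SELECT k.OrderID
   FROM   #keys_to_delete k
   WHERE  NOT EXISTS (SELECT * FROM dbo.Orders o
                      WHERE  o.OrderID = k.OrderID);

   -- One set-based delete instead of thousands of single-row statements.
   DELETE o
   FROM   dbo.Orders o
   JOIN   #keys_to_delete k ON k.OrderID = o.OrderID;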
As for the transaction length, I think you could take a middle path and
delete, say, 100 at a time. But as I think of it, it becomes quite messy.
To keep it simple you could submit one statement at a time, capture the
row count, and trap any error. But the errors can be of different kinds.
If it's an FK violation, you can just chalk up one error and move on.
But if the error is a deadlock, you have lost the previous deletes in the
batch and need to redo them. This can be managed, but the level of
complexity becomes high.
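Here is a rough sketch of the one-statement-at-a-time variant, run under
TRY/CATCH so you can tell deadlocks from other errors. The work tables
#delete_statements and #results are made-up names; in reality you would
load the statements from the daily file:

   CREATE TABLE #delete_statements (id   int IDENTITY PRIMARY KEY,
                                    stmt nvarchar(MAX) NOT NULL);
   CREATE TABLE #results (id int NOT NULL, outcome nvarchar(2048) NOT NULL);

   DECLARE @id int, @sql nvarchar(MAX);

   SELECT @id = MIN(id) FROM #delete_statements;
   WHILE @id IS NOT NULL
   BEGIN
      SELECT @sql = stmt FROM #delete_statements WHERE id = @id;
      BEGIN TRY
         EXEC (@sql);
         -- The row wasn't there; capture that fact.
         IF @@rowcount = 0
            INSERT #results (id, outcome) VALUES (@id, 'no row found');
      END TRY
      BEGIN CATCH
         -- Error 1205 is a deadlock. Since each statement runs in its own
         -- transaction here, only this one delete is lost, so log it for a
         -- retry. Other errors (e.g. FK violations) are just recorded.
         INSERT #results (id, outcome)
         VALUES (@id, CASE WHEN error_number() = 1205
                           THEN 'deadlock - retry'
                           ELSE error_message() END);
      END CATCH
      SELECT @id = MIN(id) FROM #delete_statements WHERE id > @id;
   END

If you wrap several deletes in one transaction for speed, the CATCH branch
has to roll back and requeue the whole group instead, which is where the
complexity starts to climb.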
--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx