On 28 Mar 2006 12:06:31 -0800, JayCallas@hotmail.com wrote:
(snip)
>Proper development tactics says to try and do "code reuse".
Hi Jay,
This is not always true when dealing with databases. You have to weigh
the benefit of code reuse (cheaper maintenance) against the benefit of
code duplication (better performance).
In databases where performance matters, I won't hesitate a single second
to duplicate a stored procedure and change it slightly to optimize for
specific situations.
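For example (purely hypothetical - the table, column and procedure names are
invented here), instead of one catch-all search procedure you might keep two
near-identical copies, each with a WHERE clause the optimizer can handle well:

CREATE PROCEDURE dbo.GetOrders_ByCustomer @CustomerID int
AS
  SELECT OrderID, OrderDate, TotalDue
  FROM dbo.Orders
  WHERE CustomerID = @CustomerID;   -- selective seek on CustomerID
GO
CREATE PROCEDURE dbo.GetOrders_ByDateRange @From datetime, @To datetime
AS
  SELECT OrderID, OrderDate, TotalDue
  FROM dbo.Orders
  WHERE OrderDate >= @From AND OrderDate < @To;   -- range scan on OrderDate
GO

Each copy gets its own execution plan, tuned for its own search argument.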
>* Different data from the same row needs to be inserted into multiple
>tables. "Common sense" (maybe "gut instinct" is better) says to handle
>each row as a "unit". Seems weird to process the entire set for one
>table, then to process the entire set AGAIN for another table, and then
>YET AGAIN for a third table, and so on.
I agree with Robert's reaction about the validity of such a design. But
if we assume that this really is valid, then you'll probably find that the
second, third, etc. access to the data is faster than the first, because
the data can all be fetched from cache and no physical disk I/O is needed.
Of course, you can always test both versions against each other - I
think that you'll be hard-pressed to find a scenario where using a
cursor outperforms using the same SELECT statement two or three times in
a row.
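To illustrate (staging and target table names are made up for this example),
the set-based version is just one INSERT ... SELECT per target table:

INSERT INTO dbo.Customers (CustomerID, CustomerName)
SELECT CustomerID, CustomerName
FROM dbo.StagingRows;

INSERT INTO dbo.Addresses (CustomerID, Street, City)
SELECT CustomerID, Street, City
FROM dbo.StagingRows;   -- second pass; the pages are usually still in cache

INSERT INTO dbo.Phones (CustomerID, Phone)
SELECT CustomerID, Phone
FROM dbo.StagingRows;   -- third pass, same story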
>* Exception handling. Set based processing means that if one row fails
>the entire set fails. Looping through allows you to fail a row but
>allow everything else to be processed properly. It also allows you to
>gather statistics. (How many failed, how many worked, how many were
>skipped, etc.)
In an RDBMS, transactions should be atomic (the A in the ACID properties
of transactions) - they either succeed as a whole, or they fail as a
whole.
If you need to exclude rows that would violate a constraint, check the
constraint in the WHERE clause.
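Something like this (again with hypothetical table and column names):

INSERT INTO dbo.Orders (OrderID, CustomerID, Quantity)
SELECT s.OrderID, s.CustomerID, s.Quantity
FROM dbo.StagingRows AS s
WHERE s.Quantity > 0                                -- mirrors the CHECK constraint
  AND EXISTS (SELECT *
              FROM dbo.Customers AS c
              WHERE c.CustomerID = s.CustomerID);   -- mirrors the FK constraint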
> The alternative is to create a temporary table (sandbox
>or workspace type thing), copy the data to there along with "status" or
>"valdation" columns, run through the set many times over looking for
>any rows that may fail, marking them as such,
No need to loop over the data many times. In most cases, you only need a
single UPDATE with a CASE to check the various constraints and set the
"status" column accordingly.
> and then at the end only
>dealing with those rows which "passed" the testing. Of course, in order
>for this to work you must know (and duplicate) all constraints so you
>know what to look for in your testing.
Yes. You need to know your constraints.
Isn't that a normal part of your job?
--
Hugo Kornelis, SQL Server MVP