|
Posted by bobdurie@gmail.com on 07/23/07 14:03
On Jul 20, 5:50 pm, Erland Sommarskog <esq...@sommarskog.se> wrote:
> bobdu...@gmail.com (bobdu...@gmail.com) writes:
> > Alex, i will go back and re-read those articles, but we've implemented
> > as much as we can from them. As for the clustering of the tables
> > based on the timestamps, thats something i'm totally unfamiliar
> > with!!! Our schema is created straight from java with as much
> > database agnostic code as possible, could you perhaps hint at how to
> > do this for a noob like me?
>
> CREATE CLUSTERED INDEX CreateDateIX ON loads(CreateDate)
>
> and change this to:
>
> LoadID BIGINT NOT NULL IDENTITY PRIMARY KEY NONCLUSTERED,
>
> > The insert looks likes this:
> > exec sp_executesql N'INSERT INTO loads (Type, CreateDate, OtherDate,
> > ObjectID) VALUES ( @P0 , @P1 , @P2 , @P3 )',N'@P0 int,@P1
> > datetime,@P2 datetime,@P3 int',2,''2007-07-20
> > 09:38:15:477'',NULL,10
>
> And CreateDate is typically now?
>
> > Heres a purge:
> > declare @p1 int
> > set @p1=2
> > exec sp_prepare @p1 output,N'',N'DELETE FROM loads WHERE CreateDate <
> > dateadd(SECOND, -40, CURRENT_TIMESTAMP)',1
> > select @p1
>
> 40 seconds? Egads, why do you save the data at all? :-)
>
> There is no index at all on CreateDate, so it will have to scan the
> full table. It's surprising that you need that many concurrent processes
> for it to happen.
>
> If CreateDate is the time for insertion and never changed before the
> purge, it would be a good idea to cluster on that column. That would
> minimize the point of conflict between the DELETE and the INSERT (but
> the non-clustered indexes could still be a conflict area.)
>
> But the SELECT COUNT(*) and the "normal" SELECT is still problematic.
> Particularly the normal SELECT looks like it would table scan to me
> since the WHERE clause is "Type >= @P0". But that depends on the
> values actually used and the distribution in the table.
>
> You could avoid this if you use snapshot isolation, but that is only
> available to SQL 2005.
>
> I don't really know what Alex meant that they set their deadlock priority
> to high. What I meant is that the purging process sould have SET
> DEADLOCK_PRIORITY LOW to declare itself a voluntary victim. In any
> case, on SQL 2000 the only deadlock priorties are LOW and NORMAL. SQL
> 2005 also has HIGH and numbered priorities.
>
> --
> Erland Sommarskog, SQL Server MVP, esq...@sommarskog.se
>
> Books Online for SQL Server 2005 athttp://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books...
> Books Online for SQL Server 2000 athttp://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Thanks again for the thoughts, a few followups:
- CreateDate is typically the time the record was created, so now
yes.
- 40 seconds is just a sample, the date typically goes between 2 hours
and 2 weeks after the record was created.
- Sorry again, the type >= is just a way of getting more data, those
queries are typically = based, and theres never more than 20 unique
values for type.
I probably should've used this analogy before, but consider these
tables like "audit" tables. We simply record lots of audits, do funky
searches on these audits, and then purge these audits at a given
schedule when they're no longer necessary.
We can try the clustered index, but i'm NOT convinced why this is
going to resolve the page lock issue... i've checked the execution
plan for some of our queries and it looks like it sorts first, then
does a clustered index scan of the ID. Is this really going to have
more page locks than if the index was on the date?
About the deadlock priority - i don't think it would be appropriate if
the purge job was the one thats failing, mainly because if it starts
getting behind in its work it will probably just have more and more
difficulty purging the future... the guys doing the inserts/selects
while important, can fail and will retry again later, but its ugly.
I'm (obviously) no dba and am finding it difficult that sql server can
kill clients because it's discovered a deadlock, when just looking at
the competing sql could never lead me to that conclusion.
One final thing that i don't understand at the db level is transaciton
models - using jdbc we're normally using READ_COMMITTED - we could use
SERIALIZABLE but i'm not sure how either of these translates at the db
level. If someone has a pointer to this info with respect to SQL
server (and/or oracle) i'd love to hear it.
Thanks a million again!!!
Navigation:
[Reply to this message]
|