|
Posted by Alex Kuznetsov on 07/23/07 22:22
On Jul 23, 9:03 am, "bobdu...@gmail.com" <bobdu...@gmail.com> wrote:
> On Jul 20, 5:50 pm, Erland Sommarskog <esq...@sommarskog.se> wrote:
>
>
>
> > bobdu...@gmail.com (bobdu...@gmail.com) writes:
> > > Alex, i will go back and re-read those articles, but we've implemented
> > > as much as we can from them. As for the clustering of the tables
> > > based on the timestamps, thats something i'm totally unfamiliar
> > > with!!! Our schema is created straight from java with as much
> > > database agnostic code as possible, could you perhaps hint at how to
> > > do this for a noob like me?
>
> > CREATE CLUSTERED INDEX CreateDateIX ON loads(CreateDate)
>
> > and change this to:
>
> > LoadID BIGINT NOT NULL IDENTITY PRIMARY KEY NONCLUSTERED,
>
> > > The insert looks likes this:
> > > exec sp_executesql N'INSERT INTO loads (Type, CreateDate, OtherDate,
> > > ObjectID) VALUES ( @P0 , @P1 , @P2 , @P3 )',N'@P0 int,@P1
> > > datetime,@P2 datetime,@P3 int',2,''2007-07-20
> > > 09:38:15:477'',NULL,10
>
> > And CreateDate is typically now?
>
> > > Heres a purge:
> > > declare @p1 int
> > > set @p1=2
> > > exec sp_prepare @p1 output,N'',N'DELETE FROM loads WHERE CreateDate <
> > > dateadd(SECOND, -40, CURRENT_TIMESTAMP)',1
> > > select @p1
>
> > 40 seconds? Egads, why do you save the data at all? :-)
>
> > There is no index at all on CreateDate, so it will have to scan the
> > full table. It's surprising that you need that many concurrent processes
> > for it to happen.
>
> > If CreateDate is the time for insertion and never changed before the
> > purge, it would be a good idea to cluster on that column. That would
> > minimize the point of conflict between the DELETE and the INSERT (but
> > the non-clustered indexes could still be a conflict area.)
>
> > But the SELECT COUNT(*) and the "normal" SELECT is still problematic.
> > Particularly the normal SELECT looks like it would table scan to me
> > since the WHERE clause is "Type >= @P0". But that depends on the
> > values actually used and the distribution in the table.
>
> > You could avoid this if you use snapshot isolation, but that is only
> > available to SQL 2005.
>
> > I don't really know what Alex meant that they set their deadlock priority
> > to high. What I meant is that the purging process sould have SET
> > DEADLOCK_PRIORITY LOW to declare itself a voluntary victim. In any
> > case, on SQL 2000 the only deadlock priorties are LOW and NORMAL. SQL
> > 2005 also has HIGH and numbered priorities.
>
> > --
> > Erland Sommarskog, SQL Server MVP, esq...@sommarskog.se
>
> > Books Online for SQL Server 2005 athttp://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books...
> > Books Online for SQL Server 2000 athttp://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
>
> Thanks again for the thoughts, a few followups:
> - CreateDate is typically the time the record was created, so now
> yes.
> - 40 seconds is just a sample, the date typically goes between 2 hours
> and 2 weeks after the record was created.
> - Sorry again, the type >= is just a way of getting more data, those
> queries are typically = based, and theres never more than 20 unique
> values for type.
>
> I probably should've used this analogy before, but consider these
> tables like "audit" tables. We simply record lots of audits, do funky
> searches on these audits, and then purge these audits at a given
> schedule when they're no longer necessary.
>
> We can try the clustered index, but i'm NOT convinced why this is
> going to resolve the page lock issue... i've checked the execution
> plan for some of our queries and it looks like it sorts first, then
> does a clustered index scan of the ID. Is this really going to have
> more page locks than if the index was on the date?
>
> About the deadlock priority - i don't think it would be appropriate if
> the purge job was the one thats failing, mainly because if it starts
> getting behind in its work it will probably just have more and more
> difficulty purging the future... the guys doing the inserts/selects
> while important, can fail and will retry again later, but its ugly.
> I'm (obviously) no dba and am finding it difficult that sql server can
> kill clients because it's discovered a deadlock, when just looking at
> the competing sql could never lead me to that conclusion.
>
> One final thing that i don't understand at the db level is transaciton
> models - using jdbc we're normally using READ_COMMITTED - we could use
> SERIALIZABLE but i'm not sure how either of these translates at the db
> level. If someone has a pointer to this info with respect to SQL
> server (and/or oracle) i'd love to hear it.
>
> Thanks a million again!!!
I described how snapshot isolation affects data purging here:
http://www.devx.com/dbzone/Article/32957
Alex Kuznetsov, SQL Server MVP
http://sqlserver-tips.blogspot.com/
Navigation:
[Reply to this message]
|