Posted by Ryan on 06/13/06 08:56
It's 130 basic indexes across all of the tables, some of which are used
in the SP. I examined all of the tables to work out the minimum; from
there we can start working on a proper strategy. Example table:
if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[DEALER_SOURCE_DATA_VALUES]')
    and OBJECTPROPERTY(id, N'IsUserTable') = 1)
    drop table [dbo].[DEALER_SOURCE_DATA_VALUES]
GO

CREATE TABLE [dbo].[DEALER_SOURCE_DATA_VALUES] (
    [DEALER_SOURCE_DATA_ID] [int] NOT NULL ,
    [FIELD_CODE] [varchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
    [FIELD_VALUE] [numeric](15, 5) NOT NULL
) ON [PRIMARY]
GO

ALTER TABLE [dbo].[DEALER_SOURCE_DATA_VALUES] WITH NOCHECK ADD
    CONSTRAINT [PK_DEALER_SOURCE_DATA_VALUES] PRIMARY KEY CLUSTERED
    (
        [DEALER_SOURCE_DATA_ID],
        [FIELD_CODE]
    ) WITH FILLFACTOR = 90 ON [PRIMARY]
GO
This table has approx 25 million rows. I would have added separate
indexes on Dealer_Source_Data_Id and Field_Code. The table holds the
data ID, which relates back to a row in another table holding the year,
month and Dealer ID. It also holds all the lines and the value against
each line. Quite often we need to query everything for a given
Dealer_Source_Data_Id, hence the first index I would add. We also want
to sum (for example) all values for a specific Field_Code, hence the
other index I would add.
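Roughly what I have in mind is something like this (the index names are
just my own placeholders, and the first one may turn out to be redundant
since Dealer_Source_Data_Id is already the leading column of the
clustered PK):

    -- Non-clustered index for queries by Dealer_Source_Data_Id alone
    CREATE NONCLUSTERED INDEX [IX_DSDV_DEALER_SOURCE_DATA_ID]
        ON [dbo].[DEALER_SOURCE_DATA_VALUES] ([DEALER_SOURCE_DATA_ID])
    GO

    -- Non-clustered index for sums/filters by Field_Code alone
    CREATE NONCLUSTERED INDEX [IX_DSDV_FIELD_CODE]
        ON [dbo].[DEALER_SOURCE_DATA_VALUES] ([FIELD_CODE])
    GO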
We would normally have approx 4,000 unique Field_Codes per
Dealer_Source_Data_Id.
The PK is correct, but not much good for the querying we need to do, as
it would only be used when we specify both Dealer_Source_Data_Id and
Field_Code in order to return a value. Correct? Anything broader would
force a table scan, if I'm right. Proper indexing should mean we only
need to touch about 4,000 rows using one index (all the Field_Codes for
one Dealer_Source_Data_Id) and roughly 6,250 rows using the other
(25 million rows / 4,000 codes per ID gives about 6,250
Dealer_Source_Data_Ids per Field_Code).
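For example, a sum over a single Field_Code (the code value below is
made up purely for illustration) currently has nothing useful to seek
on, but with the second index it should only have to touch the ~6,250
matching rows:

    SELECT SUM([FIELD_VALUE])
    FROM [dbo].[DEALER_SOURCE_DATA_VALUES]
    WHERE [FIELD_CODE] = 'GP1010'   -- made-up code, for illustration only
    GO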
Ryan
Stu wrote:
> Hey Ryan,
>
> I can't think of any layman's resources for indexing strategies, but I
> know that Kalen Delaney's book "Inside SQL Server" has some very good
> explanations of how things work under the hood (including indexes);
> might be useful.
>
> However, I am a bit perplexed as to the need for 130+ indexes being
> affected by a single stored procedure; is this because of the redundant
> data structure per client? Can you post some sample table structures?
> It sounds as if the clustered index is actually a poor candidate given
> the volume of data and the amount of time to perform a bulk insert.
>
> Stu
>
>
> Ryan wrote:
> > I have a bit of a problem with regards an indexing strategy. Well,
> > basically there is no indexing strategy on a set of data I have at
> > work. Now, I didn't create the design as I would have allowed for this.
> >
> > OK, so there are primary key (clustered) indexes (mainly composite
> > keys), but no other indexes on the tables. As you would expect, the
> > performance leaves a lot to be desired. A hell of a lot. We have
> > several million rows in a lot of the tables. None of the queries seem
> > to be overly complex so we can work through the applications at a later
> > stage.
> >
> > We have multiple copies (one per client per country) of the same
> > structure (I may have considered combining these to allow better
> > performance). One specific SP that I need to run takes 40+ hours
> > without indexes and 5 hours with some (130) basic indexes to get us
> > started on a better design. These 130 indexes are the minimum I suspect
> > we need and from there, we can start to work out which ones we need.
> >
> > Now the test database (our worst performer) doubles in size to 20Gb,
> > but the performance is much much better (as expected). The original
> > thinking behind the design was for storage concerns (server space
> > recently upgraded) and for performance with bulk inserts.
> >
> > We have a lot of bulk inserts, but I suspect that these are not too
> > bad, and the time taken to maintain the indexes during them is
> > negligible compared to the performance gains. I strongly suspect that
> > we have a considerable number of table scans going on, but the problem
> > is that people here don't understand indexing (yet) or don't have the
> > time (probably because it's all taken up using the poorly designed
> > system). That's a whole separate issue for me to address.
> >
> > So, finally getting round to my questions...
> >
> > Is there any good reference explaining in layman's terms why you need
> > basic (or advanced) indexing? Any links would be appreciated. I need
> > this to help explain to colleagues why a disk space increase and
> > indexing will be far better than spending thousands on a new box and
> > doing the same (a common problem I suspect).
> >
> > How can I accurately estimate the amount of time taken to update an
> > index once data is bulk inserted? I don't want to re-index from scratch
> > as this may take too long. Indexing my database first time round takes
> > about 1 hour 30 minutes.
> >
> > It's all part of an ongoing bit of digging into the system and re-doing
> > it to make it work properly. I'm sure most of you will have been there
> > at some point or another.
> >
> > Thanks
> >
> >
> > Ryan