|
Posted by Stu on 10/02/00 11:45
I'm reluctant to give too many details about our structure for fear
that I'd be compromising our business model; I can say that our primary
application is a data warehouse structure that involves several million
rows of data per day. Data is loaded in small batches (every minute,
24 X7), and we do our own pattern seeking against the new data (kind of
a home-grown analysis services).
Our batches can range in size from 1 row to 20,000 rows, depending on
the time of day, and the nature of the data. We host both a raw
database (involving a very verbose but simple OLTP structure) and the
data warehouse on the same box.
Some of the things we do to lessen the bottleneck on the server
include:
1. distribute the ETL process as much as possible. We have several
little bots that run on various servers that handle loading, analysis,
and grouping off the main server. We used to use DTS quite heavily;
we're transitioning away from that.
2. Use appropriate locking hints. We write all sorts of code involving
NOLOCK and temp tables to prefetch the data.
3. Make the most of our physical structure. We use filegroups to
seperate indexes from tables, and have isolated our busiest databases
from each other on seperate drive arrays.
4. We always cluster on monotonically-increasing values (such as dates
or date representation) so that page splits are minimal. We also use
partitioned views (although they are a bit of a bear to maintain).
HTH,Stu
Navigation:
[Reply to this message]
|