Posted by Erland Sommarskog on 07/17/07 21:37
M Bourgon (bourgon@gmail.com) writes:
> Sure. We have a large table that constantly has new records added. We
> want to be able to go back historically and pull from this data set.
> The way we currently do it is to keep several days (the "hot" data)
> available in one table (whilst auto-archiving one day's data out to a
> table on a daily basis) several weeks available by a partitioned view,
> and archive the older tables. This way, when we get requests for
> older data, we can easily look in a set of tables for the results. We
> split it up by day due to volume, and also because most of our
> requests are "this subset of data, from date A to date B, further
> filtered".
Maybe a very naïve and silly question, but what about a clustered index
on the big table? If you have a date range and clustered index to match
that range, it's fairly irrelevant if the table has 500 million rows.
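As a sketch, with made-up table and column names (dbo.bigtable, logdate, subset_id are illustrations only, not your actual schema), the idea is along these lines:

```sql
-- Hypothetical names. With the date leading the clustered index,
-- a date-range query becomes a partial scan of the clustered index
-- rather than a scan of all 500 million rows.
CREATE CLUSTERED INDEX cix_bigtable_logdate
   ON dbo.bigtable (logdate)

-- A typical "date A to date B, further filtered" query it serves:
SELECT *
FROM   dbo.bigtable
WHERE  logdate >= '20070701' AND logdate < '20070708'
  AND  subset_id = 17
```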
The major reasons to partition a table I know of are:
1) Being able to quickly drop old data or add new data, by shifting
a table out or in.
2) Spread the load over different file groups.
But it sounds like in your case you keep the data, so that reason to
partition is out.
> We've discussed different ways of keeping the data available, as what
> takes up the least amount of space is not necessarily the easiest to
> query. I like the idea of keeping the data in a database because then
> we can easily query it, and we don't have to worry about any issues
> (i.e. if we save it out, and the table format changes, we don't run
> into issues months from now when we try to load the data).
But if you keep the data in the database, why then BCP?
> One question you'll probably ask - how do you create the one-days-
> worth-of-data table? Unfortunately, that's just a dumb script also,
> destined to break if we wind up changing the format of the table.
I don't see that as much of a problem. I would not expect frequent schema
changes to a table of this size, and having one extra script to update
when you actually do change it is not that big a deal. Although for a
plain copy, you could use SELECT INTO. That would not give you the
constraints, triggers and indexes, though.
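For a plain copy, a SELECT INTO along these lines (again, table and column names are invented for illustration) creates the daily table with the source's column definitions, but without its constraints, triggers and indexes:

```sql
-- Hypothetical names; copies one day's rows into a new daily table.
-- SELECT INTO creates dbo.bigtable_20070716 from the column
-- definitions of dbo.bigtable, so the script survives schema changes,
-- but constraints, triggers and indexes are not copied.
SELECT *
INTO   dbo.bigtable_20070716
FROM   dbo.bigtable
WHERE  logdate >= '20070716' AND logdate < '20070717'
```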
What I am a little more curious about is what happens to all those daily
tables that all of a sudden have an obsolete definition.
>> If the main purpose is that importing the BCP file in native format
>> will work, maybe it is sufficient to save the format file with the
>> table? You can create a format file from BCP with the format option.
>> (You use "format" in place of "in" or "out".)
>
> Can the format file be used instead of DDL? I'll have to try it.
No, the format file has nothing to do with DDL. But my thinking was that
if you save the BCP file and then want to import it three months later,
the format file describes the layout of the data file. Assuming that you
only add new columns at the end and never drop any, it would import out
of the box. Otherwise you would have to edit the column mapping in the
format file.
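To give the idea, creating a format file alongside the export and reusing it on import could look like this (server, database, table and file names are all made up):

```
rem Export one daily table in native format.
bcp MyDB.dbo.bigtable_20070716 out daily.bcp -n -S MYSERVER -T

rem Create a matching format file ("format" in place of "in"/"out").
bcp MyDB.dbo.bigtable_20070716 format nul -n -f daily.fmt -S MYSERVER -T

rem Months later, import using the saved format file, even if the
rem target table has since gained columns at the end.
bcp MyDB.dbo.bigtable_20070716 in daily.bcp -f daily.fmt -S MYSERVER -T
```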
If you want to script the table, then you are probably best off with DMO
on SQL 2000 and SMO on SQL 2005. I have not worked with either, so I
can't help there.
--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx