|
Posted by mootmail-googlegroups on 07/26/06 20:10
Alvaro G. Vicario wrote:
> I need to write an app that dumps certain fields and tables from LOCAL_DB
> into INET_DB every night, using a DSL connection, so all changes made in
> the office during workday are available at the extranet the following day.
> Both DBMS are different so I'm accessing data through ODBC drivers.
>
We do a similar thing here at my work. We have a variety of data
imports to run and summary tables to build on a nightly basis. Now, as
you indicated, you can get as fancy and efficient as you want by
comparing PK's and such, but we went with the pure brute-force method
you mentioned first. Granted, this is all happening on our local
network, so we weren't concerned with data transfer rates, which could
turn out to be a problem for you depending on the quantity of data you
have to move.
As you mentioned, the obvious problem with the brute force method is
that between dumping the contents of the table and re-inserting, the
table (and anything that references it) is useless. We got around this
by using temporary tables to do the processing, then moving them into
place seamlessly when finished.
For example (in psuedo-mysql, so you'll need to adjust for your DB),
assuming 'mytable' is the table you're trying to update:
DROP TABLE mytable_new // just in case it still exists
CREATE TABLE mytable_new LIKE mytable
//do whatever you need to do for inserting into 'mytable_new'
DROP TABLE mytable_old // just in case it still exists
ALTER TABLE mytable RENAME mytable_old
ALTER TABLE mytable_new RENAME mytable
DROP TABLE mytable_old
As you can see, we create a temporary table just like the original and
do our inserts into that one. Then, when finished, move the old one
out of the way and put the new one in. Depending on your preference,
you can keep the _old table around to have a revolving backup, or just
toss it.
This has the effect of solving two of your potential issues: 1) the
extranet is not 'useless' while the import is taking place because the
old data doesn't leave until the new data is ready, and 2) if your
import crashes mid-execution, the only thing that should be affected is
the _new table and you can just drop it and start the import again
later if that happens. And it even might solve the issue of the speed
of data transfer, depending on how necessary it is to have yesterday's
data by start of business the next day. If that isn't essential, then
since the data import is invisible to the end user, it doesn't really
matter how long it takes (as long as its still < 24 hours, of course).
Hope that gives you some ideas,
Moot
[Back to original message]
|