Posted by Mike Husler on 06/19/06 19:09
I have been tasked with copying data from our SQL Server to a local MySQL (Windows). About 75 tables are involved. The data arrive in SQL Server at irregular cadences: some every half second, some every minute, etc. Each row is a time value plus several other values associated with that time, such as flux or counts.
I created a DTS package that uses the "Transform Data Task" to push data to "Other Connection", where "Other Connection" points to the MySQL driver set up on that DB server. I tried to execute several of these in the same DTS package, but the process took too long, so I broke each table transfer into a separate DTS package and scheduled each as a job that runs once per minute. I staggered the jobs so they would not compete with each other as much; that is, some run at 5 seconds past the minute, some at 10 seconds past the minute, etc.
The low-cadence data job executed in sub-second time, but the high-cadence (half-second) job took the better part of a minute to run. The Transform Data Task executes this SQL query:

select * from table_name where time_col > DATEADD(ss,-60,GETUTCDATE())

Basically: give me everything that has arrived on SQL Server in the last minute, and place it into MySQL.
As the SQL Server tables grow, these jobs take longer to execute. That makes sense, except that time_col is a primary key, and the above query runs very fast (sub-second) when executed in Query Analyzer. Besides, even when the tables contained very few rows, the package execution was extremely slow.
My main question: where is the overhead in the DTS package that makes it take so long to complete, when the select portion is in theory very fast? Is it the insert into MySQL? The preparation? A secondary question: would this be better run as a stand-alone application outside the database that grabs from SQL Server and inserts into MySQL? Roughly, I picture something like the sketch below.
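Just to make that secondary question concrete, here is a minimal sketch of what I have in mind, assuming a Python process using the pyodbc and mysql-connector drivers. The table name, column names, and connection details are placeholders for illustration, not our real setup:

    # Minimal sketch of a stand-alone copier: poll SQL Server once a minute,
    # then push the new rows into MySQL in one batched insert.
    # All names and connection strings below are placeholders.
    import time
    import pyodbc                # SQL Server via ODBC (assumed available)
    import mysql.connector       # MySQL driver (assumed available)

    SQL_SERVER_DSN = ("DRIVER={SQL Server};SERVER=myserver;"
                      "DATABASE=mydb;Trusted_Connection=yes")
    POLL_QUERY = ("SELECT time_col, flux, counts FROM table_name "
                  "WHERE time_col > DATEADD(ss, -60, GETUTCDATE())")

    def copy_once():
        src = pyodbc.connect(SQL_SERVER_DSN)
        dst = mysql.connector.connect(host="localhost", user="copier",
                                      password="secret", database="mydb")
        try:
            rows = src.cursor().execute(POLL_QUERY).fetchall()
            if rows:
                cur = dst.cursor()
                # one prepared, batched insert rather than row-at-a-time pushes
                cur.executemany(
                    "INSERT INTO table_name (time_col, flux, counts) "
                    "VALUES (%s, %s, %s)",
                    [tuple(r) for r in rows])
                dst.commit()
        finally:
            src.close()
            dst.close()

    if __name__ == "__main__":
        while True:
            copy_once()
            time.sleep(60)   # same one-minute cadence as the DTS jobs

I have no idea whether that would actually beat DTS here, which is really what I am asking.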
Any help appreciated.
Mike Husler