Posted by Dan Guzman on 08/21/06 13:03
> - The slowest thing one can do with a disk is to move the heads (seek
> time), so avoiding unnecessary seek time will improve overall throughput.
This is true and exactly why it is so important to isolate logs on separate
disks. Log writes are mostly sequential, so you want to keep other random
I/O from interfering and you want to avoid the RAID-5 write overhead.
Importantly, SQL Server waits for the physical write to the log to complete
during COMMIT, so log writes are on the critical path. Write caching on the
log can help greatly, but the controller must guarantee the eventual write
to disk (e.g. a battery-backed cache).
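You can get a rough idea of how much time is spent waiting on log writes
from the wait statistics. A minimal check, assuming SQL Server 2005 or
later:

    SELECT wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE wait_type = 'WRITELOG';

If WRITELOG waits dominate, faster or isolated log storage is likely to pay
off.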
Similarly, if the workload is predictable enough that you can separate
sequential from random I/O activity (both reads and writes), this can also
improve disk efficiency.
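For example, a layout along those lines might look like the sketch below.
The database name, file paths, drive letters, and the ImportTables filegroup
are placeholders for illustration only:

    CREATE DATABASE ImportDemo
    ON PRIMARY
        (NAME = ImportDemo_data,   FILENAME = 'D:\SQLData\ImportDemo_data.mdf'),
    FILEGROUP ImportTables
        (NAME = ImportDemo_import, FILENAME = 'E:\SQLImport\ImportDemo_import.ndf')
    LOG ON
        (NAME = ImportDemo_log,    FILENAME = 'L:\SQLLog\ImportDemo_log.ldf');

Here D:, E:, and L: are assumed to be separate arrays: general-purpose data,
the bulk-loaded tables, and the transaction log respectively.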
> - The import job writes lots of data to data and index pages on extents
> located on disk.
If data are imported in key order, isolating the index that is maintained
sequentially (typically the clustered index) can improve disk efficiency.
However, the other indexes will likely be maintained randomly, so it's
probably best to place those on the same array(s) as other general-purpose
I/O.
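Continuing the sketch above, index placement could look something like this
(table and index names are made up; ImportTables is the hypothetical
filegroup from the earlier example):

    -- Clustered index in insert order goes to the dedicated import filegroup.
    CREATE CLUSTERED INDEX IX_ImportRows_LoadDate
        ON dbo.ImportRows (LoadDate)
        ON ImportTables;

    -- Nonclustered indexes are maintained randomly, so leave them with the
    -- general-purpose I/O on the default filegroup.
    CREATE NONCLUSTERED INDEX IX_ImportRows_CustomerID
        ON dbo.ImportRows (CustomerID)
        ON [PRIMARY];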
>
> - All we do during import is inserting new rows in tables and the tables
> reside in their own files and the files reside on their own disks,
> (separate disks for table data and index data), so hopefully disk head
> movement will be minimal.
>
> - The slowest thing one can do with a disk is to move the heads (seek
> time), so avoiding unnecessary seek time will improve overall throughput.
>
Random I/O will likely result in some head movement, so the heads are going
to move regardless of whether or not there are other objects on the drive.
The only advantage of isolating the objects is that the heads might not have
to move as far, but that's probably a minor consideration.
> - The rest of the database has lots of traffic while we're importing,
> hundreds of SQL statements/second. The traffic is a mix of read and write
> to/from the disks.
>
> - The import job will finish faster on dedicated disks because of less
> disk head movement caused by other traffic. Other traffic will finish
> faster because of the same and because the import job "gets out of the
> way" sooner.
>
> What I aim for is less head movement and more parallel operations. This
> way the overall throughput will be better than if we only used a big
> RAID-10 array.
I hope you can find the time to do performance testing with the discussed
configurations. That's really the only way to tell for sure. I understand
that you are pressed to come up with a solution sooner rather than later,
so maybe you can move the logs as an interim step to buy some time to run
the tests.
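Moving the log to its own drive can be done with a short maintenance window.
A rough sketch, assuming SQL Server 2005, a database named MyDatabase, a
logical log file name of MyDatabase_log, and an L: drive dedicated to logs:

    ALTER DATABASE MyDatabase
        MODIFY FILE (NAME = MyDatabase_log,
                     FILENAME = 'L:\SQLLog\MyDatabase_log.ldf');

    ALTER DATABASE MyDatabase SET OFFLINE WITH ROLLBACK IMMEDIATE;
    -- Copy the existing .ldf file to L:\SQLLog\ at the operating system level.
    ALTER DATABASE MyDatabase SET ONLINE;

The new file location takes effect when the database comes back online.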
With all the discussion in this thread focusing on disk hardware, don't
forget the app side. If queries and indexes are poorly designed, you might
be able to get a dramatic improvement with a few strategic indexes or query
changes.
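As a starting point on the app side, the missing-index DMVs can hint at
indexes the optimizer wished it had. A minimal sketch, assuming SQL Server
2005 or later (treat the output as suggestions, not gospel):

    SELECT TOP 10
        d.statement AS table_name,
        d.equality_columns,
        d.inequality_columns,
        d.included_columns,
        s.user_seeks,
        s.avg_user_impact
    FROM sys.dm_db_missing_index_details AS d
    JOIN sys.dm_db_missing_index_groups AS g
        ON g.index_handle = d.index_handle
    JOIN sys.dm_db_missing_index_group_stats AS s
        ON s.group_handle = g.index_group_handle
    ORDER BY s.user_seeks * s.avg_user_impact DESC;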
--
Hope this helps.
Dan Guzman
SQL Server MVP
"boa sema" <boasema@gmail.com> wrote in message
news:ecbvkb$bsi$1@bat-news01.banetele.basefarm.net...
> Robert Klemme wrote:
>> On 20.08.2006 19:59, boa wrote:
>>
>>> Our situation right now is that the import job blocks all other traffic,
>>> rendering the system close to unusable and the customers unhappy. The
>>> idea is to move the import job related tables and indexes to separate
>>> drives to reduce disk seek time during that 1 crucial minute(or 33% of
>>> the time). This will hopefully improve overall throughput more than
>>> having everything on one array.
>>
>>
>> Thanks for the interesting read! I'd like to play a bit devil's advocate
>> to improve my understanding here - feel free to correct me. Now, I figure
>> that if you physically separate your bulk loaded tables from other tables
>> you're reducing IO bandwidth for this set of tables and all others
>> (compared to distributing all data across all physical drives) by
>> basically reserving parts of that bandwidth for specific data. This
>> means that an import job with this configuration will take longer because
>> it has less IO bandwidth available. If pages for the import job tables
>> displace pages of other tables from the cache other operations will be
>> slowed down more than for the distributed case and they have to wait
>> *longer* for the import job to finish because of the reduced IO
>> bandwidth. So it seems to me that Dan's approach (spread all data files
>> evenly over available disk storage) is superior in this situation because
>> then the whole IO bandwidth is available all the time for all operations.
>> What do you think?
>
> I mostly agree with Dan. Let me first say that I don't know the correct
> answer to this and what may be correct for one configuration may be wrong
> for another. Anyway, here's my theory:
>
> - The import job writes lots of data to data and index pages on extents
> located on disk.
>
> - All we do during import is inserting new rows in tables and the tables
> reside in their own files and the files reside on their own disks,
> (separate disks for table data and index data), so hopefully disk head
> movement will be minimal.
>
> - The slowest thing one can do with a disk is to move the heads (seek
> time), so avoiding unnecessary seek time will improve overall throughput.
>
> - The rest of the database has lots of traffic while we're importing,
> hundreds of SQL statements/second. The traffic is a mix of read and write
> to/from the disks.
>
> - The import job will finish faster on dedicated disks because of less
> disk head movement caused by other traffic. Other traffic will finish
> faster because of the same and because the import job "gets out of the
> way" sooner.
>
> What I aim for is less head movement and more parallel operations. This
> way the overall throughput will be better than if we only used a big
> RAID-10 array.
>
> Do note that this is just a theory. Please prove me wrong, but please do
> it before we put the solution in production ;-) I don't mind being wrong
> at all, sometimes one can try to be too much of a smartass...
>
> Boa
>
>
>
>>
>> Kind regards
>>
>> robert