Multicolumn correlated subquery? — MsSQL Server

You are here: Multicolumn correlated subquery? « MsSQL Server « IT news, forums, messages

Posted by sql_server_user on 05/09/06 17:56

Hi,

I have a history table with about 400 million rows, with a unique
composite nonclustered index on two columns (object id and time period)
which is used for most of the queries into the table, and an identity
column for the clustered primary key.

Many of my queries use correlated subqueries to pull unique history
rows from the history table for each of a set of objects from the
object table, for instance, pulling the earliest history row for each
object in a set. These correlated subqueries reference the object table
and return the primary key of the history table, e.g.:

select *
from lp_object l
inner join lp_object_history h
on h.lp_object_id = l.lp_id

where l.lp_set_id = 'SOME_LITERAL'

and h.lp_id = (
select top 1 lp_id
from lp_object_history
where lp_object_id = l.lp_id
and lp_some_column > 0
order by lp_time_period)

Now, if lp_some_column is not indexed, this query has no choice but to
read the entirety of every single history row for every object in the
set where lp_set_id = 'SOME_LITERAL', so that it can determine if
lp_some_column > 0, and because the history table is clustered by the
identity column rather than the ID of the relevant object whose history
we're tracking, the reads take forever - they have to bop all around
the disk. The sets I deal with tend to have about 5K objects in them
and about 200K associated history rows.

I'm considering reclustering by the (object id, time period) index, but
then my queries will need an extra bookmark lookup step to get the row
data from the identity value returned by the correlated subquery. I
think it will still be faster, though, so I will probably build a copy
of the table with the alternative clustering scheme to run some
performance tests.

What I'm wondering is, if I were to dispense with the identity column
altogether and replace it with a composite primary key of (object id,
time period), would I be still be able to use my correlated subqueries?
Because then there wouldn't be a single column that uniquely identifies
each row in the history table and I don't think SQL Server supports
multicolumn correlated subqueries.

Thanks for reading,
Seth

Navigation:

Next in forum: Re: RAID 5 beats RAID 10
Prev in forum: Re: Running SQL and Oracle on the same server
Thread view: Multicolumn correlated subquery?

[Reply to this message]

Удаленная работа для программистов • Как заработать на Google AdSense • England, UK • статьи на английском • PHP MySQL CMS Apache Oscommerce • Online Business Knowledge Base • DVD MP3 AVI MP4 players codecs conversion help

Home • Search • Site Map • Set as Homepage • Add to Favourites

Сайт изготовлен в Студии Валентина Петручека —
изготовление и поддержка веб-сайтов, разработка программного обеспечения, поисковая оптимизация