|
Posted by Tony Rogerson on 05/31/06 11:42
> SELECT count(*),A.B,C,D
> FROM tbl
> GROUP BY A,B,C,D
> HAVING count(*)>1
If that query takes 2 weeks to run then you've got serious and I mean
serious problems with your hardware, that type of query should take minutes
if not seconds most desktops let alone servers.
To aid performance for that specific query create your clustered index on A,
B, C and D.
--
Tony Rogerson
SQL Server MVP
http://sqlblogcasts.com/blogs/tonyrogerson - technical commentary from a SQL
Server Consultant
http://sqlserverfaq.com - free video tutorials
"groupy" <liav.ezer@gmail.com> wrote in message
news:1149065676.453810.137020@i39g2000cwa.googlegroups.com...
> Ok, here is a asample table representing the problem more clearly
>
> A | B | C | D
> -----------------
> a1 b1 c1 d1
> a1 b2 c2 d2
> a3 b3 c1 d3
> a4 b4 c4 d3
> a5 b5 c5 d5
> a6 b6 c6 d3
>
> Tha duplications are:
> row 1+2 in param A
> row 1+3 in param C
> row 3+4+6 in param D
> only row 5 is unique in all parameters.
> conclusion: row 1+2+3+4+6 are the same user
> goal: to find all duplicated rows & to delete them all accept one
> instance to leave.
>
> Note:
> Finding that row 1similar to 2 in A & deleting it will loose data
> because we won't know that row 1 is ALSO similar to 3 on C & later on
> finding that 3 is similar to 4 & 6 on D & so on
>
> The simple time consuming (about 2 weaks) query to acomplish the task
> is:
> SELECT count(*),A.B,C,D
> FROM tbl
> GROUP BY A,B,C,D
> HAVING count(*)>1
>
> I THANK YOU ALL
>
Navigation:
[Reply to this message]
|