Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 trying to figure out the best way to delete dups

Author  Topic 

Harry C
Posting Yak Master

148 Posts

Posted - 2008-06-09 : 12:57:59
I have a table like the following

cid, level, datechanged

The data looks like the following
1 , 1, 2007-10-25 13:36:03.263
1 , 2, 2007-10-25 13:36:03.263
1 , 3, 2007-10-25 13:36:03.263
1 , 4, 2007-10-25 13:36:03.263
1 , 5, 2007-10-25 13:36:03.263
1 , 1, 2006-11-22 13:36:03.263
1 , 2, 2006-11-22 13:36:03.263
1 , 3, 2006-11-22 13:36:03.263

14 , 1, 2008-10-15 13:36:03.263
14 , 2, 2008-10-15 13:36:03.263
14 , 3, 2008-10-15 13:36:03.263
14 , 4, 2008-10-15 13:36:03.263
14 , 1, 2007-11-12 13:36:03.263
14 , 2, 2007-11-12 13:36:03.263
14 , 3, 2007-11-12 13:36:03.263

So, to make a long story short the table SHOULD have a compound primary key in it that only allows one cid and one level. I should never be allowed to have more than one. But first I need to get the dups out. What I need to do is select the cid and level for the greatest datechanged, and delete everything prior for each cid. My table has over 25,000 rows in it, so I need to be careful about what I am doing. I can't just replace them manually. So in the above example, I would need to delete

1 , 1, 2006-11-22 13:36:03.263
1 , 2, 2006-11-22 13:36:03.263
1 , 3, 2006-11-22 13:36:03.263

AND

14 , 1, 2007-11-12 13:36:03.263
14 , 2, 2007-11-12 13:36:03.263
14 , 3, 2007-11-12 13:36:03.263

Any thoughts on how I could accomplish this? Thanks

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-06-09 : 13:39:25
[code]DELETE f
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY Col1, Col2 ORDER BY Col3) AS RecID
FROM Table1
) AS f
WHERE RecID > 1[/code]


E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

Harry C
Posting Yak Master

148 Posts

Posted - 2008-06-09 : 13:53:25
BEGIN TRANSACTION
DELETE f
FROM (
SELECT ROW_NUMBER () OVER (PARTITION BY cid, level ORDER BY datechanged) AS RecID
FROM PLevel
) AS f
WHERE RecID > 1
ROLLBACK TRANSACTION

I believe the DB is SQL Server 2000, as row_number comes back as Not Built In Function. Any thoughts?

Thanks
HC
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-06-09 : 14:01:51
[code]DELETE t
FROM YourTable t
INNER JOIN (SELECT cid,level
FROM YourTable
GROUP BY cid,level
HAVING COUNT(*)>1)t1
ON t1.cid=t.cid
AND t1.level=t.level
INNER JOIN (SELECT cid,level,MIN(datechanged) AS MinDate
FROM YourTable
GROUP BY cid,level)t3
ON t3.cid=t.cid
AND t3.level=t.level
AND t3.MinDate=t.datechanged[/code]
Go to Top of Page

Harry C
Posting Yak Master

148 Posts

Posted - 2008-06-09 : 14:16:37
Thank you, let me give that a shot. Much appreciated.
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2008-06-09 : 14:22:28
Harry C, you shouldn't post your question in a 2005 forum if you are using 2000. I will move your thread to a 2000 forum.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-06-09 : 14:41:26
If you are indeed using SQL Server 2005, make sure compatibility level is set to 90.
What does SELECT @@VERSION return?



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

Harry C
Posting Yak Master

148 Posts

Posted - 2008-07-15 : 16:09:20
I need to revisit this again...

Running the below, I get the following


(14648 row(s) affected)

(21130 row(s) affected)

(4940 row(s) affected)

The problem is that last row (4940 rows affected) should be 14648 row(s) affected. I think the problem is that t3.MAXDate is the same for alot of the dup records. So, what happens is that it deletes them all. This table right now has NO PK. If I add a PK, I believe that i should be able to just keep the HIGHEST PKs for each group pf CommunityID, and CreditLevel grouping. Any have any thoughts on the best way to do that? Thanks

BEGIN TRANSACTION



SELECT CommunityID,creditlevel FROM ParameterCreditLevel GROUP BY CommunityID,creditlevel ORDER BY CommunityID,creditlevel



DELETE t

FROM ParameterCreditLevel t

INNER JOIN (SELECT CommunityID,creditlevel

FROM ParameterCreditLevel

GROUP BY CommunityID,creditlevel

HAVING COUNT(*)>1)t1

ON t1.CommunityID=t.CommunityID

AND t1.creditlevel=t.creditlevel

INNER JOIN (SELECT CommunityID,creditlevel,MAX(LatestChangeDateTime) AS MAXDate

FROM ParameterCreditLevel

GROUP BY CommunityID,creditlevel)t3

ON t3.CommunityID=t.CommunityID

AND t3.creditlevel=t.creditlevel

AND t.LatestChangeDateTime<=t3.MAXDate



SELECT CommunityID,creditlevel FROM ParameterCreditLevel GROUP BY CommunityID,creditlevel ORDER BY CommunityID,creditlevel


ROLLBACK TRANSACTION

Go to Top of Page
   

- Advertisement -