Removing duplicates based on highest record id

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

General SQL Server Forums

New to SQL Server Programming

Removing duplicates based on highest record id

Author

Topic

wembleybear
Yak Posting Veteran

93 Posts

Posted - 2013-12-20 : 05:05:58

I have a query which finds duplicate spec_items linked to a work order. What I want to do it remove the duplicates (and in some cases there will be more than one) leaving only the record with the highest [sr.id]

select sr.id, sr.linked_to_worknumber, sr.spec_checklist_id from spec_checklist_remind sr inner join spec_checklist_remind sc on sc.linked_to_worknumber = sr.linked_to_worknumber
group by sr.id,sr.linked_to_worknumber, sr.spec_checklist_id
Having sr.spec_checklist_id = 30 and count(*)>1
order by sr.linked_to_worknumber

Sample of query results:

id	linked_to_worknumber	spec_checklist_id
18323	IP-DN-033774	30
24229	IP-DN-033774	30
18294	IP-DN-033808	30
28027	IP-DN-033808	30
18295	IP-DN-033810	30
19924	IP-DN-033810	30

In the 3 cases, I would want to retain only records with ids 24229,28027 and 19924.

Many thanks
Martyn

webfred
Master Smack Fu Yak Hacker

8781 Posts

Posted - 2013-12-20 : 05:15:36

I'm a bit confused about the example data above.
Can you please give table structure, sample data and wanted result instead of a query?

Too old to Rock'n'Roll too young to die.

wembleybear
Yak Posting Veteran

93 Posts

Posted - 2013-12-20 : 06:10:24

Webfred

Hope this will do:

Table structure: [url]http://imm.io/1mIpB[/url]

Sample data from table: [url]http://imm.io/1mIr8[/url]

webfred
Master Smack Fu Yak Hacker

8781 Posts

Posted - 2013-12-20 : 06:48:56

So in the example data you want the row with id 19958 to be deleted because:
- it is a duplicate in linked_to_work_number
- the id is < 20616
- the spec_checklist_id is 30

Too old to Rock'n'Roll too young to die.

Kristen
Test

22859 Posts

Posted - 2013-12-20 : 06:54:06

This perhaps?


DECLARE @MyData TABLE
(
	id	int NOT NULL,
	linked_to_worknumber	varchar(20) NOT NULL,
	spec_checklist_id	int
)
INSERT INTO @MyData
SELECT	18323, 'IP-DN-033774', 30 UNION ALL
SELECT	24229, 'IP-DN-033774', 30 UNION ALL
SELECT	18294, 'IP-DN-033808', 30 UNION ALL
SELECT	28027, 'IP-DN-033808', 30 UNION ALL
SELECT	18295, 'IP-DN-033810', 30 UNION ALL
SELECT	19924, 'IP-DN-033810', 30

SELECT	*
FROM	@MyData

-- Delete duplicates (with lower ID)
DELETE	T1
FROM	@MyData AS T1
	JOIN @MyData AS T2
		 ON T2.id > T1.id
		AND T2.linked_to_worknumber = T1.linked_to_worknumber
-- ... add other columns here that need to match for the record to be considered a "duplicate" ...

SELECT	*
FROM	@MyData

webfred
Master Smack Fu Yak Hacker

8781 Posts

Posted - 2013-12-20 : 06:58:25

I would have given a delete query using row_number() but FIRST I would like to know if my guessing about conditions is right...

Too old to Rock'n'Roll too young to die.

wembleybear
Yak Posting Veteran

93 Posts

Posted - 2013-12-20 : 07:10:00

webfred

Your guessing is correct about all three conditions (sorry the forum subscription emails are taking a while to get to me).

webfred
Master Smack Fu Yak Hacker

8781 Posts

Posted - 2013-12-20 : 08:50:52

delete dt
from
(
select row_number() over(partition by linked_to_worknumber order by id desc) as rn, *
from spec_checklist_remind
where spec_checklist_id = 30
) as dt
where rn > 1

Too old to Rock'n'Roll too young to die.

wembleybear
Yak Posting Veteran

93 Posts

Posted - 2013-12-20 : 09:30:06

webfred

That's perfect, I just ran in the test database and worked exactly as expected.

Many thanks for your help today.

Martyn

webfred
Master Smack Fu Yak Hacker

8781 Posts

Posted - 2013-12-20 : 09:33:14

my pleasure