Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Removing Duplicates from a table

Author  Topic 

ppatel112
Starting Member

35 Posts

Posted - 2011-06-06 : 02:14:31
Hi All,

i have a issue where due to a product bug we have duplicate email addresses in the table, i need to remvove the duplicates from the table - need help building the sql query

-- finding duplicate rows in a table
select * from email
where Emai_emailaddress in (select Emai_emailaddress from email GROUP BY Emai_emailaddress HAVING (count(Emai_emailaddress)>2))
and emai_emailaddress = 'maverick@aol.com'

this will return below:
emai_emailid emai_companyid emai_personid emai_emailaddress
27 25 NULL maverick@aol.com
29 25 27 maverick@aol.com
26 25 NULL maverick@aol.com
29 25 NULL maverick@aol.com

in order to get the correct records i only need emai_emailid = 29 and only one among emai_emailid (27,26 or 29)

i can write the statement below to narrow down the query:
select * from email
where Emai_emailaddress in (select Emai_emailaddress from email GROUP BY Emai_emailaddress HAVING (count(Emai_emailaddress)>2))
and emai_emailaddress = 'maverick@aol.com'
and emai_personid IS NULL

that will return me below, now i need to get rid of one of the following three records and just get two records only so that i can run an update statement.

emai_emailid emai_companyid emai_personid emai_emailaddress
27 25 NULL maverick@aol.com
26 25 NULL maverick@aol.com
29 25 NULL maverick@aol.com

please advice how to achive this.

regards

jfarrugia
Yak Posting Veteran

55 Posts

Posted - 2011-06-06 : 06:31:36
I had the same problem and followed the below link... worked like a charm.

http://support.microsoft.com/kb/139444

Where software development knowledge meets the reader
Go to Top of Page

yosiasz
Master Smack Fu Yak Hacker

1635 Posts

Posted - 2011-06-06 : 12:52:12
you do know your email table is not well designed right? Until you fix that major design issue you will be cleaning up that table for a long time to come. Ok now that we have that out of the way you can also try


SELECT * ,
ROW_NUMBER() OVER(PARTITION BY emai_emailaddress ORDER BY emai_emailaddress DESC) as rn
FROM dbo.bootlegemailtable


Then see the value of rn. What do you see in there that can help you sort things out?

If you don't have the passion to help people, you have no passion
Go to Top of Page

ppatel112
Starting Member

35 Posts

Posted - 2011-06-13 : 20:29:35
Thanks for this, it helped me to retrieve the records using row number but now the issue is with update statement where i want to soft delete the records i.e. marke emai_deleted =1
tried couple of things but didnt help:
1.

Update email set emai_deleted = NULL Where EXISTS(
SELECT * FROM
(SELECT * ,
ROW_NUMBER() OVER(PARTITION BY emai_emailaddress ORDER BY emai_emailaddress DESC) as rowno
FROM dbo.email
where Emai_emailaddress in (select Emai_emailaddress from email GROUP BY Emai_emailaddress HAVING (count(Emai_emailaddress)>2))
and emai_personid IS NULL) as duplicates
where rowno >1)

2.
with email
as
(
SELECT * FROM
(SELECT * ,
ROW_NUMBER() OVER(PARTITION BY emai_emailaddress ORDER BY emai_emailaddress DESC) as rowno
FROM dbo.email
where Emai_emailaddress in (select Emai_emailaddress from email GROUP BY Emai_emailaddress HAVING (count(Emai_emailaddress)>2))
and emai_personid IS NULL) as duplicates
where rowno >1
)
update email
set emai_deleted = 1
go


please help.

regards
Go to Top of Page

yosiasz
Master Smack Fu Yak Hacker

1635 Posts

Posted - 2011-06-13 : 22:54:37
which one do you want to delete of the duplicates what is your criteria? one with highest email id or lowest email id?

If you don't have the passion to help people, you have no passion
Go to Top of Page

ppatel112
Starting Member

35 Posts

Posted - 2011-06-14 : 00:28:09
Hi there,
i dont want to delete any records, we have a flexibility where we can soft delete the records by updating emai_deleted =1 from emai_deleted = NULL
so i want to run an update statement based on the select query below to mark the records with row no greater than 1 to emai_deleted = 1 i.e. update email set emai_deleted = 1 ( ??????)
SELECT * FROM
(SELECT * ,
ROW_NUMBER() OVER(PARTITION BY emai_emailaddress ORDER BY emai_emailaddress DESC) as rowno
FROM dbo.email
where Emai_emailaddress in (select Emai_emailaddress from email GROUP BY Emai_emailaddress HAVING (count(Emai_emailaddress)>2))
and emai_personid IS NULL) as duplicates
where rowno >1

hope that clarifies what i want to achieve.

regards
Go to Top of Page

ppatel112
Starting Member

35 Posts

Posted - 2011-06-15 : 22:25:08
Hi Guys can anyone please help
Go to Top of Page

webfred
Master Smack Fu Yak Hacker

8781 Posts

Posted - 2011-06-16 : 01:57:13
Based on your given query:
update duplicates
set emai_deleted = 1
from
(SELECT * ,
ROW_NUMBER() OVER(PARTITION BY emai_emailaddress ORDER BY emai_emailaddress DESC) as rowno
FROM dbo.email
where Emai_emailaddress in
(select Emai_emailaddress
from email
GROUP BY Emai_emailaddress HAVING (count(Emai_emailaddress)>2))
and emai_personid IS NULL
) as duplicates
where rowno >1



No, you're never too old to Yak'n'Roll if you're too young to die.
Go to Top of Page

webfred
Master Smack Fu Yak Hacker

8781 Posts

Posted - 2011-06-16 : 02:16:10
But be aware that
... HAVING (count(Emai_emailaddress)>2) ...
indicates that you only want to do the update if there are MORE than 2 duplicates!


No, you're never too old to Yak'n'Roll if you're too young to die.
Go to Top of Page

webfred
Master Smack Fu Yak Hacker

8781 Posts

Posted - 2011-06-16 : 02:32:20
I believe you only need this:
update duplicates
set emai_deleted = 1
from
(SELECT * ,
ROW_NUMBER() OVER(PARTITION BY emai_emailaddress ORDER BY emai_personid DESC) as rowno
FROM dbo.email
) as duplicates
where rowno >1

and if you really want to keep TWO rows then use this WHERE:
where rowno > 2



No, you're never too old to Yak'n'Roll if you're too young to die.
Go to Top of Page
   

- Advertisement -