| Author |
Topic |
|
ppatel112
Starting Member
35 Posts |
Posted - 2011-06-06 : 02:14:31
|
| Hi All,i have a issue where due to a product bug we have duplicate email addresses in the table, i need to remvove the duplicates from the table - need help building the sql query -- finding duplicate rows in a tableselect * from emailwhere Emai_emailaddress in (select Emai_emailaddress from email GROUP BY Emai_emailaddress HAVING (count(Emai_emailaddress)>2))and emai_emailaddress = 'maverick@aol.com'this will return below:emai_emailid emai_companyid emai_personid emai_emailaddress27 25 NULL maverick@aol.com29 25 27 maverick@aol.com26 25 NULL maverick@aol.com29 25 NULL maverick@aol.com in order to get the correct records i only need emai_emailid = 29 and only one among emai_emailid (27,26 or 29)i can write the statement below to narrow down the query:select * from emailwhere Emai_emailaddress in (select Emai_emailaddress from email GROUP BY Emai_emailaddress HAVING (count(Emai_emailaddress)>2))and emai_emailaddress = 'maverick@aol.com'and emai_personid IS NULLthat will return me below, now i need to get rid of one of the following three records and just get two records only so that i can run an update statement.emai_emailid emai_companyid emai_personid emai_emailaddress27 25 NULL maverick@aol.com26 25 NULL maverick@aol.com29 25 NULL maverick@aol.com please advice how to achive this.regards |
|
|
jfarrugia
Yak Posting Veteran
55 Posts |
|
|
yosiasz
Master Smack Fu Yak Hacker
1635 Posts |
Posted - 2011-06-06 : 12:52:12
|
you do know your email table is not well designed right? Until you fix that major design issue you will be cleaning up that table for a long time to come. Ok now that we have that out of the way you can also try SELECT * , ROW_NUMBER() OVER(PARTITION BY emai_emailaddress ORDER BY emai_emailaddress DESC) as rn FROM dbo.bootlegemailtable Then see the value of rn. What do you see in there that can help you sort things out?If you don't have the passion to help people, you have no passion |
 |
|
|
ppatel112
Starting Member
35 Posts |
Posted - 2011-06-13 : 20:29:35
|
| Thanks for this, it helped me to retrieve the records using row number but now the issue is with update statement where i want to soft delete the records i.e. marke emai_deleted =1 tried couple of things but didnt help:1.Update email set emai_deleted = NULL Where EXISTS(SELECT * FROM (SELECT * , ROW_NUMBER() OVER(PARTITION BY emai_emailaddress ORDER BY emai_emailaddress DESC) as rowno FROM dbo.emailwhere Emai_emailaddress in (select Emai_emailaddress from email GROUP BY Emai_emailaddress HAVING (count(Emai_emailaddress)>2))and emai_personid IS NULL) as duplicateswhere rowno >1)2. with emailas(SELECT * FROM (SELECT * , ROW_NUMBER() OVER(PARTITION BY emai_emailaddress ORDER BY emai_emailaddress DESC) as rowno FROM dbo.emailwhere Emai_emailaddress in (select Emai_emailaddress from email GROUP BY Emai_emailaddress HAVING (count(Emai_emailaddress)>2))and emai_personid IS NULL) as duplicateswhere rowno >1)update emailset emai_deleted = 1goplease help.regards |
 |
|
|
yosiasz
Master Smack Fu Yak Hacker
1635 Posts |
Posted - 2011-06-13 : 22:54:37
|
| which one do you want to delete of the duplicates what is your criteria? one with highest email id or lowest email id?If you don't have the passion to help people, you have no passion |
 |
|
|
ppatel112
Starting Member
35 Posts |
Posted - 2011-06-14 : 00:28:09
|
| Hi there,i dont want to delete any records, we have a flexibility where we can soft delete the records by updating emai_deleted =1 from emai_deleted = NULLso i want to run an update statement based on the select query below to mark the records with row no greater than 1 to emai_deleted = 1 i.e. update email set emai_deleted = 1 ( ??????)SELECT * FROM (SELECT * ,ROW_NUMBER() OVER(PARTITION BY emai_emailaddress ORDER BY emai_emailaddress DESC) as rownoFROM dbo.emailwhere Emai_emailaddress in (select Emai_emailaddress from email GROUP BY Emai_emailaddress HAVING (count(Emai_emailaddress)>2))and emai_personid IS NULL) as duplicateswhere rowno >1hope that clarifies what i want to achieve.regards |
 |
|
|
ppatel112
Starting Member
35 Posts |
Posted - 2011-06-15 : 22:25:08
|
| Hi Guys can anyone please help |
 |
|
|
webfred
Master Smack Fu Yak Hacker
8781 Posts |
Posted - 2011-06-16 : 01:57:13
|
Based on your given query:update duplicatesset emai_deleted = 1from(SELECT * , ROW_NUMBER() OVER(PARTITION BY emai_emailaddress ORDER BY emai_emailaddress DESC) as rowno FROM dbo.email where Emai_emailaddress in (select Emai_emailaddress from email GROUP BY Emai_emailaddress HAVING (count(Emai_emailaddress)>2)) and emai_personid IS NULL) as duplicateswhere rowno >1 No, you're never too old to Yak'n'Roll if you're too young to die. |
 |
|
|
webfred
Master Smack Fu Yak Hacker
8781 Posts |
Posted - 2011-06-16 : 02:16:10
|
But be aware that... HAVING (count(Emai_emailaddress)>2) ...indicates that you only want to do the update if there are MORE than 2 duplicates! No, you're never too old to Yak'n'Roll if you're too young to die. |
 |
|
|
webfred
Master Smack Fu Yak Hacker
8781 Posts |
Posted - 2011-06-16 : 02:32:20
|
I believe you only need this:update duplicatesset emai_deleted = 1from(SELECT * , ROW_NUMBER() OVER(PARTITION BY emai_emailaddress ORDER BY emai_personid DESC) as rowno FROM dbo.email) as duplicateswhere rowno >1and if you really want to keep TWO rows then use this WHERE:where rowno > 2 No, you're never too old to Yak'n'Roll if you're too young to die. |
 |
|
|
|