Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 How can I delete nearly-duplicate rows?

Author  Topic 

ssg31415926
Starting Member

4 Posts

Posted - 2010-01-19 : 14:06:13
I have a table with nearly-duplicate rows. I'm trying to get rid of the nearly duplicates. By nearly-duplicates, I mean that all of the columns except one have the same values.

Here's my table:

CREATE TABLE [dbo].[ActiveDirectory_Export](
[givenName] [nvarchar](255) NULL,
[surname] [nvarchar](255) NULL,
[samAccountName] [nvarchar](255) NULL,
[notesShortName] [nvarchar](255) NULL,
[employeeID] [nvarchar](255) NULL
) ON [PRIMARY]

A person can have more than one samAccountName for which all other values are the same. E.g.
givenName  surname  samAccountName  notesShortName  employeeID
Bob Smith ARZ1011 BSmith 1001
Bob Smith ARZ1011ep BSmith 1001
Lisa Smith FQYS4277 LSmith 2304
Lisa Smith FQYS4277ep LSmith 2304
Lisa Smith FQYS4277sup LSmith 2304


Sorted alphabetically on notesShortName then samAccountName, I want to retain the first row for each user and delete the rest.

I've been at this for over an hour and I'm stumped. How can I do this?

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2010-01-19 : 15:36:45
[code]WITH AD (id, givenName, surname, samAccountName, notesShortName, employeeID)
AS (
select row_number() over(partition by employeeID order by notesShortname) id, givenName, surname, samAccountName, notesShortName, employeeID
from ActiveDirectory_Export
)
DELETE x
FROM ActiveDirectory_Export x
JOIN AD
On AD.employeeID = x.employeeid
and AD.notesShortName = x.notesShortName
and AD.samAccountName = x.samAccountName
WHERE AD.id > 1[/code]
Go to Top of Page

ssg31415926
Starting Member

4 Posts

Posted - 2010-01-20 : 07:21:45
Thanks for your help.
Go to Top of Page
   

- Advertisement -