Deleting Rows that have duplicate entries.

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2005 Forums

Transact-SQL (2005)

Deleting Rows that have duplicate entries.

Author

Topic

algorithm
Starting Member

6 Posts

Posted - 2008-07-07 : 11:52:56

This is my first post - thanks for reading and helping, and I will try my best to explain this one.

I have created a query that creates a table from various table. I would like to create a query that deletes whole rows in tables where there are identical entries in two of the three columns.

e.g.
Name Number Code
X 2 5
Y 2 5
Z 4 7
-> In this case I would like to delete one of the first two rows (doesn't matter which) as the entries in Number AND Code are identical.

I have not included any code examples as I can't get any to work yet.

jdaman
Constraint Violating Yak Guru

354 Posts

Posted - 2008-07-07 : 12:06:01

[code]delete from YourTable
where name in (
select name
from ( select row_number() over ( partition by number, code order by name )
as rn, *
from YourTable
) a
where rn > 1 )[/code]

yosiasz
Master Smack Fu Yak Hacker

1635 Posts

Posted - 2008-07-07 : 12:11:59

why don't you prevent the creation of identical entries in two of the three columns in the first place. GNIGNO (garbage not in garbage not out)

algorithm
Starting Member

6 Posts

Posted - 2008-07-07 : 12:22:01

Hi - thanks for these replies.
To jdaman - I tried your code, it is deleting all the rows - I would like to retain one of the 'duplicate' rows.

Yosiasz - your suggestion makes sense, assuming this condition will allow my original queries to run. I will look into it and reply here if I solve this.

Thanks again.

jdaman
Constraint Violating Yak Guru

354 Posts

Posted - 2008-07-07 : 12:40:14

Could you provide a sample of your existing data? Is the name column unique?

Here is my complete code which does leave the first instance of each number/code group:

declare @table table
    (
      name char(1),
      number int,
      code int
    )
insert  @table ( name, number, code )
select 'A', 1, 3 union all
select 'B', 1, 2 union all
select 'C', 2, 3 union all
select 'D', 0, 3 union all
select 'E', 1, 2 union all -- dup
select 'F', 0, 2 union all
select 'G', 0, 3 union all -- dup
select 'H', 1, 3 union all -- dup
select 'I', 2, 3 -- dup

delete  from @table
where   name in (
        select  name
        from    ( select    row_number() over ( partition by number, code order by name )
                            as rn, *
                  from      @table
                ) a
        where   rn > 1 )

select * from @table

jdaman
Constraint Violating Yak Guru

354 Posts

Posted - 2008-07-07 : 13:15:04

If you are in the situation where you do not have a primary key on the table and have entirely duplicate rows the following would eliminate those duplicates from your table:

select  *
into    #temp
from    YourTable
union
select  *
from    YourTable

delete  from YourTable

insert  YourTable
        select  *
        from    #temp

drop table #temp

select  *
from    YourTable

algorithm
Starting Member

6 Posts

Posted - 2008-07-07 : 18:42:30

Hi - to jdaman - no the name column is not unique. The column which I have called 'Name' here (in practice it will be called 'Entry_Name') will have people's names, then the other two columns will have phone numbers. I will try to implement your code in the morning and let you know how I get on. Many thanks for your replies.

Regarding your most recent reply, the entire rows are not duplicate, as I do not care what is in the first row, just if the second AND third columns are the same as the second AND third columns of another row, then I only need one of these rows. I am not using a primary key in the table.

jdaman
Constraint Violating Yak Guru

354 Posts

Posted - 2008-07-07 : 19:14:10

Knowing now that you dont have a primary key you will probably need to do something like the following:

declare @table table
    (
      entry_name char(30),
      phone_number int,
      code int
    )
insert  @table ( entry_name, phone_number, code )
select 'Mayes Tavern', 8746939, 01 union all
select 'Lloyd''s Bar/Lounge', 8746939, 01 union all
select 'The Baurnafea House', 4580842, 01 union all
select 'Joxer Dalys', 8601299, 01 union all
select 'Fagans Pub', 8369491, 01 union all
select 'Flanagans Bar (Peggy Kellys)', 4972445, 01 union all
select 'MayesTavern', 8746939, 01 union all
select 'Lloyds Bar', 8746939, 01 union all
select 'Lloyds Bar/Lounge', 8746939, 01 union all
select 'Baurnafea House', 4580842, 01 union all
select 'Joxer Daly', 8601299, 01 union all
select 'FaganPub', 8369491, 01 union all
select 'Fagan  Pub', 8369491, 01 union all
select 'The Baurnafea House', 4580842, 01 union all
select 'Flanagans Bar', 4972445, 01 union all
select 'Joxer Daly', 8369491, 01

if object_id('tempdb..#temp') is not null
drop table #temp

select  row_number() over ( partition by phone_number, code order by entry_name )
        as rn, *
into #temp
from    @table

delete from @table

insert @table
select entry_name, phone_number, code from #temp
where rn = 1

select * from @table

drop table #temp

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources