How do i delete duplicate data? - SQL Server Forums

Home | Weblogs | Forums

Site Sponsored By: SQLDSC - SQL Server Desired State Configuration

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

General SQL Server Forums

New to SQL Server Programming

How do i delete duplicate data?

Author

Topic

Cwm
Starting Member

39 Posts

Posted - 2008-12-15 : 16:33:12

hi
I was wondering how i go to delete duplicate data when i go to insert a bunch of data from a csv file?
The reason is becuase i run a timer to read an appended text file, seperated by comma, i will get duplicate data that i want to make sure tha it doesn't get inserted again into the table.

Thanks

Chris

sodeep
Master Smack Fu Yak Hacker

7174 Posts

Posted - 2008-12-15 : 16:38:38

Load it into staging table,perform delete duplicate operation and load into SQL Destination table.

See if in SQL 2005:
http://sqlis.com/post/Get-all-from-Table-A-that-isnt-in-Table-B.aspx

Cwm
Starting Member

39 Posts

Posted - 2008-12-15 : 17:29:24

Hi,
That article looks confusing. Is there an easier way?

Thanks

sodeep
Master Smack Fu Yak Hacker

7174 Posts

Posted - 2008-12-15 : 17:35:24

How your data look like after loading into staging table? Can you give sample data and output?

revdnrdy
Posting Yak Master

220 Posts

Posted - 2008-12-15 : 18:28:24

Here is how I do it (our database is not normalized btw). This presumes you have a primary ID key (or unique ID key) establised.

This query deletes duplicate data based on 2 matching fields (manufacturer and partnumber).

This is also known as a self-join operation (I think)

DELETE
FROM ProductsTable
WHERE ID
IN(select a.ID
from ProductsTABLE a,ProductsTable b
where a.manufacturer=b.manufacturer
And a.partnumber=b.partnumber
And a.ID<b.ID)

Cwm
Starting Member

39 Posts

Posted - 2008-12-15 : 19:27:56

Thanks revdnrdy,
I have found something similar to yours, and it is

select distinct TicketNumber into Newtable from [CompletedOrders]
delete T1
from [CompletedOrders] T1, [CompletedOrders] T3
where T1.TicketNumber = T3.TicketNumber
and T1.COID > T3.COID

Actually it looks pretty identical to what yours is, all i am doing is deleting the duplicates based on two matching fields, the TicketNumber and the autoincrement id.

Thanks

Chris

sodeep
Master Smack Fu Yak Hacker

7174 Posts

Posted - 2008-12-15 : 19:44:49

Test it before you apply with select:

Delete t
from
(select row_number() over (partition by TicketNumber order by id)as seq,* from table)t
where t.seq>1

Jai Krishna
Constraint Violating Yak Guru

333 Posts

Posted - 2008-12-15 : 23:52:39

Try this

DELETE FROM your_table
WHERE autoincrementid NOT IN (SELECT MAX(autoincrementid) FROM your_table GROUP BY TicketNumber)

Jai Krishna

NeilG
Aged Yak Warrior

530 Posts

Posted - 2008-12-16 : 10:04:34

Cwm,
Either are the same but try to avoid the distinct clause where you can but not vital, many people don't like this being used on large table for perfomance reasons

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources