Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
GhantaBro
Posting Yak Master
215 Posts |
Posted - 2009-08-20 : 06:24:50
|
| I have a table with 4 billion records which I need to dedupe... (not many duplicates though)..which is more efficient..Select distinct * into new_tbl from old_tabledrop old_tablerename new_tblordeduping is faster by deleting duplicate records from the table? Thanks! |
|
|
DonAtWork
Master Smack Fu Yak Hacker
2167 Posts |
Posted - 2009-08-20 : 06:54:54
|
i can guess which is less disk intensive http://weblogs.sqlteam.com/jeffs/archive/2008/05/13/question-needed-not-answer.aspxHow to ask: http://weblogs.sqlteam.com/brettk/archive/2005/05/25.aspxFor ultra basic questions, follow these links.http://www.sql-tutorial.net/ http://www.firstsql.com/tutor.htm http://www.w3schools.com/sql/default.asp |
 |
|
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2009-08-20 : 09:39:17
|
| If depends, but if the number of dupes is small relative to the size of the table, delete will be faster.It will probably be more work to find the dupes than to delete them.CODO ERGO SUM |
 |
|
|
DonAtWork
Master Smack Fu Yak Hacker
2167 Posts |
Posted - 2009-08-20 : 11:05:55
|
| Either way you do it, make sure you implement a way to PREVENT dupes from getting in there!http://weblogs.sqlteam.com/jeffs/archive/2008/05/13/question-needed-not-answer.aspxHow to ask: http://weblogs.sqlteam.com/brettk/archive/2005/05/25.aspxFor ultra basic questions, follow these links.http://www.sql-tutorial.net/ http://www.firstsql.com/tutor.htm http://www.w3schools.com/sql/default.asp |
 |
|
|
shebert
Yak Posting Veteran
85 Posts |
Posted - 2009-08-20 : 14:27:08
|
| One thing you can do is find the duplicates and copy single instance into a temp table.with query like thisselect assembly_lot_id, count(assembly_lot_id) as occurs from assembly_lot group by assembly_lot_id having (count(assembly_lot_id)>1)then delete the duplicate records in your tablethen replace the single records back in unduplicated |
 |
|
|
|
|
|