masterdineen
Aged Yak Warrior
550 Posts |
Posted - 2013-01-13 : 16:01:15
|
Hello there. Does anyone know of any good de-duping methods between two tables? Or is there no common or best-practice way of doing it, other than simply de-duping on a unique column that exists in both tables and using subqueries, i.e. (WHERE NOT IN) or (EXISTS)? I don't have an example yet; I'm just wondering if there are any good ideas. Any help would be appreciated. Thank you. |
|
jimf
Master Smack Fu Yak Hacker
2875 Posts |
Posted - 2013-01-13 : 17:23:39
|
It really depends on what you are trying to accomplish. EXISTS and NOT EXISTS are good options. So are INTERSECT and EXCEPT, as well as MERGE with its WHEN MATCHED and WHEN NOT MATCHED BY SOURCE clauses. The better question is, why do you need to de-dup between two tables?
Jim
Everyday I learn something that somebody else already knew |
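As a rough illustration of the set-based options named above, here is a minimal sketch. It assumes two hypothetical tables, `table_a` and `table_b`, that share a unique `id` column, and it runs against an in-memory SQLite database through Python's `sqlite3` module purely so the example is self-contained. The NOT EXISTS, EXCEPT, and INTERSECT queries use standard syntax that SQL Server also accepts; MERGE is not shown because SQLite does not support it.

```python
import sqlite3

# Hypothetical tables and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO table_a VALUES (1, 'ann'), (2, 'bob'), (3, 'cat');
    INSERT INTO table_b VALUES (2, 'bob'), (3, 'cat'), (4, 'dan');
""")

# NOT EXISTS: rows of table_a whose key has no match in table_b.
only_in_a = conn.execute("""
    SELECT a.id
    FROM table_a AS a
    WHERE NOT EXISTS (SELECT 1 FROM table_b AS b WHERE b.id = a.id)
    ORDER BY a.id
""").fetchall()

# EXCEPT: rows of table_a not in table_b, comparing every selected column.
except_rows = conn.execute("""
    SELECT id, name FROM table_a
    EXCEPT
    SELECT id, name FROM table_b
    ORDER BY id
""").fetchall()

# INTERSECT: rows present in both tables, i.e. the duplicates.
in_both = conn.execute("""
    SELECT id, name FROM table_a
    INTERSECT
    SELECT id, name FROM table_b
    ORDER BY id
""").fetchall()

print(only_in_a, except_rows, in_both)
```

Note the difference in scope: NOT EXISTS here compares only the key, while EXCEPT and INTERSECT compare every column in the select list, so a row with the same `id` but a different `name` would count as "different" under EXCEPT.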
|
|
Jeff Moden
Aged Yak Warrior
652 Posts |
Posted - 2013-01-13 : 21:59:31
|
JimF touched on many of the methods above. The reason why someone would want to do this is typically in the area of ETL. I consider it foolhardy to try to import data directly into a final table. I think it's much safer to load the data into a staging table, validate it, identify what is new and what must be updated, and only then start adding to or modifying the target table. It usually turns out to be faster as well, because I don't generally have to do joined inserts or updates on a table that is in use. There is no blocking to worry about on the staging table.
--Jeff Moden
RBAR is pronounced "ree-bar" and is a "Modenism" for "Row By Agonizing Row".
First step towards the paradigm shift of writing Set Based code: "Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
When writing schedules, keep the following in mind: "If you want it real bad, that's the way you'll likely get it." |
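The staging workflow described above (load, validate, identify new versus changed rows, then touch the target) might look roughly like the following sketch. The `staging` and `target` tables, their columns, and the validation rule are all assumptions made up for this example, and SQLite (via Python's `sqlite3`) stands in for SQL Server so the code can run on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE target  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE staging (id INTEGER, name TEXT);  -- no constraints: accepts raw input
    INSERT INTO target  VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO staging VALUES (2, 'bobby'), (3, 'cat'), (NULL, 'bad'), (4, NULL);
""")

# 1. Validate in staging: discard rows that fail a (hypothetical) basic check.
rejected = conn.execute(
    "DELETE FROM staging WHERE id IS NULL OR name IS NULL"
).rowcount

# 2. Identify what must be updated and what is new, without touching target yet.
to_update = conn.execute("""
    SELECT s.id FROM staging AS s
    WHERE EXISTS (SELECT 1 FROM target AS t
                  WHERE t.id = s.id AND t.name <> s.name)
    ORDER BY s.id
""").fetchall()
to_insert = conn.execute("""
    SELECT s.id FROM staging AS s
    WHERE NOT EXISTS (SELECT 1 FROM target AS t WHERE t.id = s.id)
    ORDER BY s.id
""").fetchall()

# 3. Only now modify the target; the context manager commits on success.
with conn:
    conn.execute("""
        UPDATE target
        SET name = (SELECT s.name FROM staging AS s WHERE s.id = target.id)
        WHERE EXISTS (SELECT 1 FROM staging AS s
                      WHERE s.id = target.id AND s.name <> target.name)
    """)
    conn.execute("""
        INSERT INTO target (id, name)
        SELECT s.id, s.name FROM staging AS s
        WHERE NOT EXISTS (SELECT 1 FROM target AS t WHERE t.id = s.id)
    """)

final = conn.execute("SELECT id, name FROM target ORDER BY id").fetchall()
print(rejected, to_update, to_insert, final)
```

The update uses a scalar subquery, so this sketch assumes the staging table holds at most one row per `id` once validation is done.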
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2013-01-13 : 22:30:34
|
We dump the incoming data into a staging table and then do all validations, checks, transformations, etc., as Jeff suggested. The logic for data transfer from source to staging is a straight pull. For inserts/updates, we use datetime fields to compare source and destination and then insert or update accordingly. To compare, we can use several methods:
1. MERGE
2. EXISTS / NOT EXISTS
3. LEFT JOIN / INNER JOIN
4. IN / NOT IN
------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/ |
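One possible reading of the datetime comparison described above, using the LEFT JOIN / INNER JOIN method from the list, is sketched below. The table names, the `modified_at` column, and the sample rows are hypothetical. SQLite (through Python's `sqlite3`) is used only to keep the example runnable; it stores the timestamps as ISO 8601 text, which sorts correctly as strings, whereas SQL Server would use a real datetime column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging     (id INTEGER PRIMARY KEY, name TEXT, modified_at TEXT);
    CREATE TABLE destination (id INTEGER PRIMARY KEY, name TEXT, modified_at TEXT);
    INSERT INTO destination VALUES
        (1, 'ann', '2013-01-10 09:00:00'),
        (2, 'bob', '2013-01-10 09:00:00');
    INSERT INTO staging VALUES
        (1, 'ann',   '2013-01-10 09:00:00'),  -- unchanged
        (2, 'bobby', '2013-01-12 14:30:00'),  -- newer than the destination row
        (3, 'cat',   '2013-01-13 08:15:00');  -- not in the destination yet
""")

# LEFT JOIN anti-join: staged rows with no destination match are inserts.
new_ids = [row[0] for row in conn.execute("""
    SELECT s.id
    FROM staging AS s
    LEFT JOIN destination AS d ON d.id = s.id
    WHERE d.id IS NULL
    ORDER BY s.id
""")]

# INNER JOIN: matched rows with a later staged timestamp are updates.
changed_ids = [row[0] for row in conn.execute("""
    SELECT s.id
    FROM staging AS s
    INNER JOIN destination AS d ON d.id = s.id
    WHERE s.modified_at > d.modified_at
    ORDER BY s.id
""")]

print(new_ids, changed_ids)
```

Rows that match on `id` and carry the same timestamp fall into neither list, so they are left alone.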