Merge rows depending on amount of data

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2008 Forums

Transact-SQL (2008)

Merge rows depending on amount of data

Author

Topic

cidr
Posting Yak Master

207 Posts

Posted - 2013-05-07 : 04:42:23

Hi,

Just a quick one for advice. I have two files based on contact details and there is duplicate data between them.

I'm referencing both tables using Email Address. I need to delete duplicate rows.

However, there may be more fields with data on the other table (home address, phone number may be blank on one row and the other tables matching row, they are populated)

so, for rows where there is a matching email address on the other table I am required to add data to fields that are null where the respective field has data.

This means I capture as much data as I can before deleting rows that have matching email addresses.

I guess the main thing to note is that there is not one table with more data than the other. one table can have more data than the other in some row fields and vice versa.

Just want to ask what the best way is to do this as there is about 30 columns on each table.

Thanks

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-05-07 : 06:10:18

i think you need two logics to make sure each sheet updates missing data from the other. then you can apply logic to remove duplicates.
But one question here would be what if you've same row (identified by Email) existing on both sheets but with different set of data? In that case, which one will you retain?

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs

cidr
Posting Yak Master

207 Posts

Posted - 2013-05-07 : 06:28:40

quote:
Originally posted by visakh16

But one question here would be what if you've same row (identified by Email) existing on both sheets but with different set of data? In that case, which one will you retain?

It's a good question. It may be possible that a home address may be slightly different than the other. I'm not too concerned with this scenario however, only with rows that have blank fields.

If table1 has a blank postcode, and table2 has a postcode where there are matching emails in each table, I'd want to add postcode to the rows that is blank. This scenario can happen with table2 having more data than table1 as well.

I can only think of a manual script for this. Is there any functions that can help?

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-05-07 : 06:45:39

you can write a logic like


UPDATE t1
SET t1.Field1 = COALESCE(t1.Field1,t2.Field1),
 t1.Field2 = COALESCE(t1.Field2,t2.Field2)
...
FROM table1 t1
JOIN table2 t2
ON t2.Email=t1.Email
WHERE t1.Field1 IS NULL 
OR t1.Field2 IS NULL
OR ...

UPDATE t2
SET t2.Field1 = COALESCE(t2.Field1,t1.Field1),
 t2.Field2 = COALESCE(t2.Field2,t1.Field2)
...
FROM table1 t1
JOIN table2 t2
ON t2.Email=t1.Email
WHERE t2.Field1 IS NULL 
OR t2.Field2 IS NULL
OR ...

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources