SQL Server Forums
Profile | Register | Active Topics | Members | Search | Forum FAQ
 
Register Now and get your question answered!
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Merge rows depending on amount of data
 New Topic  Reply to Topic
 Printer Friendly
Author Previous Topic Topic Next Topic  

cidr
Posting Yak Master

United Kingdom
207 Posts

Posted - 05/07/2013 :  04:42:23  Show Profile  Reply with Quote
Hi,

Just a quick one for advice. I have two files based on contact details and there is duplicate data between them.

I'm referencing both tables using Email Address. I need to delete duplicate rows.

However, there may be more fields with data on the other table (home address, phone number may be blank on one row and the other tables matching row, they are populated)

so, for rows where there is a matching email address on the other table I am required to add data to fields that are null where the respective field has data.

This means I capture as much data as I can before deleting rows that have matching email addresses.

I guess the main thing to note is that there is not one table with more data than the other. one table can have more data than the other in some row fields and vice versa.

Just want to ask what the best way is to do this as there is about 30 columns on each table.

Thanks

Edited by - cidr on 05/07/2013 04:53:49

visakh16
Very Important crosS Applying yaK Herder

India
52325 Posts

Posted - 05/07/2013 :  06:10:18  Show Profile  Reply with Quote
i think you need two logics to make sure each sheet updates missing data from the other. then you can apply logic to remove duplicates.
But one question here would be what if you've same row (identified by Email) existing on both sheets but with different set of data? In that case, which one will you retain?

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page

cidr
Posting Yak Master

United Kingdom
207 Posts

Posted - 05/07/2013 :  06:28:40  Show Profile  Reply with Quote
quote:
Originally posted by visakh16

But one question here would be what if you've same row (identified by Email) existing on both sheets but with different set of data? In that case, which one will you retain?


It's a good question. It may be possible that a home address may be slightly different than the other. I'm not too concerned with this scenario however, only with rows that have blank fields.

If table1 has a blank postcode, and table2 has a postcode where there are matching emails in each table, I'd want to add postcode to the rows that is blank. This scenario can happen with table2 having more data than table1 as well.

I can only think of a manual script for this. Is there any functions that can help?
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

India
52325 Posts

Posted - 05/07/2013 :  06:45:39  Show Profile  Reply with Quote
you can write a logic like

UPDATE t1
SET t1.Field1 = COALESCE(t1.Field1,t2.Field1),
 t1.Field2 = COALESCE(t1.Field2,t2.Field2)
...
FROM table1 t1
JOIN table2 t2
ON t2.Email=t1.Email
WHERE t1.Field1 IS NULL 
OR t1.Field2 IS NULL
OR ...

UPDATE t2
SET t2.Field1 = COALESCE(t2.Field1,t1.Field1),
 t2.Field2 = COALESCE(t2.Field2,t1.Field2)
...
FROM table1 t1
JOIN table2 t2
ON t2.Email=t1.Email
WHERE t2.Field1 IS NULL 
OR t2.Field2 IS NULL
OR ...



------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page
  Previous Topic Topic Next Topic  
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.25 seconds. Powered By: Snitz Forums 2000