SQL Server Forums
 Cleaning Data Table

anoushm
Starting Member

2 Posts

Posted - 08/30/2012 :  20:01:23
I have a table that needs to be cleaned, and I need to figure out a way to do it programmatically. I think this type of issue is common, and maybe someone has a solution or suggestion for it.

Basically the issue is this: I have an employer table that contains each employer's name and address. The same employer's name is spelled several different ways. For example, McDonalds is spelled Mc Donalds, McDdonalds, The McDonalds, and MacDonalds.

I need to figure out a way to have one correct common name for employers that have the same address. Basically, the table needs to be cleaned. Is it possible to do this programmatically in a SQL script? The data is stored in a SQL Server 2008 database.

Thanks in advance

visakh16
Very Important crosS Applying yaK Herder

India
52325 Posts

Posted - 08/30/2012 :  21:20:20
You have to have a master list for doing this, or make use of a fuzzy matching algorithm. In both cases it would be an approximate method and would require several iterations to get it fixed.
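
To illustrate the master-list-plus-fuzzy-matching idea, here is a minimal sketch in Python using the standard library's difflib. The master list, the helper names, and the 0.8 similarity cutoff are all assumptions made up for illustration, not anything from the actual table; the cutoff in particular would need tuning over several passes, with a person reviewing the matches.

```python
import difflib

# Hypothetical master list of canonical employer names (an assumption).
MASTER_NAMES = ["McDonalds", "Burger King", "Wendys"]


def normalize(name):
    """Lowercase the name and keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def canonical_name(raw_name, cutoff=0.8):
    """Return the closest master name, or None if nothing is similar enough."""
    # Map each normalized master name back to its canonical spelling.
    lookup = {normalize(master): master for master in MASTER_NAMES}
    matches = difflib.get_close_matches(
        normalize(raw_name), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None


if __name__ == "__main__":
    for raw in ["Mc Donalds", "McDdonalds", "The McDonalds", "MacDonalds", "Acme Corp"]:
        print(raw, "->", canonical_name(raw))
```

Names that come back as None are the ones to review by hand or add to the master list before the next pass.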

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/


sunitabeck
Flowing Fount of Yak Knowledge

5155 Posts

Posted - 08/30/2012 :  21:40:22
quote:
I need to figure out a way to have one correct common name for employers that have the same address.
If what you said about duplicates having the SAME ADDRESS is true, then you can group by the address and find which ones have dups. However, I highly doubt that you have precise addresses when the employers' names were entered with such wanton abandon!
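
As a rough sketch of the group-by-address check, the snippet below runs a GROUP BY / HAVING query against a throwaway in-memory SQLite database from Python. SQLite is only a stand-in for SQL Server here, and the table name, column names, and sample rows are hypothetical. The query uses plain standard SQL, so a similar statement should work on SQL Server, but it only catches addresses that match character for character.

```python
import sqlite3

# In-memory SQLite database standing in for the real SQL Server table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employer (name TEXT, address TEXT)")

# Hypothetical sample rows; the real table's columns may differ.
conn.executemany(
    "INSERT INTO employer (name, address) VALUES (?, ?)",
    [
        ("McDonalds", "1 Main St"),
        ("Mc Donalds", "1 Main St"),
        ("MacDonalds", "1 Main St"),
        ("Burger King", "9 Oak Ave"),
    ],
)

# Addresses that appear with more than one distinct employer name.
DUPLICATE_QUERY = """
    SELECT address, COUNT(DISTINCT name) AS name_count
    FROM employer
    GROUP BY address
    HAVING COUNT(DISTINCT name) > 1
"""

duplicates = conn.execute(DUPLICATE_QUERY).fetchall()
for address, name_count in duplicates:
    print(address, name_count)
```

Each row returned is an address whose names need reconciling; addresses that differ even slightly (for example "1 Main St" versus "1 Main Street") land in separate groups and will be missed.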

Edited by - sunitabeck on 08/30/2012 21:40:33

anoushm
Starting Member

2 Posts

Posted - 08/31/2012 :  06:26:12
Yep, the addresses are even worse. This table is horrible. So I may need to ask one of the PMs to clean this table in an Excel file, and then I will import it back into the db. But if there are any other ideas, please give suggestions.

visakh16
Very Important crosS Applying yaK Herder

India
52325 Posts

Posted - 08/31/2012 :  10:18:35
Mostly this is what we do as part of a data quality exercise. MS even has Data Quality Services, which we can utilise for data cleansing.

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC