
 Cleaning Data Table


anoushm
Starting Member

2 Posts

Posted - 2012-08-30 : 20:01:23
I have a table that needs to be cleaned, and I need to figure out a way to do it programmatically. I think this type of issue is common, so maybe someone has a solution or suggestion for it.

Basically, the issue is that I have an employer table containing each employer's name and address, and the same employer's name is spelled several different ways. For example, McDonalds appears as Mc Donalds, McDdonalds, The McDonalds, and MacDonalds.

I need to figure out a way to have one correct common name for employers that have the same address. Basically, the table needs to be cleaned. Is it possible to do this programmatically in a SQL script? The data is stored in a SQL Server 2008 database.

Thanks in advance

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-08-30 : 21:20:20
You need either a master list of correct names or a fuzzy matching algorithm to do this. In both cases it will be an approximate method and will require several iterations to get it fixed.
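
To illustrate the master list plus fuzzy matching idea, here is a minimal sketch in Python rather than T-SQL. The master list, the `normalize` and `match_employer` helpers, and the 0.8 similarity cutoff are all assumptions made up for the example, not something from the original table. It uses the standard library's `difflib.get_close_matches`, which is only one of several possible fuzzy matching techniques.

```python
import difflib

# Hypothetical master list of canonical employer names (assumed for illustration).
MASTER_NAMES = ["McDonalds", "Burger King", "Walmart"]


def normalize(name):
    """Lowercase, drop a leading 'the', and keep only letters and digits."""
    cleaned = name.strip().lower()
    if cleaned.startswith("the "):
        cleaned = cleaned[4:]
    return "".join(ch for ch in cleaned if ch.isalnum())


# Map each normalized master name back to its canonical spelling.
NORMALIZED_MASTER = {normalize(n): n for n in MASTER_NAMES}


def match_employer(raw_name, cutoff=0.8):
    """Return the closest canonical name, or None if nothing is similar enough."""
    candidates = difflib.get_close_matches(
        normalize(raw_name), list(NORMALIZED_MASTER), n=1, cutoff=cutoff
    )
    return NORMALIZED_MASTER[candidates[0]] if candidates else None


if __name__ == "__main__":
    for raw in ["Mc Donalds", "McDdonalds", "The McDonalds", "MacDonalds", "Acme Corp"]:
        print(raw, "->", match_employer(raw))
```

Names that come back as `None` are the ones to review by hand and, where appropriate, add to the master list before the next pass, which is where the several iterations come in.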

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/


sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2012-08-30 : 21:40:22
quote:
I need to figure out a way to have one correct common name for employers that have the same address.
If the duplicates really do have the SAME ADDRESS, as you said, then you can group by the address and find which ones have dups. However, I highly doubt that the addresses are precise when the employers' names were entered with such wanton abandon!
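
A rough sketch of the group-by-address idea, written in Python for illustration. The sample rows, the `normalize_address` helper, and the rule of picking the most frequent spelling as the canonical name are all assumptions, not part of the original table. It is roughly the equivalent of a GROUP BY on the address that keeps only groups with more than one distinct name.

```python
from collections import Counter, defaultdict

# Hypothetical (name, address) rows; real data would come from the employer table.
SAMPLE_ROWS = [
    ("McDonalds", "12 Main St."),
    ("Mc Donalds", "12 main st"),
    ("McDonalds", "12 Main St"),
    ("Acme", "5 Oak Ave"),
]


def normalize_address(address):
    """Lowercase, drop commas and periods, and collapse repeated whitespace."""
    stripped = address.lower().replace(",", " ").replace(".", " ")
    return " ".join(stripped.split())


def canonical_names_by_address(rows):
    """For each address with more than one spelling, pick the most frequent name."""
    names_by_address = defaultdict(Counter)
    for name, address in rows:
        names_by_address[normalize_address(address)][name.strip()] += 1
    return {
        address: counts.most_common(1)[0][0]
        for address, counts in names_by_address.items()
        if len(counts) > 1  # only addresses that actually have duplicate spellings
    }


if __name__ == "__main__":
    print(canonical_names_by_address(SAMPLE_ROWS))
```

This only helps to the extent that the addresses can be normalized to the same string, so messy addresses will need their own cleanup first.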

anoushm
Starting Member

2 Posts

Posted - 2012-08-31 : 06:26:12
Yep, the addresses are even worse. This table is horrible. So I may need to ask one of the PMs to clean this table in an Excel file, and then I will import it back into the db. But if there are any other ideas, please suggest them.

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-08-31 : 10:18:35
Mostly this is what we do as part of a data quality exercise. MS even has Data Quality Services, which we can utilise for data cleansing.

