Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Other SQL Server 2008 Topics
 Validating City Names

Author  Topic 

TimSman
Posting Yak Master

127 Posts

Posted - 2013-05-13 : 08:41:30
Not sure exactly where to fit this question, so here goes...

Here's the situation:
We are implementing a new system at work, and part of the set-up requires me to identify unique city, state/province, and country combinations. I have customer and vendor addresses, each coming from different source databases and tables (and different systems). We have addresses in the U.S. and Canada at this point.

I am trying to clean up the city names (I have stuff in place to handle the state/province and country names & codes), and then generate unique codes for each (code format is PPCCCCCC, where PP is the state/province code, and CCCCCC is some abbreviation for the city name). Right now, it's the clean up part that is giving me fits; I have abbreviations and typos galore (Los Angeles, Las Angeles, Las Angels, LA, LosAngeles just to give you some idea).

I have looked at numerous options, but I'm either not sure exactly how to use them (Levenshtein Distance), or I'm unable to determine a good source of data (looking at web services).

I'm curious as to what approaches others have taken, or if anyone can point towards some sort of tool/API (free is always good...). Going forward, I know that we will be implementing standards related to how this data is entered, so if anyone has tips/thoughts on what standards have worked for them, that would be appreciated also.

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-05-14 : 00:47:10
What we did was to use a Azure marketplace data service containing details on cities of US and Canada and create a knowledge base in Data Quality Services (DQS) based on that. Then we used DQS task in our ETL packages (SSIS) to do the cleansing and get correct mapping of city details. Of course we're on SQL 2012 and SSIS 2012 have DQS support.

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2013-05-14 : 10:53:29
We use this service for validation and normalization of US addresses:
http://smartystreets.com/

You can batch validate/normalize a file of addresses, or call the API to validate addresses as they are entered into your system. The result gives you a valid US postal service address along with information like latitude/longitude, county, ZIP+4, etc.

They do not do Canadian addresses, but there are many other providers of this type of service.



CODO ERGO SUM
Go to Top of Page
   

- Advertisement -