| Author |
Topic  |
|
|
cstokes91
Yak Posting Veteran
USA
52 Posts |
Posted - 01/10/2013 : 13:06:17
|
I am trying to split address into house number and address but my problem is that the data is so random it is proving to be a challenge... Sometimes the data starts with a number some a name etc and there is no consistency. I realize this is going to be no where near perfect but I just want something that will take the house number (if the first part of the address is a number) and put that for house number. Otherwise it all goes into the column for address. Also, if there is a direction split that up, too...
Sample data is: 249 N Governors St Cherry Tree 564 glover rd regions lobby west baptist church lost rd and oprah st billy s/d 245754 oak tree dr xpost TOYOTA DEALERSHIP ON HAMILTON 6138 HAPPY LN 5643 hill lake way x pinedale dr snap bend s/d off texas
output: for first record pull out 249 for stno, e for sdir, and the rest for address.
245754, 6138, 5643 would all go into stno and the rest would go into address. I would prefer getting the 564 to go into stno too but I don't know if that is possible.
|
Edited by - cstokes91 on 01/10/2013 13:21:22
|
|
|
robvolk
Most Valuable Yak
USA
15559 Posts |
Posted - 01/10/2013 : 13:23:26
|
What about addresses like "15th Street #234"?
How many addresses do you have to process? Anything over 100K, save your sanity and buy software to do it. MelissaData has a few products that are reasonably priced. From my experience cleaning up shit addresses, it's either manual (<10K addresses) or specialized software. SQL is poorly suited for parsing and extracting address information.
There may be some SSIS components that can do the cleanup/parsing. You might want to check PragmaticWorks: http://www.pragmaticworks.com/Products.aspx |
 |
|
|
webfred
Flowing Fount of Yak Knowledge
Germany
8513 Posts |
|
|
cstokes91
Yak Posting Veteran
USA
52 Posts |
Posted - 01/10/2013 : 13:34:08
|
@webfred I am looking into that now... I have started writing something but I was seeing what people would come up with for this exact situation...
@robvolk My biggest concern is just doing the best we can with what we have and it doesn't by any means have to be perfect and get everything correct. I just want to grab the ones that have numbers at the beginning then throw that into stno, then if they are immediately followed by a direction, throw them into a direction, then whatever else throw it into the address. Whatever doesn't go in perfectly it doesn't matter because it will appear to be the same for the client, just not reporting. |
 |
|
| |
Topic  |
|