Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Deleting invalid zip codes

Author  Topic 

Lionheart
Starting Member

41 Posts

Posted - 2008-09-10 : 10:19:20
I have a database which has a field containing zip codes for the US. Some of the data has been entered poorly, and as a result there are a lot of zip codes with errors.

Does anybody have a code that will either:

a) clean invalid (i.e. non numeric) characters from the field

or

b) delete all invalid zip codes completely

Thanks

LH

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-09-10 : 10:40:09
a) yes.
b) yes.

select * from table1 where zipcode like '%[^0-9]%'

Post some samples of the data you have and explain what is invalid for you.
Please include both valid and invalid zip codes.



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2008-09-10 : 11:33:05
As Peso says, the first step isn't writing code, it is analyzing your data and writing out the logic you wish to apply.

- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page

Lionheart
Starting Member

41 Posts

Posted - 2008-09-10 : 16:36:51
Thanks for the help. Having thought it through I worked it out using the following - thought I would post it just in case someone else has the same issue.

select state, zipcode
from location
where country = 'US' and postalcode not between '0' and '99999' and postalcode <> ''
group by postalcode, statecode
order by postalcode desc


This created a list of bad zipcodes in a state. Then I cleaned using

update location
set zipcode = '99999'
from location
where country = 'US' and zipcode not between '0' and '99999' and zipcode <> ''


99999 is set as default for this task.
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2008-09-10 : 16:44:14
I don't know how many ZIP Codes you have, or how bad the data is, but keep in mind that

0abcdefh
0 34 11-22344
0000022222333344444
01
99
999xxx

... are considered a "good" ZIP Codes according to your criteria. Those values are between "0" and "99999", and not equal to "".

Don't just look at the bad zip codes your algorithm returns when testing your code -- look at the ones it considers "good" as well.

- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2008-09-10 : 16:48:10
Also, read about the LIKE operator. It will probably do what you want. To be really strict and only allow strings with exactly 5 numbers, you can write:

where zipcode like '[0-9][0-9][0-9][0-9][0-9]'

... and things along those lines.

- Jeff
http://weblogs.sqlteam.com/JeffS
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-09-10 : 17:20:04
What about "ZIP+4" zipcodes in the USA?



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2008-09-10 : 17:53:51
quote:
Originally posted by Lionheart

Thanks for the help. Having thought it through I worked it out using the following - thought I would post it just in case someone else has the same issue.

select state, zipcode
from location
where country = 'US' and postalcode not between '0' and '99999' and postalcode <> ''
group by postalcode, statecode
order by postalcode desc


This created a list of bad zipcodes in a state. Then I cleaned using

update location
set zipcode = '99999'
from location
where country = 'US' and zipcode not between '0' and '99999' and zipcode <> ''


99999 is set as default for this task.


Your select fails to identify these bad zip codes:

select
postalcode
from
( -- Test Data
select postalcode = '0aaaa' union all
select postalcode = '9 999'
) a
where
postalcode not between '0' and '99999' and postalcode <> ''


Results:
postalcode
----------

(0 row(s) affected)

You should look at the suggestions other people have posted on this thread.

You should also consider getting a subscription to a list of valid zip codes. Many sources are available.

I ahve used this one. US $99.95 a year for commercial edition subscription with monthly updates or $59.95 a year for monthly subscription for the the basic edition:
http://www.zipcodedownload.com/Products/Category/USA/







CODO ERGO SUM
Go to Top of Page
   

- Advertisement -