| Author |
Topic  |
|
|
lane0618
Posting Yak Master
134 Posts |
Posted - 06/22/2005 : 12:53:56
|
I want to be able to return only the numbers from an address field. What is the best way to do this?
Address Field: 1746 Wellesley P.O. Box 3175 4141 65th Street 773 Florecita Terrace
Should return: 1736 3177 415165 770
|
|
|
nathans
Aged Yak Warrior
USA
933 Posts |
Posted - 06/22/2005 : 13:51:30
|
This script will extract digits from the column, beginning with the first number and continuing until the first space (which I append to those values ending with a number).
* Beware of addresses like 'P.O. Box 69L' which would return '69L'
You might have to customize to fit your needs, but this should help get you going. Post again with any questions.
Uncomment the start and length columns to get an idea of how its working
declare @address table (address varchar(100))
insert into @address
select '1746 Wellesley' union
select 'P.O. Box 3175' union
select '4141 65th Street' union
select '773 Florecita Terrace'
select address,
-- patindex('%[0-9]%', address + space(1)) as 'start',
-- charindex(' ', address + space(1), patindex('%[0-9]%', address)) as 'length',
case
when patindex('%[^0-9]%' , address) > 0
then substring(address, patindex('%[0-9]%', address), charindex(' ', address + space(1), patindex('%[0-9]%', address)))
else address
end as 'numeric'
from @address |
 |
|
|
lane0618
Posting Yak Master
134 Posts |
Posted - 06/22/2005 : 14:24:35
|
Thanks, its close but here where a few that i didn't work on:
ROUTE 4 BOX 454 709 HIGH TOP RD PO BOX 92 4970 SABAL PALM BLVD. #208
Thanks! |
 |
|
|
TG
Flowing Fount of Yak Knowledge
USA
5501 Posts |
Posted - 06/22/2005 : 14:52:26
|
what will you use this for?
this should do it:
------------------------------------------------------------------------
if object_id('dbo.fnNumbersFromStr') > 0
drop function dbo.fnNumbersFromStr
go
create function dbo.fnNumbersFromStr(@str varchar(8000))
returns varchar(8000)
as
begin
declare @n varchar(8000)
declare @Numbers table (rowid int identity(1,1), c varchar(1))
insert @Numbers (c)
select substring(string,n,1)
from (select @str string) a
cross join
(--any number table, better if it exists
--this one is derived with just 32 numbers
select n1+n2+n3+n4+n5 n
from
(select 0 n1 union select 1) n1
,(select 0 n2 union select 2) n2
,(select 0 n3 union select 4) n3
,(select 0 n4 union select 8) n4
,(select 0 n5 union select 16) n5
) b
where n <= len(string)
and substring(string,n,1) like '[0-9]'
order by n
select @n = coalesce(@n + c, c)
from @Numbers
order by rowid
return @n
end
------------------------------------------------------------------------
--DDL/DML
set nocount on
declare @address table (Address varchar(200))
insert @address
select '1746 Wellesley' union
select 'P.O. Box 3175' union
select '4141 65th Street' union
select '773 Florecita Terrace' union
select 'ROUTE 4 BOX 454' union
select '709 HIGH TOP RD PO BOX 92' union
select '4970 SABAL PALM BLVD. #208'
-----------------------------------------------------------------------
--Statement to get just the numbers from @address
select address, dbo.fnNumbersFromStr(address) JustTheNumbers
from @Address
Be One with the Optimizer TG |
 |
|
|
nathans
Aged Yak Warrior
USA
933 Posts |
Posted - 06/22/2005 : 15:05:10
|
oops... :)
Try this one, but again, it return the first number encountered followed by a space (or blank as mentioned above). If you specifically want the numbers following the 'Box' then you will have to extend the case function to favor them.
declare @address table (address varchar(100))
insert into @address
select '1746 Wellesley' union
select 'P.O. Box 3175' union
select '4141 65th Street' union
select '773 Florecita Terrace' union
select 'ROUTE 4 BOX 454' union
select '709 HIGH TOP RD PO BOX 92' union
select '4970 SABAL PALM BLVD. #208'
select address,
-- patindex('%[0-9]%', address) as 'start',
-- charindex(space(1), address + space(1), patindex('%[0-9]%', address)) - (patindex('%[0-9]%', address)) as 'length',
case
when patindex('%[^0-9]%' , address) > 0
then substring(address, patindex('%[0-9]%', address), charindex(space(1), address + space(1), patindex('%[0-9]%', address)) - (patindex('%[0-9]%', address)))
else address
end as 'numeric'
from @address
|
Edited by - nathans on 06/22/2005 15:06:17 |
 |
|
|
X002548
Not Just a Number
15586 Posts |
Posted - 06/22/2005 : 15:25:55
|
quote: Originally posted by lane0618
I want to be able to return only the numbers from an address field. What is the best way to do this?
Address Field: 1746 Wellesley P.O. Box 3175 4141 65th Street 773 Florecita Terrace
Should return: 1736 3177 415165 770
Is that a type-o?
Why are the resultant numbers different?
Brett
8-)
Hint: Want your questions answered fast? Follow the direction in this link http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx
|
 |
|
|
X002548
Not Just a Number
15586 Posts |
Posted - 06/22/2005 : 16:53:47
|
OK, but that doesn't get them all the numbers...how about
USE Northwind
GO
SET NOCOUNT ON
CREATE TABLE numbers(n int ,[CHAR] char(1))
GO
DECLARE @n int
SET NOCOUNT ON
SELECT @n = 48
WHILE @n < 58
BEGIN
INSERT INTO numbers(n, [CHAR]) SELECT @n, CHAR(@n)
SELECT @n = @n + 1
END
GO
CREATE FUNCTION udf_x(@x varchar(8000))
RETURNS varchar(8000)
AS
BEGIN
DECLARE @n int, @MAX_n int, @z varchar(8000)
DECLARE @y table(pos int, value char(1))
SELECT @MAX_n = LEN(@x), @n = 1, @z = ''
WHILE @N<=@MAX_n
BEGIN
INSERT INTO @y(pos,value) SELECT @n, SUBSTRING(@x,@n,1)
SELECT @n = @n + 1
END
SELECT @z = @z + [CHAR] FROM @y JOIN numbers ON [CHAR] = value ORDER BY pos
RETURN @z
END
GO
SELECT dbo.udf_x('123 Main Apt 456')
GO
SELECT dbo.udf_x(ShipAddress) FROM Orders
GO
SET NOCOUNT OFF
DROP TABLE numbers
DROP FUNCTION udf_x
GO
Brett
8-)
Hint: Want your questions answered fast? Follow the direction in this link http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx
|
 |
|
|
nathans
Aged Yak Warrior
USA
933 Posts |
Posted - 06/22/2005 : 16:57:30
|
Yes, I didnt read closely enough. He just wants to remove non-numerics.
Both udfs look right on.
|
Edited by - nathans on 06/22/2005 16:59:00 |
 |
|
|
X002548
Not Just a Number
15586 Posts |
|
|
nathans
Aged Yak Warrior
USA
933 Posts |
Posted - 06/22/2005 : 17:08:36
|
| Yea, I cant imagine why that function has any real world application. |
 |
|
|
X002548
Not Just a Number
15586 Posts |
|
|
nathans
Aged Yak Warrior
USA
933 Posts |
Posted - 06/22/2005 : 17:23:04
|
quote: You see 2?
I just wanted to give both you and TG credit :)
|
 |
|
|
TG
Flowing Fount of Yak Knowledge
USA
5501 Posts |
Posted - 06/22/2005 : 17:25:04
|
quote: You see 2?
Damn is it 5:00 ALREADY?
Did you start without me?
Did I post in stealth mode again? Or is it just so far up the thread that nobody noticed?
EDIT: Except my MAN nathans. Thanks for the credit!
Be One with the Optimizer TG
|
Edited by - TG on 06/22/2005 17:27:04 |
 |
|
|
X002548
Not Just a Number
15586 Posts |
|
|
Seventhnight
Flowing Fount of Yak Knowledge
USA
2878 Posts |
Posted - 06/23/2005 : 09:29:25
|
Way I see it you can return the numbers without seperating them, or with seperating them:
without separation:
create function dbo.fnNumbersFromStr(@str varchar(8000))
returns varchar(8000)
as
begin
while patindex('%[^0-9]%',@str)>0
Set @str = rtrim(ltrim(replace(@str,substring(@str,patindex('%[^0-9]%',@str),1),'')))
return @str
end
go
with separation
create function dbo.fnNumbersFromStr(@str varchar(8000))
returns varchar(8000)
as
begin
while patindex('%[^ 0-9]%',@str)>0
Set @str = replace(replace(replace(rtrim(ltrim(replace(@str,substring(@str,patindex('%[^ 0-9]%',@str),1),''))),' ',' þ'),'þ ',''),'þ','')
return @str
end
go
EDIT: after re-reading the original post, no separation is needed, but its interesting 
Corey
 Co-worker on The Wizard of Oz "...those three midgets that came out and danced, the freaked me out when I was little. But they are ok now."  |
Edited by - Seventhnight on 06/23/2005 09:30:17 |
 |
|
|
TG
Flowing Fount of Yak Knowledge
USA
5501 Posts |
Posted - 06/23/2005 : 10:03:19
|
Well sure, if you like simple, elegant code that performs well that will work too.

btw, I think you can loose the ltrim(rtrim))
Be One with the Optimizer TG |
 |
|
|
sfrigard
Starting Member
5 Posts |
Posted - 06/23/2005 : 12:19:18
|
I noticed that you used a UDF to solve this problem. Now for simple numeric extractions this will work fine, there is a significant performance hit involved when processing a lot of records.
On a recent project we had to strip out nonnumeric’s on multi-million record tables. We tested many approaches including UDF’s, Regular Expression(XP call) and Case statements. For example I took the UDF example provided (udf_x) and used it to process a 10 million record table and it took over 4 1/2 hrs before I killed it. Using Case based version took less that 3 minutes to run on the same 10 million record table.
Here is the code using the examples provided.
create table dbo.address (address varchar(100)) insert into address select '1746 Wellesley' union select 'P.O. Box 3175' union select '4141 65th Street' union select '773 Florecita Terrace' union select 'ROUTE 4 BOX 454' union select '709 HIGH TOP RD PO BOX 92' union select '4970 SABAL PALM BLVD. #208' go select case patindex('%[0-9]%', left(isnull(address,''), 38)) when 0 then convert(varchar(38), '') else case when patindex('[0-9]', substring(address, 1, 1)) = 1 then substring(address, 1, 1) else '' end + case when patindex('[0-9]', substring(address, 2, 1)) = 1 then substring(address, 2, 1) else '' end + case when patindex('[0-9]', substring(address, 3, 1)) = 1 then substring(address, 3, 1) else '' end + case when patindex('[0-9]', substring(address, 4, 1)) = 1 then substring(address, 4, 1) else '' end + case when patindex('[0-9]', substring(address, 5, 1)) = 1 then substring(address, 5, 1) else '' end + case when patindex('[0-9]', substring(address, 6, 1)) = 1 then substring(address, 6, 1) else '' end + case when patindex('[0-9]', substring(address, 7, 1)) = 1 then substring(address, 7, 1) else '' end + case when patindex('[0-9]', substring(address, 8, 1)) = 1 then substring(address, 8, 1) else '' end + case when patindex('[0-9]', substring(address, 9, 1)) = 1 then substring(address, 9, 1) else '' end + case when patindex('[0-9]', substring(address,10, 1)) = 1 then substring(address,10, 1) else '' end + case when patindex('[0-9]', substring(address,11, 1)) = 1 then substring(address,11, 1) else '' end + case when patindex('[0-9]', substring(address,12, 1)) = 1 then substring(address,12, 1) else '' end + case when patindex('[0-9]', substring(address,13, 1)) = 1 then substring(address,13, 1) else '' end + case when patindex('[0-9]', substring(address,14, 1)) = 1 then substring(address,14, 1) else '' end + case when patindex('[0-9]', substring(address,15, 1)) = 1 then substring(address,15, 1) else '' end + case when patindex('[0-9]', substring(address,16, 1)) = 1 then substring(address,16, 1) else '' end + case when patindex('[0-9]', substring(address,17, 1)) = 1 then substring(address,17, 1) else '' end + case when patindex('[0-9]', substring(address,18, 1)) = 1 then substring(address,18, 1) else '' end + case when patindex('[0-9]', substring(address,19, 1)) = 1 then substring(address,19, 1) else '' end + case when patindex('[0-9]', substring(address,20, 1)) = 1 then substring(address,20, 1) else '' end + case when patindex('[0-9]', substring(address,21, 1)) = 1 then substring(address,21, 1) else '' end + case when patindex('[0-9]', substring(address,22, 1)) = 1 then substring(address,22, 1) else '' end + case when patindex('[0-9]', substring(address,23, 1)) = 1 then substring(address,23, 1) else '' end + case when patindex('[0-9]', substring(address,24, 1)) = 1 then substring(address,24, 1) else '' end + case when patindex('[0-9]', substring(address,25, 1)) = 1 then substring(address,25, 1) else '' end + case when patindex('[0-9]', substring(address,26, 1)) = 1 then substring(address,26, 1) else '' end + case when patindex('[0-9]', substring(address,27, 1)) = 1 then substring(address,27, 1) else '' end + case when patindex('[0-9]', substring(address,28, 1)) = 1 then substring(address,28, 1) else '' end + case when patindex('[0-9]', substring(address,29, 1)) = 1 then substring(address,29, 1) else '' end + case when patindex('[0-9]', substring(address,30, 1)) = 1 then substring(address,30, 1) else '' end + case when patindex('[0-9]', substring(address,31, 1)) = 1 then substring(address,31, 1) else '' end + case when patindex('[0-9]', substring(address,32, 1)) = 1 then substring(address,32, 1) else '' end + case when patindex('[0-9]', substring(address,33, 1)) = 1 then substring(address,33, 1) else '' end + case when patindex('[0-9]', substring(address,34, 1)) = 1 then substring(address,34, 1) else '' end + case when patindex('[0-9]', substring(address,35, 1)) = 1 then substring(address,35, 1) else '' end + case when patindex('[0-9]', substring(address,36, 1)) = 1 then substring(address,36, 1) else '' end + case when patindex('[0-9]', substring(address,37, 1)) = 1 then substring(address,37, 1) else '' end + case when patindex('[0-9]', substring(address,38, 1)) = 1 then substring(address,38, 1) else '' end end as num from dbo.address
|
 |
|
|
Anjel One
Starting Member
1 Posts |
Posted - 01/23/2008 : 14:48:18
|
Just a late coming addendum, I have a real world example: Data cleansing.
We have two columns (generalizing and relabeling here for clarity): "Max Engine Mechanical Output" and "Mechanical Output Units"
Both used to be hand entered and varchar types (bad design if you ask me). Now, we want an int "...Output" column and a constrained "...Units" column.
I have to go back and parse through both columns (where people entered both units and output values in both) and separate.
I'm still designing the solution, but patindex(), substring(), and the trims are definitely how I'm going about it. |
 |
|
|
madhivanan
Premature Yak Congratulator
India
22469 Posts |
|
| |
Topic  |
|
|
|