| Author |
Topic |
|
cipriani1984
Constraint Violating Yak Guru
304 Posts |
Posted - 2011-09-07 : 09:42:45
|
| Hi,Im trying to write a process that would clean up phone numbers in order to match them up with another table to gain more data.CREATE FUNCTION dbo.get_cleanPhoneNumber_fn (@Phone VARCHAR(20))RETURNS VARCHAR(10)BEGIN DECLARE @retval VARCHAR(10) DECLARE @i TINYINT DECLARE @j TINYINT SET @j = LEN(@Phone) SET @i = 1 SET @retval = '' WHILE @i <= @j AND LEN(@retval) < 10 BEGIN IF LEFT(RIGHT(@Phone,@i),1) IN ('0','1','2','3','4','5','6','7','8','9') SET @retval = LEFT(RIGHT(@Phone,@i),1) + @retval SET @i = @i + 1 END RETURN @retvalENDThen use the following query:SELECT CustomerNumber, dbo.get_cleanPhoneNumber_fn(mobilePhoneIndicator) AS cleanPhoneNumber INTO #customers FROM CurrentNameAddress;The currentNameAddress table is quite large and the whole process takes 5 minutes to complete.Any ideas on how to improve performance through code?Thanks |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2011-09-07 : 09:53:04
|
[code]ALTER FUNCTION dbo.fnGetCleanPhoneNumber( @Phone VARCHAR(20))RETURNS VARCHAR(20)ASBEGIN RETURN ( CASE WHEN SUBSTRING(@Phone, 1, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 1, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 2, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 2, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 3, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 3, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 4, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 4, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 5, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 5, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 6, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 6, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 7, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 7, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 8, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 8, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 9, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 9, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 10, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 10, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 11, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 11, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 12, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 12, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 13, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 13, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 14, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 14, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 15, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 15, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 16, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 16, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 17, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 17, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 18, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 18, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 19, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 19, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 20, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 20, 1) ELSE '' END )END[/code] N 56°04'39.26"E 12°55'05.63" |
 |
|
|
CoderMan1
Starting Member
6 Posts |
Posted - 2011-09-07 : 10:01:07
|
| what is your typical input and your desired output? seems like you could just use REPLACE to get rid of characters that would throw off your matching, like (, ), -, and spaces |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2011-09-07 : 10:17:11
|
I would start withRETURN CASE WHEN @Phone NOT LIKE '%[^0-9]%' THEN @Phone ELSE ... longer, slower code ... so that numbers that are already just-digits will be returnedBetter still would be to use that test externally to avoid calling the function if the number is already clean. |
 |
|
|
cipriani1984
Constraint Violating Yak Guru
304 Posts |
Posted - 2011-09-07 : 10:31:02
|
The problem is the field contains alot of non numeric characters aswell as rogue characters, which causes a problem quote: Originally posted by CoderMan1 what is your typical input and your desired output? seems like you could just use REPLACE to get rid of characters that would throw off your matching, like (, ), -, and spaces
|
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2011-09-07 : 10:36:01
|
| So is Peso's code faster?If there are NO rows that are already "clean", or you've already filtered them out, then ignore my suggestion. |
 |
|
|
cipriani1984
Constraint Violating Yak Guru
304 Posts |
Posted - 2011-09-07 : 10:37:35
|
No his code returns less results than mine :sbut yes it is faster.quote: Originally posted by Kristen So is Peso's code faster?If there are NO rows that are already "clean", or you've already filtered them out, then ignore my suggestion.
|
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2011-09-07 : 11:07:03
|
| "his code returns less results than mine"Fewer phone digits? or fewer rows?I can't see how Peso's code would change the number rows displayed (unless you have something in your WHERE clause that is excluding some of the cleaned-up records for some reason? Perhaps rows where there were no numeric digits at all?) |
 |
|
|
cipriani1984
Constraint Violating Yak Guru
304 Posts |
Posted - 2011-09-07 : 11:19:01
|
it returns less resultsquote: Originally posted by Kristen "his code returns less results than mine"Fewer phone digits? or fewer rows?I can't see how Peso's code would change the number rows displayed (unless you have something in your WHERE clause that is excluding some of the cleaned-up records for some reason? Perhaps rows where there were no numeric digits at all?)
|
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2011-09-07 : 11:57:20
|
| Yeah, you said "less results" before, I don't know what you mean though.Pesos code just takes a string parameter and returns a, modified, string value. It doesn't prevent rows from being included. So what do you mean by "less results"? |
 |
|
|
cipriani1984
Constraint Violating Yak Guru
304 Posts |
Posted - 2011-09-07 : 12:10:01
|
My code ran for 5 minutes and produced 4k resultspeso's code ran for 2 minutes and produced 1.7k results.found less matches than my codethats what i meanquote: Originally posted by Kristen Yeah, you said "less results" before, I don't know what you mean though.Pesos code just takes a string parameter and returns a, modified, string value. It doesn't prevent rows from being included. So what do you mean by "less results"?
|
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2011-09-07 : 14:07:55
|
Post full query, please! N 56°04'39.26"E 12°55'05.63" |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2011-09-07 : 14:15:36
|
You can use this faster approach too, by changing the scalar function to a table function.DROP FUNCTION dbo.fnGetCleanPhoneNumberGOCREATE FUNCTION dbo.fnGetCleanPhoneNumber( @Phone VARCHAR(20))RETURNS TABLEASRETURN ( SELECT CASE WHEN SUBSTRING(@Phone, 1, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 1, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 2, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 2, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 3, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 3, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 4, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 4, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 5, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 5, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 6, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 6, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 7, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 7, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 8, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 8, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 9, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 9, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 10, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 10, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 11, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 11, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 12, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 12, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 13, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 13, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 14, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 14, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 15, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 15, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 16, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 16, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 17, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 17, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 18, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 18, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 19, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 19, 1) ELSE '' END + CASE WHEN SUBSTRING(@Phone, 20, 1) LIKE '[0-9]' THEN SUBSTRING(@Phone, 20, 1) ELSE '' END AS PhoneNumber ) Then your code look like this insteadSELECT cna.CustomerNumber, f.PhoneNumber AS CleanPhoneNumberINTO #CustomersFROM CurrentNameAddress AS cnaOUTER APPLY dbo.fnGetCleanPhoneNumber(cna.MobilePhoneIndicator) AS f How long time does it take now? N 56°04'39.26"E 12°55'05.63" |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2011-09-08 : 01:24:18
|
quote: Originally posted by cipriani1984 My code ran for 5 minutes and produced 4k resultspeso's code ran for 2 minutes and produced 1.7k results.found less matches than my codethats what i mean
OK, I understand now.Can you post a value where there was a difference between your code and Peso's please?Original value, your value, Peso's value are what we need to see. |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2011-09-08 : 01:27:56
|
| Note that Peso's function returns 20 characters, yours only returns 10 - so if the number is more than 10 digits yours will be truncating the result. Maybe there are some entries with >10 digits? - not real phone numbers, obviously ... |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2011-09-08 : 02:13:42
|
If you want the first 10 digits to be returned, change my function from "RETURNS VARCHAR(20)" to "RETURNS VARCHAR(10)". N 56°04'39.26"E 12°55'05.63" |
 |
|
|
|
|
|