| Author |
Topic  |
|
|
RyanRandall
Flowing Fount of Yak Knowledge
United Kingdom
1074 Posts |
Posted - 02/13/2007 : 12:18:09
|
Function which filters a string - i.e. takes a string, and returns the same string with specified characters left or removed. Handles unicode characters if specified.
e.g. '1a2b&3,45 c,>DE.$~99X yZ' -> '1a2b345c99y'
IF OBJECT_ID('fn_FilterString') IS NOT NULL DROP FUNCTION dbo.fn_FilterString
GO
CREATE FUNCTION dbo.fn_FilterString
(@String VARCHAR(8000), @Filter VARCHAR(100), @IsUnicode BIT = 0)
RETURNS VARCHAR(8000)
AS
/*
Returns a string with specified characters left or removed. Run the examples for clarity.
Examples:
DECLARE @s VARCHAR(500)
SET @s = '1a2b&3,45 c,>DE.$~99X yZ'
SELECT dbo.fn_FilterString(@s, '_', 0) --identity
SELECT dbo.fn_FilterString(@s, '[0-9]', 0) --just numbers
SELECT dbo.fn_FilterString(@s, '[^0-9]', 0) --no numbers
SELECT dbo.fn_FilterString(@s, '[a-z]', 0) --just letters
SELECT dbo.fn_FilterString(@s, '[a-z]', 1) --just lower case letters
SELECT dbo.fn_FilterString(@s, '[A-Z]', 1) --just upper case letters
SELECT dbo.fn_FilterString(@s, '[^ ,.]', 0) --removes spaces, commas and full stops
*/
BEGIN
DECLARE @StringLength INT
SET @StringLength = LEN(@String)
DECLARE @t TABLE (i INT IDENTITY(1, 1), c CHAR(1))
INSERT @t SELECT SUBSTRING(@String, number, 1) FROM dbo.F_TABLE_NUMBER_RANGE(0, @StringLength) ORDER BY number
--> See http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=47685 for dbo.F_TABLE_NUMBER_RANGE
DECLARE @s VARCHAR(500)
SET @s = ''
SELECT @s = @s + c FROM @t
WHERE (@IsUnicode = 0 AND c LIKE @Filter)
OR (@IsUnicode = 1 AND c COLLATE Latin1_General_BIN LIKE @Filter)
ORDER BY i
RETURN @s
ENDAny ideas for better ways of doing this? 
Ryan Randall www.monsoonmalabar.com London-based IT consultancy
Solutions are easy. Understanding the problem, now, that's the hard part. |
|
|
SwePeso
Patron Saint of Lost Yaks
Sweden
29156 Posts |
Posted - 02/13/2007 : 12:51:07
|
Is 40 times faster good enough?  My suggestion is also case sensitiveCREATE TABLE #Test
(
fuid VARCHAR(36),
nuid nvarchar(36)
)
DECLARE @i INT
SELECT @i = 0
WHILE @i < 10000
BEGIN
INSERT #Test
(
fuid
)
SELECT CAST(NEWID() AS VARCHAR(36))
SET @i = @i + 1
END
UPDATE #Test
SET fuid = LOWER(LEFT(fuid, 18)) + UPPER(RIGHT(fuid, 18))
UPDATE #Test
SET nuid = cast(fuid as nvarchar(36))
select * from #test
go
declare @dt datetime
select @dt = getdate()
select max(dbo.fnfilterstring(fuid, '[A-Cd-f]')),
max(dbo.fnfilterstring(nuid, '[A-Cd-f]'))
from #test
select 'Peso ', datediff(ms, @dt, getdate())
go
declare @dt datetime
select @dt = getdate()
select max(dbo.fn_filterstring(fuid, '[A-Cd-f]', 0)),
max(dbo.fn_filterstring(nuid, '[A-Cd-f]', 0))
from #test
select 'Ryan ', datediff(ms, @dt, getdate())
go
drop table #test Here is my suggestion for the functionCREATE FUNCTION dbo.fnFilterString
(
@String VARCHAR(8000),
@Filter VARCHAR(100)
)
RETURNS VARCHAR(8000)
AS
BEGIN
DECLARE @Index SMALLINT
SET @Index = DATALENGTH(@String)
WHILE @Index > 0
IF SUBSTRING(@String, @Index, 1) COLLATE Latin1_General_BIN LIKE @Filter
SET @Index = @Index - 1
ELSE
SELECT @String = STUFF(@String, @Index, 1, ''),
@Index = @Index - 1
RETURN @String
END Peter Larsson Helsingborg, Sweden |
Edited by - SwePeso on 02/13/2007 13:06:51 |
 |
|
|
RyanRandall
Flowing Fount of Yak Knowledge
United Kingdom
1074 Posts |
Posted - 02/13/2007 : 13:05:35
|
No fair, Peso! 
I wrote this function years ago in SQL 2000, but I noticed the 'loop trick' didn't work directly in 2005 without involving a temporary table - so I rewrote it using one. And surprise, surprise, that's what slows it down.
Good work on the better version 
Ryan Randall www.monsoonmalabar.com London-based IT consultancy
Solutions are easy. Understanding the problem, now, that's the hard part. |
 |
|
|
RyanRandall
Flowing Fount of Yak Knowledge
United Kingdom
1074 Posts |
Posted - 02/13/2007 : 13:18:01
|
By the way, Peso, have you come across this or something like it yet...
DECLARE @s VARCHAR(10)
SET @s = ''
SELECT @s = @s + SUBSTRING('a1b2c3d4e5', number, 1) FROM dbo.F_TABLE_NUMBER_RANGE(0, 10) ORDER BY number
SELECT @s
--SQL 2000 = 'a1b2c3d4e5'
--SQL 2005 = '5' Have you seen any discussions on it?
Ryan Randall www.monsoonmalabar.com London-based IT consultancy
Solutions are easy. Understanding the problem, now, that's the hard part. |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
Sweden
29156 Posts |
Posted - 02/13/2007 : 13:24:07
|
Actually I have another improvement. If there is not valid filter match, what should the faulty character be replaced with? An empty string or something else? There is also an option to replace all redundant replacement characters.ALTER FUNCTION [dbo].[fnFilterString]
(
@Source VARCHAR(8000),
@Filter VARCHAR(8000),
@Replacement CHAR(1),
@SingleBetweenReplacementOnly BIT
)
RETURNS VARCHAR(8000)
AS
BEGIN
DECLARE @Index SMALLINT
SET @Index = DATALENGTH(@Source)
WHILE @Index > 0
IF SUBSTRING(@Source, @Index, 1) COLLATE Latin1_General_BIN LIKE @Filter
SET @Index = @Index - 1
ELSE
SELECT @Source = STUFF(@Source, @Index, 1, @Replacement),
@Index = @Index - 1
IF @SingleBetweenReplacementOnly = 1
BEGIN
WHILE CHARINDEX(@Replacement + @Replacement, @Source) > 0
SET @Source = REPLACE(@Source, @Replacement + @Replacement, @Replacement)
IF LEFT(@Source, 1) = @Replacement
SET @Source = STUFF(@Source, 1, 1, '')
IF RIGHT(@Source, 1) = @Replacement
SET @Source = STUFF(@Source, DATALENGTH(@Source), 1, '')
END
RETURN @Source
END Use this new version with this test code!declare @s varchar(500)
select @s = 'Peso' + char(10) + 'Helsingborg' + CHAR(13) + 'Sweden' + CHAR(13) + CHAR(10) + 'SQL'
print @s
print ''
print dbo.fnFilterString(@s, '[^' + CHAR(13) + CHAR(10) + ']', '_', 0)
print dbo.fnFilterString(@s, '[e]', '*', 1)
print dbo.fnFilterString(@s, '[^n]', '', 0) Peter Larsson Helsingborg, Sweden |
Edited by - SwePeso on 04/20/2007 02:30:44 |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
Sweden
29156 Posts |
Posted - 02/13/2007 : 13:29:04
|
quote: Originally posted by RyanRandall
Have you seen any discussions on it?
Yes, there was one here at SQLTeam some time ago. I remember RockMoose and Kristen were involved. However, this works in SQL Server 2005
DECLARE @s VARCHAR(10)
SET @s = ''
SELECT @s = @s + SUBSTRING('a1b2c3d4e5', number, 1) FROM dbo.F_TABLE_NUMBER_RANGE(1, 10) ORDER BY number
SELECT @s
go
DECLARE @s VARCHAR(10)
SET @s = ''
SELECT @s = @s + z
from (select top 100 percent SUBSTRING('a1b2c3d4e5', number, 1) as z FROM dbo.F_TABLE_NUMBER_RANGE(1, 10) ORDER BY number) as d
SELECT @s
Peter Larsson Helsingborg, Sweden |
 |
|
|
RyanRandall
Flowing Fount of Yak Knowledge
United Kingdom
1074 Posts |
Posted - 02/13/2007 : 14:09:08
|
quote: However, this works in SQL Server 2005
Excellent. Thank you! 
Ryan Randall www.monsoonmalabar.com London-based IT consultancy
Solutions are easy. Understanding the problem, now, that's the hard part. |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
Sweden
29156 Posts |
Posted - 04/14/2008 : 16:08:57
|
A slight variationCREATE FUNCTION dbo.fnExtractDigitSequence
(
@ColumnDelimiter CHAR(1) = ',',
@Filter VARCHAR(8000) = '[0-9]',
@UserData VARCHAR(8000) = NULL
)
RETURNS VARCHAR(8000)
AS
BEGIN
DECLARE @Position INT,
@ValidChar TINYINT,
@LastValidChar TINYINT
SELECT @Position = LEN(@UserData),
@LastValidChar = 0
WHILE @Position > 0
SELECT @ValidChar = CASE
WHEN SUBSTRING(@UserData, @Position, 1) COLLATE Latin1_General_BIN LIKE @Filter THEN 1
ELSE 0
END,
@UserData = CASE
WHEN @ValidChar = 1 THEN @UserData
WHEN @LastValidChar = 0 THEN STUFF(@UserData, @Position, 1, '')
ELSE STUFF(@UserData, @Position, 1, @ColumnDelimiter)
END,
@LastValidChar = @ValidChar,
@Position = @Position - 1
RETURN CASE
WHEN @UserData LIKE @ColumnDelimiter + '%' THEN SUBSTRING(@UserData, 2, 7999)
ELSE NULLIF(@UserData, '')
END
END You can call withSELECT dbo.fnExtractDigitSequence(default, '[0-9]', '...,,,,,123456789,,,,,,.........876564532$$££^^(.........,,,,,')
SELECT dbo.fnExtractDigitSequence(',', default, '...,,,,,123456789,,,,,,.........876564532$$££^^(.........,,,,,')
SELECT dbo.fnExtractDigitSequence(default, default, '...,,,,,123456789,,,,,,.........876564532$$££^^(.........,,,,,')
E 12°55'05.25" N 56°04'39.16" |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
Sweden
29156 Posts |
Posted - 11/12/2008 : 06:57:29
|
If you only want numeric characters from a string?DECLARE @Value NVARCHAR(200)
SET @Value = 'a+(6aaaa02.......()/)(/)=(#¤NIOU#H¤CÖIÖERFÖNKER)123456789/fffffffffffff'
WHILE @Value LIKE '%[^0-9]%'
SET @Value = REPLACE(@Value, SUBSTRING(@Value, PATINDEX('%[^0-9]%', @Value), 1), '')
SELECT @Value
E 12°55'05.63" N 56°04'39.26" |
 |
|
|
madhivanan
Premature Yak Congratulator
India
22469 Posts |
|
|
SwePeso
Patron Saint of Lost Yaks
Sweden
29156 Posts |
Posted - 11/12/2008 : 08:43:37
|
A bit different.
1) I use no tally table where you do 2) I only replace a character once where you loop all characters and check them one by one
E 12°55'05.63" N 56°04'39.26" |
 |
|
|
amachanic
SQL Server MVP
USA
169 Posts |
|
| |
Topic  |
|
|
|