SQL Server Forums
Profile | Register | Active Topics | Members | Search | Forum FAQ
 
Register Now and get your question answered!
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 General SQL Server Forums
 Script Library
 Filtering a string
 New Topic  Reply to Topic
 Printer Friendly
Author Previous Topic Topic Next Topic  

RyanRandall
Flowing Fount of Yak Knowledge

United Kingdom
1074 Posts

Posted - 02/13/2007 :  12:18:09  Show Profile  Reply with Quote
Function which filters a string - i.e. takes a string, and returns the same string with specified characters left or removed. Handles unicode characters if specified.

e.g. '1a2b&3,45 c,>DE.$~99X yZ' -> '1a2b345c99y'

IF OBJECT_ID('fn_FilterString') IS NOT NULL DROP FUNCTION dbo.fn_FilterString
GO
CREATE FUNCTION dbo.fn_FilterString
    (@String VARCHAR(8000), @Filter VARCHAR(100), @IsUnicode BIT = 0)
RETURNS VARCHAR(8000)
AS
/*
Returns a string with specified characters left or removed. Run the examples for clarity.

Examples:

DECLARE @s VARCHAR(500)
SET @s = '1a2b&3,45 c,>DE.$~99X yZ'

SELECT dbo.fn_FilterString(@s, '_', 0) --identity
SELECT dbo.fn_FilterString(@s, '[0-9]', 0) --just numbers
SELECT dbo.fn_FilterString(@s, '[^0-9]', 0) --no numbers
SELECT dbo.fn_FilterString(@s, '[a-z]', 0) --just letters
SELECT dbo.fn_FilterString(@s, '[a-z]', 1) --just lower case letters
SELECT dbo.fn_FilterString(@s, '[A-Z]', 1) --just upper case letters
SELECT dbo.fn_FilterString(@s, '[^ ,.]', 0) --removes spaces, commas and full stops
*/
BEGIN

DECLARE @StringLength INT
SET @StringLength = LEN(@String)

DECLARE @t TABLE (i INT IDENTITY(1, 1), c CHAR(1))
INSERT @t SELECT SUBSTRING(@String, number, 1) FROM dbo.F_TABLE_NUMBER_RANGE(0, @StringLength) ORDER BY number
--> See http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=47685 for dbo.F_TABLE_NUMBER_RANGE

DECLARE @s VARCHAR(500)
SET @s = ''

SELECT @s = @s + c FROM @t
WHERE (@IsUnicode = 0 AND c LIKE @Filter)
   OR (@IsUnicode = 1 AND c COLLATE Latin1_General_BIN LIKE @Filter)
ORDER BY i

RETURN @s

END
Any ideas for better ways of doing this?


Ryan Randall
www.monsoonmalabar.com London-based IT consultancy

Solutions are easy. Understanding the problem, now, that's the hard part.

SwePeso
Patron Saint of Lost Yaks

Sweden
30282 Posts

Posted - 02/13/2007 :  12:51:07  Show Profile  Visit SwePeso's Homepage  Reply with Quote
Is 40 times faster good enough?
My suggestion is also case sensitive
CREATE TABLE	#Test
		(
			fuid VARCHAR(36),
			nuid nvarchar(36)
		)

DECLARE	@i INT

SELECT	@i = 0

WHILE @i < 10000
	BEGIN
		INSERT	#Test
			(
				fuid
			)
		SELECT	CAST(NEWID() AS VARCHAR(36))

		SET @i = @i + 1
	END

UPDATE	#Test
SET	fuid = LOWER(LEFT(fuid, 18)) + UPPER(RIGHT(fuid, 18))

UPDATE	#Test
SET	nuid = cast(fuid as nvarchar(36))

select * from #test

go
declare @dt datetime
select @dt = getdate()

select	max(dbo.fnfilterstring(fuid, '[A-Cd-f]')),
	max(dbo.fnfilterstring(nuid, '[A-Cd-f]'))
from	#test

select 'Peso ', datediff(ms, @dt, getdate())

go
declare @dt datetime
select @dt = getdate()

select	max(dbo.fn_filterstring(fuid, '[A-Cd-f]', 0)),
	max(dbo.fn_filterstring(nuid, '[A-Cd-f]', 0))
from	#test

select 'Ryan ', datediff(ms, @dt, getdate())
go

drop table #test

Here is my suggestion for the function
CREATE FUNCTION dbo.fnFilterString
(
	@String VARCHAR(8000),
	@Filter VARCHAR(100)
)
RETURNS VARCHAR(8000)
AS
BEGIN
	DECLARE	@Index SMALLINT

	SET	@Index = DATALENGTH(@String)

	WHILE @Index > 0
		IF SUBSTRING(@String, @Index, 1) COLLATE Latin1_General_BIN LIKE @Filter
			SET	@Index = @Index - 1
		ELSE
			SELECT	@String = STUFF(@String, @Index, 1, ''),
				@Index =  @Index - 1

	RETURN @String
END

Peter Larsson
Helsingborg, Sweden

Edited by - SwePeso on 02/13/2007 13:06:51
Go to Top of Page

RyanRandall
Flowing Fount of Yak Knowledge

United Kingdom
1074 Posts

Posted - 02/13/2007 :  13:05:35  Show Profile  Reply with Quote
No fair, Peso!

I wrote this function years ago in SQL 2000, but I noticed the 'loop trick' didn't work directly in 2005 without involving a temporary table - so I rewrote it using one. And surprise, surprise, that's what slows it down.

Good work on the better version


Ryan Randall
www.monsoonmalabar.com London-based IT consultancy

Solutions are easy. Understanding the problem, now, that's the hard part.
Go to Top of Page

RyanRandall
Flowing Fount of Yak Knowledge

United Kingdom
1074 Posts

Posted - 02/13/2007 :  13:18:01  Show Profile  Reply with Quote
By the way, Peso, have you come across this or something like it yet...

DECLARE @s VARCHAR(10)
SET @s = ''
SELECT @s = @s + SUBSTRING('a1b2c3d4e5', number, 1) FROM dbo.F_TABLE_NUMBER_RANGE(0, 10) ORDER BY number
SELECT @s

--SQL 2000 = 'a1b2c3d4e5'
--SQL 2005 = '5'

Have you seen any discussions on it?


Ryan Randall
www.monsoonmalabar.com London-based IT consultancy

Solutions are easy. Understanding the problem, now, that's the hard part.
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

Sweden
30282 Posts

Posted - 02/13/2007 :  13:24:07  Show Profile  Visit SwePeso's Homepage  Reply with Quote
Actually I have another improvement. If there is not valid filter match, what should the faulty character be replaced with?
An empty string or something else? There is also an option to replace all redundant replacement characters.
ALTER FUNCTION [dbo].[fnFilterString]
(
	@Source VARCHAR(8000),
	@Filter VARCHAR(8000),
	@Replacement CHAR(1),
	@SingleBetweenReplacementOnly BIT
)
RETURNS VARCHAR(8000)
AS
BEGIN
	DECLARE	@Index SMALLINT

	SET	@Index = DATALENGTH(@Source)

	WHILE @Index > 0
		IF SUBSTRING(@Source, @Index, 1) COLLATE Latin1_General_BIN LIKE @Filter
			SET	@Index = @Index - 1
		ELSE
			SELECT	@Source = STUFF(@Source, @Index, 1, @Replacement),
				@Index =  @Index - 1

	IF @SingleBetweenReplacementOnly = 1
		BEGIN
			WHILE CHARINDEX(@Replacement + @Replacement, @Source) > 0
				SET @Source = REPLACE(@Source, @Replacement + @Replacement, @Replacement)

			IF LEFT(@Source, 1) = @Replacement
				SET @Source = STUFF(@Source, 1, 1, '')

			IF RIGHT(@Source, 1) = @Replacement
				SET @Source = STUFF(@Source, DATALENGTH(@Source), 1, '')
		END

	RETURN @Source
END
Use this new version with this test code!
declare @s varchar(500)

select @s = 'Peso' + char(10) + 'Helsingborg' + CHAR(13) + 'Sweden' + CHAR(13) + CHAR(10) + 'SQL'

print @s

print ''

print dbo.fnFilterString(@s, '[^' + CHAR(13) + CHAR(10) + ']', '_', 0)
print dbo.fnFilterString(@s, '[e]', '*', 1)
print dbo.fnFilterString(@s, '[^n]', '', 0)

Peter Larsson
Helsingborg, Sweden

Edited by - SwePeso on 04/20/2007 02:30:44
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

Sweden
30282 Posts

Posted - 02/13/2007 :  13:29:04  Show Profile  Visit SwePeso's Homepage  Reply with Quote
quote:
Originally posted by RyanRandall

Have you seen any discussions on it?
Yes, there was one here at SQLTeam some time ago. I remember RockMoose and Kristen were involved.
However, this works in SQL Server 2005

DECLARE @s VARCHAR(10)
SET @s = ''
SELECT @s = @s + SUBSTRING('a1b2c3d4e5', number, 1) FROM dbo.F_TABLE_NUMBER_RANGE(1, 10) ORDER BY number
SELECT @s
go

DECLARE @s VARCHAR(10)
SET @s = ''
SELECT @s = @s + z
from (select top 100 percent SUBSTRING('a1b2c3d4e5', number, 1) as z FROM dbo.F_TABLE_NUMBER_RANGE(1, 10) ORDER BY number) as d

SELECT @s


Peter Larsson
Helsingborg, Sweden
Go to Top of Page

RyanRandall
Flowing Fount of Yak Knowledge

United Kingdom
1074 Posts

Posted - 02/13/2007 :  14:09:08  Show Profile  Reply with Quote
quote:
However, this works in SQL Server 2005
Excellent. Thank you!

Ryan Randall
www.monsoonmalabar.com London-based IT consultancy

Solutions are easy. Understanding the problem, now, that's the hard part.
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

Sweden
30282 Posts

Posted - 04/14/2008 :  16:08:57  Show Profile  Visit SwePeso's Homepage  Reply with Quote
A slight variation
CREATE FUNCTION dbo.fnExtractDigitSequence
(
	@ColumnDelimiter CHAR(1) = ',',
	@Filter VARCHAR(8000) = '[0-9]',
	@UserData VARCHAR(8000) = NULL
)
RETURNS VARCHAR(8000)
AS
BEGIN

DECLARE	@Position INT,
	@ValidChar TINYINT,
	@LastValidChar TINYINT

SELECT	@Position = LEN(@UserData),
	@LastValidChar = 0

WHILE @Position > 0
	SELECT	@ValidChar =	CASE
					WHEN SUBSTRING(@UserData, @Position, 1) COLLATE Latin1_General_BIN LIKE @Filter THEN 1
					ELSE 0
				END,
		@UserData =	CASE
					WHEN @ValidChar = 1 THEN @UserData
					WHEN @LastValidChar = 0 THEN STUFF(@UserData, @Position, 1, '')
					ELSE STUFF(@UserData, @Position, 1, @ColumnDelimiter)
				END,
		@LastValidChar = @ValidChar,
		@Position = @Position - 1

	RETURN	CASE
			WHEN @UserData LIKE @ColumnDelimiter + '%' THEN SUBSTRING(@UserData, 2, 7999)
			ELSE NULLIF(@UserData, '')
		END
END
You can call with
SELECT	dbo.fnExtractDigitSequence(default, '[0-9]', '...,,,,,123456789,,,,,,.........876564532$$££^^(.........,,,,,')
SELECT	dbo.fnExtractDigitSequence(',', default, '...,,,,,123456789,,,,,,.........876564532$$££^^(.........,,,,,')
SELECT	dbo.fnExtractDigitSequence(default, default, '...,,,,,123456789,,,,,,.........876564532$$££^^(.........,,,,,')




E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

Sweden
30282 Posts

Posted - 11/12/2008 :  06:57:29  Show Profile  Visit SwePeso's Homepage  Reply with Quote
If you only want numeric characters from a string?
DECLARE @Value NVARCHAR(200)

SET     @Value = 'a+(6aaaa02.......()/)(/)=(#¤NIOU#H¤CÖIÖERFÖNKER)123456789/fffffffffffff'

WHILE @Value LIKE '%[^0-9]%'
        SET     @Value = REPLACE(@Value, SUBSTRING(@Value, PATINDEX('%[^0-9]%', @Value), 1), '')

SELECT  @Value



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

madhivanan
Premature Yak Congratulator

India
22772 Posts

Posted - 11/12/2008 :  08:13:00  Show Profile  Send madhivanan a Yahoo! Message  Reply with Quote
http://sqlblogcasts.com/blogs/madhivanan/archive/2007/12/18/extract-only-numbers-from-a-string.aspx

Madhivanan

Failing to plan is Planning to fail
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

Sweden
30282 Posts

Posted - 11/12/2008 :  08:43:37  Show Profile  Visit SwePeso's Homepage  Reply with Quote
A bit different.

1) I use no tally table where you do
2) I only replace a character once where you loop all characters and check them one by one


E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

amachanic
SQL Server MVP

USA
169 Posts

Posted - 11/12/2008 :  19:02:00  Show Profile  Visit amachanic's Homepage  Reply with Quote
And another version, designed to properly mimic the behavior of REPLACE in all cases:

http://sqlblog.com/blogs/adam_machanic/archive/2006/07/12/pattern-based-replacement-udf.aspx

---
Adam Machanic
SQL Server MVP
http://www.datamanipulation.net
Go to Top of Page
  Previous Topic Topic Next Topic  
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.12 seconds. Powered By: Snitz Forums 2000