Replace Duplicates - General Purpose UDF

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

Transact-SQL (2000)

Replace Duplicates - General Purpose UDF

Author

Topic

Kristen
Test

22859 Posts

Posted - 2006-02-15 : 03:22:32

I've been having a look at Corey's code to replace multiple spaces with a single space:
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=56195

Is there any reason why:


Replace(Replace(Replace(myCol, ' ', ' þ'), 'þ ', ''), 'þ', '')

should not have a DOUBLE expression in the first replace?


Replace(Replace(Replace(myCol, '  ', ' þ'), 'þ ', ''), 'þ', '')
                                ^^

otherwise I'm concerned that a) replacing a single space with ' þ' might cause an overflow and b) that I will be doing replaces of single instances where actually there is no work to do!

I was then planning to use my Tally table to find the first character not used in the string, and use that for the "marker", and thus make a general purpose "replace duplicates" UDF.

What's the most efficient way to do that?


SELECT TOP 1 @charMarker = CHAR(MyNumber)
FROM	dbo.MyTallyTable
WHERE	    mb_int_ID >= 65		-- Start at 'A' for this test
	AND CHARINDEX(CHAR(MyNumber), @strData) = 0

or should I use


	AND @strData NOT LIKE '%' + CHAR(MyNumber) + '%'

or something else?

Do I need to start at ASCII 65 ('A') or are the ASCII values 0-31 just fine?

Thanks

Kristen

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2006-02-16 : 06:22:34

I'm sure on the previous occasions where this method was posted here, the first search string did have two spaces in it. As you say, this prevents the possibility that the intermediate result will exceed 8000 characters.

quote:

Do I need to start at ASCII 65 ('A') or are the ASCII values 0-31 just fine?

They should be. They're only characters*, after all! I suspect the number of CHAR(0) characters in most databases is vanishingly small, and that most of them are indicative of errors in the software inserting them or that the column type should have been varbinary.

* Modulo (un)usual collation issues (aargh!)


SELECT n,
  REPLACE(s COLLATE Latin1_General_CS_AS, CHAR(n), ''),
  CHARINDEX(s COLLATE Latin1_General_CS_AS, CHAR(n)),
  REPLACE(s COLLATE SQL_Latin1_General_Cp1_CS_AS, CHAR(n), ''),
  CHARINDEX(s COLLATE SQL_Latin1_General_Cp1_CS_AS, CHAR(n)),
  REPLACE(s COLLATE Latin1_General_BIN, CHAR(n), ''),
  CHARINDEX(s COLLATE Latin1_General_BIN, CHAR(n))
FROM (SELECT 'Test string' AS s) AS A, Numbers
WHERE n BETWEEN 0 AND 31
ORDER BY n

Hmm, seems to only be CHAR(0) that behaves strangely (i.e. wrong)... now I wonder why that might be (bloody C programmers!)

Kristen
Test

22859 Posts

Posted - 2006-02-16 : 07:04:56

"bloody C programmers!"

I wonder how many different NULL jokes there are in the world!

"Modulo (un)usual collation issues (aargh!)"

Aargh indeed

Is ASCII 1-127, inclusive, "safe" then? and 128-255 potentially "dangerous"?

Kristen

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources