SQL Server Forums
Profile | Register | Active Topics | Members | Search | Forum FAQ
 
Register Now and get your question answered!
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Replace part of string with capitalised equivalent
 New Topic  Reply to Topic
 Printer Friendly
Author Previous Topic Topic Next Topic  

chrissyppppp
Starting Member

6 Posts

Posted - 12/19/2012 :  05:35:16  Show Profile  Reply with Quote
We have an issue with instrument names loaded into our data warehouse where they either loaded with the string all in upper case or in normal case (capital at the beginning of each word). We have a function that deals with changing all of the upper case strings into normal case and an example of some of these are below:

Merck & Co Inc Common Stock Usd.5
Newmont Mining Corp Common Stock Usd Inc 1.6

However there are parts of this which we would like to capitalise when it comes to reporting ie in the examples above 'Usd' should become 'USD', 'Inc' should become 'INC', 'Corp' should become 'CORP'.

These are all held in a cross reference table. I've decided to create a function which an example of which is below:


CREATE TABLE capital ([InternalValue] VARCHAR(255), [ExternalValue] VARCHAR(255));

CREATE TABLE instrument ([Instrument] VARCHAR(255));

INSERT INTO capital ([InternalValue], [ExternalValue])
VALUES
('Usd', 'USD'),
('Ltd', 'LTD'),
('Corp', 'CORP'),
('Inc', 'INC')

INSERT INTO instrument ([Instrument])
VALUES
('Merck & Co Inc Common Stock Usd.5'),
('Newmont Mining Corp Common Stock Usd Inc 1.6');

CREATE FUNCTION [dbo].[fn_replacementsecurityname_test] (@string VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN

    WITH ReplaceWord(InternalWord, ExternalWord) AS
    (
        SELECT InternalValue, ExternalValue
        FROM capital
    )
    SELECT @string =  REPLACE(@string, r.InternalWord, r.ExternalWord)
    FROM ReplaceWord r
    WHERE CHARINDEX(r.InternalWord, @string) > 0

    RETURN @string
END

SELECT dbo.replacement(i.Instrument) NewValue
FROM instrument AS i


So this works fine, however how can I ensure that i'm replacing what i want to replace. ie in the above example say we wanted all "Co" to be "CO" but also ensuring that "Common" didn't become "COmmon"? Is there any way we could look at the length of the replacement string and that it only replaces words of that length?

Any help or examples would be greatly appreciated.

bandi
Flowing Fount of Yak Knowledge

India
2218 Posts

Posted - 12/19/2012 :  07:06:26  Show Profile  Reply with Quote
See this example..
If you have identity column in [instrument] table... Then you can very easily adopt the following query

Just run this script once
IF OBJECT_ID('TempDB..#mytable','U') IS NOT NULL
         DROP TABLE #mytable
     IF OBJECT_ID('TempDB..#replacements','U') IS NOT NULL
         DROP TABLE #replacements

--===== Create the test table with 
 CREATE TABLE #mytable 
        (
        ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
        LongString nvarchar(max)
        )

--===== Create table with replacements
CREATE TABLE #replacements
(
  code nvarchar(55)
 ,replacement nvarchar(255)
)


--===== Add sample data
INSERT INTO #mytable (LongString)
select 'code01 something else code02' union all
select 'code02 code02 aacode01'

INSERT INTO #replacements
select 'code01', 'XXXX' union all
select 'code02', 'YYYY'


-- solution
;WITH Tally AS (SELECT top 100 n = ROW_NUMBER() OVER(ORDER BY [name]) from master.dbo.syscolumns),
NormalisedData AS (
	SELECT m.ID,
		WordID = ROW_NUMBER() OVER(PARTITION BY m.ID ORDER BY t.n), 
		m.longstring, 
		word = SUBSTRING(m.longstring, t.n,
		(ISNULL(NULLIF(CHARINDEX(' ', SUBSTRING(m.longstring, t.n, LEN(m.LongString)-t.n)), 0), 
		LEN(SUBSTRING(m.longstring, t.n, LEN(m.LongString)-t.n))+2)) - 1)
	FROM #mytable m
	INNER JOIN Tally t ON SUBSTRING(' ' + m.LongString, t.n, 1) = ' '
	WHERE t.n <= LEN(m.LongString) 
),
ReplacedData AS (
	SELECT d.ID, d.WordID, d.LongString, NewWord = ISNULL(r.replacement, d.word)
	FROM NormalisedData d 
	LEFT JOIN #replacements r ON r.code = d.word
)
SELECT d.ID, OldString = d.LongString,
	NewString = (
		SELECT ' ' + NewWord  
		FROM ReplacedData r
		WHERE r.ID = d.ID
		ORDER BY ID, WordID
		FOR XML PATH(''))
FROM ReplacedData d 
GROUP BY d.ID, d.LongString 


--
Chandu
Go to Top of Page

chrissyppppp
Starting Member

6 Posts

Posted - 12/19/2012 :  07:23:07  Show Profile  Reply with Quote
Thanks for the response. Is there anyway you could convert this into a function? This is what i'm truely after so that i can supply it with a string and it do a lookup to be able to replace the words that need replacing from the cross reference table.

Thanks
Go to Top of Page

bandi
Flowing Fount of Yak Knowledge

India
2218 Posts

Posted - 12/19/2012 :  08:23:55  Show Profile  Reply with Quote
I just converted above query into a function.. You can do better than this..
For time being use following function:

IF OBJECT_ID('mytable','U') IS NOT NULL
         DROP TABLE mytable
     IF OBJECT_ID('replacements','U') IS NOT NULL
         DROP TABLE replacements

--===== Create the test table with 
 CREATE TABLE mytable 
        (
        ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
        LongString nvarchar(max)
        )

--===== Create table with replacements
CREATE TABLE replacements
(
  code nvarchar(55)
 ,replacement nvarchar(255)
)


--===== Add sample data
INSERT INTO mytable (LongString)
select 'code01 something else code02' union all
select 'code02 code02 aacode01'

INSERT INTO replacements
select 'code01', 'XXXX' union all
select 'code02', 'YYYY'
GO
CREATE FUNCTION MyOwnReplace (@str varchar(1000))
RETURNs VARCHAR(2000)
AS
BEGIN
DECLARE @NewString varchar(2000) = ''
;WITH Tally AS 
(SELECT top 100 n = ROW_NUMBER() OVER(ORDER BY [name]) from master.dbo.syscolumns)
,NormalisedData AS (
	SELECT m.ID,
		WordID = ROW_NUMBER() OVER(PARTITION BY m.ID ORDER BY t.n), 
		m.longstring, 
		word = SUBSTRING(m.longstring, t.n,
		(ISNULL(NULLIF(CHARINDEX(' ', SUBSTRING(m.longstring, t.n, LEN(m.LongString)-t.n)), 0), 
		LEN(SUBSTRING(m.longstring, t.n, LEN(m.LongString)-t.n))+2)) - 1)
	FROM mytable m 
	INNER JOIN Tally t ON SUBSTRING(' ' + m.LongString, t.n, 1) = ' ' AND m.LongString = @str
	WHERE t.n <= LEN(m.LongString) 
),
ReplacedData AS (
	SELECT d.ID, d.WordID, d.LongString, NewWord = ISNULL(r.replacement, d.word)
	FROM NormalisedData d 
	LEFT JOIN replacements r ON r.code = d.word
)
SELECT @NewString = (
		SELECT ' ' + NewWord  
		FROM ReplacedData r
		WHERE r.ID = d.ID
		ORDER BY ID, WordID
		FOR XML PATH(''))
FROM ReplacedData d 
GROUP BY d.ID, d.LongString 
RETURN @NewString
END
--Testing
SELECT dbo.MyOwnReplace(LongString), LongString
FROM mytable


--
Chandu
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

India
52317 Posts

Posted - 12/19/2012 :  08:47:35  Show Profile  Reply with Quote
quote:
Originally posted by chrissyppppp

Thanks for the response. Is there anyway you could convert this into a function? This is what i'm truely after so that i can supply it with a string and it do a lookup to be able to replace the words that need replacing from the cross reference table.

Thanks


sorry why do you need a function if you already have string to be replaced as values inside another table?

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

chrissyppppp
Starting Member

6 Posts

Posted - 12/19/2012 :  08:50:21  Show Profile  Reply with Quote
I simply had the temporary table above as an example. I would prefer to use a function due to performance
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

India
52317 Posts

Posted - 12/19/2012 :  08:56:52  Show Profile  Reply with Quote
quote:
Originally posted by chrissyppppp

I simply had the temporary table above as an example. I would prefer to use a function due to performance


so do you mean you will be calling function for replacing each of the words?

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

chrissyppppp
Starting Member

6 Posts

Posted - 12/19/2012 :  09:00:39  Show Profile  Reply with Quote
Yeah so i'll have a stored proc that will call this function and replace any of the words that need capitilising in instrument name column so that the Instrument names look correct in the data extract i'm developing
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

India
52317 Posts

Posted - 12/19/2012 :  09:04:46  Show Profile  Reply with Quote
that means multiple invocations of same function and will be performance hit especially when in case of large datasets.Why not populate a table with all values and do a set based replace in one shot?

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

bandi
Flowing Fount of Yak Knowledge

India
2218 Posts

Posted - 12/19/2012 :  09:11:43  Show Profile  Reply with Quote
quote:
So this works fine, however how can I ensure that i'm replacing what i want to replace. ie in the above example say we wanted all "Co" to be "CO" but also ensuring that "Common" didn't become "COmmon"? Is there any way we could look at the length of the replacement string and that it only replaces words of that length?

Hi simple solution for above problem is:
In your first function, do this modification

SELECT @string = REPLACE(@string, ' '+ r.InternalWord + ' ', ' '+r.ExternalWord+ ' ')
FROM ReplaceWord

--
Chandu
Go to Top of Page

chrissyppppp
Starting Member

6 Posts

Posted - 12/19/2012 :  10:05:12  Show Profile  Reply with Quote
quote:
Originally posted by bandi

quote:
So this works fine, however how can I ensure that i'm replacing what i want to replace. ie in the above example say we wanted all "Co" to be "CO" but also ensuring that "Common" didn't become "COmmon"? Is there any way we could look at the length of the replacement string and that it only replaces words of that length?

Hi simple solution for above problem is:
In your first function, do this modification

SELECT @string = REPLACE(@string, ' '+ r.InternalWord + ' ', ' '+r.ExternalWord+ ' ')
FROM ReplaceWord

--
Chandu



This is what i was thinking as a simple solution, but wanted to see if there was a more robust solution
Go to Top of Page

ScottPletcher
Constraint Violating Yak Guru

USA
411 Posts

Posted - 12/19/2012 :  16:51:05  Show Profile  Reply with Quote
I think using mixed case could just cause headaches for searches.

Even though it takes additional storage, I would materialize a column that was all upper case for doing searches, even if you didn't display that column but instead displayed the mixed-case version of the column.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

India
52317 Posts

Posted - 12/19/2012 :  22:07:53  Show Profile  Reply with Quote
case difference would cause search issues only when database collation used is case sensitive. By default sql server collation is case insensitive so case difference doesnt make any issues.

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

ScottPletcher
Constraint Violating Yak Guru

USA
411 Posts

Posted - 12/21/2012 :  17:24:39  Show Profile  Reply with Quote
quote:
Originally posted by visakh16

case difference would cause search issues only when database collation used is case sensitive. By default sql server collation is case insensitive so case difference doesnt make any issues.

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/




Perhaps. But each db, and even each column, can have a specific collation specified Or if the column value is used in XML
and the XML is searched, case issues will occur again.
Go to Top of Page
  Previous Topic Topic Next Topic  
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.16 seconds. Powered By: Snitz Forums 2000