| Author |
Topic  |
|
|
chrissyppppp
Starting Member
5 Posts |
Posted - 12/19/2012 : 05:35:16
|
We have an issue with instrument names loaded into our data warehouse where they either loaded with the string all in upper case or in normal case (capital at the beginning of each word). We have a function that deals with changing all of the upper case strings into normal case and an example of some of these are below:
Merck & Co Inc Common Stock Usd.5 Newmont Mining Corp Common Stock Usd Inc 1.6
However there are parts of this which we would like to capitalise when it comes to reporting ie in the examples above 'Usd' should become 'USD', 'Inc' should become 'INC', 'Corp' should become 'CORP'.
These are all held in a cross reference table. I've decided to create a function which an example of which is below:
CREATE TABLE capital ([InternalValue] VARCHAR(255), [ExternalValue] VARCHAR(255));
CREATE TABLE instrument ([Instrument] VARCHAR(255));
INSERT INTO capital ([InternalValue], [ExternalValue])
VALUES
('Usd', 'USD'),
('Ltd', 'LTD'),
('Corp', 'CORP'),
('Inc', 'INC')
INSERT INTO instrument ([Instrument])
VALUES
('Merck & Co Inc Common Stock Usd.5'),
('Newmont Mining Corp Common Stock Usd Inc 1.6');
CREATE FUNCTION [dbo].[fn_replacementsecurityname_test] (@string VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
WITH ReplaceWord(InternalWord, ExternalWord) AS
(
SELECT InternalValue, ExternalValue
FROM capital
)
SELECT @string = REPLACE(@string, r.InternalWord, r.ExternalWord)
FROM ReplaceWord r
WHERE CHARINDEX(r.InternalWord, @string) > 0
RETURN @string
END
SELECT dbo.replacement(i.Instrument) NewValue
FROM instrument AS i
So this works fine, however how can I ensure that i'm replacing what i want to replace. ie in the above example say we wanted all "Co" to be "CO" but also ensuring that "Common" didn't become "COmmon"? Is there any way we could look at the length of the replacement string and that it only replaces words of that length?
Any help or examples would be greatly appreciated. |
|
|
bandi
Flowing Fount of Yak Knowledge
India
1420 Posts |
Posted - 12/19/2012 : 07:06:26
|
See this example.. If you have identity column in [instrument] table... Then you can very easily adopt the following query
Just run this script once
IF OBJECT_ID('TempDB..#mytable','U') IS NOT NULL
DROP TABLE #mytable
IF OBJECT_ID('TempDB..#replacements','U') IS NOT NULL
DROP TABLE #replacements
--===== Create the test table with
CREATE TABLE #mytable
(
ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
LongString nvarchar(max)
)
--===== Create table with replacements
CREATE TABLE #replacements
(
code nvarchar(55)
,replacement nvarchar(255)
)
--===== Add sample data
INSERT INTO #mytable (LongString)
select 'code01 something else code02' union all
select 'code02 code02 aacode01'
INSERT INTO #replacements
select 'code01', 'XXXX' union all
select 'code02', 'YYYY'
-- solution
;WITH Tally AS (SELECT top 100 n = ROW_NUMBER() OVER(ORDER BY [name]) from master.dbo.syscolumns),
NormalisedData AS (
SELECT m.ID,
WordID = ROW_NUMBER() OVER(PARTITION BY m.ID ORDER BY t.n),
m.longstring,
word = SUBSTRING(m.longstring, t.n,
(ISNULL(NULLIF(CHARINDEX(' ', SUBSTRING(m.longstring, t.n, LEN(m.LongString)-t.n)), 0),
LEN(SUBSTRING(m.longstring, t.n, LEN(m.LongString)-t.n))+2)) - 1)
FROM #mytable m
INNER JOIN Tally t ON SUBSTRING(' ' + m.LongString, t.n, 1) = ' '
WHERE t.n <= LEN(m.LongString)
),
ReplacedData AS (
SELECT d.ID, d.WordID, d.LongString, NewWord = ISNULL(r.replacement, d.word)
FROM NormalisedData d
LEFT JOIN #replacements r ON r.code = d.word
)
SELECT d.ID, OldString = d.LongString,
NewString = (
SELECT ' ' + NewWord
FROM ReplacedData r
WHERE r.ID = d.ID
ORDER BY ID, WordID
FOR XML PATH(''))
FROM ReplacedData d
GROUP BY d.ID, d.LongString
-- Chandu |
 |
|
|
chrissyppppp
Starting Member
5 Posts |
Posted - 12/19/2012 : 07:23:07
|
Thanks for the response. Is there anyway you could convert this into a function? This is what i'm truely after so that i can supply it with a string and it do a lookup to be able to replace the words that need replacing from the cross reference table.
Thanks |
 |
|
|
bandi
Flowing Fount of Yak Knowledge
India
1420 Posts |
Posted - 12/19/2012 : 08:23:55
|
I just converted above query into a function.. You can do better than this.. For time being use following function:
IF OBJECT_ID('mytable','U') IS NOT NULL
DROP TABLE mytable
IF OBJECT_ID('replacements','U') IS NOT NULL
DROP TABLE replacements
--===== Create the test table with
CREATE TABLE mytable
(
ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
LongString nvarchar(max)
)
--===== Create table with replacements
CREATE TABLE replacements
(
code nvarchar(55)
,replacement nvarchar(255)
)
--===== Add sample data
INSERT INTO mytable (LongString)
select 'code01 something else code02' union all
select 'code02 code02 aacode01'
INSERT INTO replacements
select 'code01', 'XXXX' union all
select 'code02', 'YYYY'
GO
CREATE FUNCTION MyOwnReplace (@str varchar(1000))
RETURNs VARCHAR(2000)
AS
BEGIN
DECLARE @NewString varchar(2000) = ''
;WITH Tally AS
(SELECT top 100 n = ROW_NUMBER() OVER(ORDER BY [name]) from master.dbo.syscolumns)
,NormalisedData AS (
SELECT m.ID,
WordID = ROW_NUMBER() OVER(PARTITION BY m.ID ORDER BY t.n),
m.longstring,
word = SUBSTRING(m.longstring, t.n,
(ISNULL(NULLIF(CHARINDEX(' ', SUBSTRING(m.longstring, t.n, LEN(m.LongString)-t.n)), 0),
LEN(SUBSTRING(m.longstring, t.n, LEN(m.LongString)-t.n))+2)) - 1)
FROM mytable m
INNER JOIN Tally t ON SUBSTRING(' ' + m.LongString, t.n, 1) = ' ' AND m.LongString = @str
WHERE t.n <= LEN(m.LongString)
),
ReplacedData AS (
SELECT d.ID, d.WordID, d.LongString, NewWord = ISNULL(r.replacement, d.word)
FROM NormalisedData d
LEFT JOIN replacements r ON r.code = d.word
)
SELECT @NewString = (
SELECT ' ' + NewWord
FROM ReplacedData r
WHERE r.ID = d.ID
ORDER BY ID, WordID
FOR XML PATH(''))
FROM ReplacedData d
GROUP BY d.ID, d.LongString
RETURN @NewString
END
--Testing
SELECT dbo.MyOwnReplace(LongString), LongString
FROM mytable
-- Chandu |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
India
47071 Posts |
Posted - 12/19/2012 : 08:47:35
|
quote: Originally posted by chrissyppppp
Thanks for the response. Is there anyway you could convert this into a function? This is what i'm truely after so that i can supply it with a string and it do a lookup to be able to replace the words that need replacing from the cross reference table.
Thanks
sorry why do you need a function if you already have string to be replaced as values inside another table?
------------------------------------------------------------------------------------------------------ SQL Server MVP http://visakhm.blogspot.com/
|
 |
|
|
chrissyppppp
Starting Member
5 Posts |
Posted - 12/19/2012 : 08:50:21
|
| I simply had the temporary table above as an example. I would prefer to use a function due to performance |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
India
47071 Posts |
Posted - 12/19/2012 : 08:56:52
|
quote: Originally posted by chrissyppppp
I simply had the temporary table above as an example. I would prefer to use a function due to performance
so do you mean you will be calling function for replacing each of the words?
------------------------------------------------------------------------------------------------------ SQL Server MVP http://visakhm.blogspot.com/
|
 |
|
|
chrissyppppp
Starting Member
5 Posts |
Posted - 12/19/2012 : 09:00:39
|
| Yeah so i'll have a stored proc that will call this function and replace any of the words that need capitilising in instrument name column so that the Instrument names look correct in the data extract i'm developing |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
India
47071 Posts |
Posted - 12/19/2012 : 09:04:46
|
that means multiple invocations of same function and will be performance hit especially when in case of large datasets.Why not populate a table with all values and do a set based replace in one shot?
------------------------------------------------------------------------------------------------------ SQL Server MVP http://visakhm.blogspot.com/
|
 |
|
|
bandi
Flowing Fount of Yak Knowledge
India
1420 Posts |
Posted - 12/19/2012 : 09:11:43
|
quote: So this works fine, however how can I ensure that i'm replacing what i want to replace. ie in the above example say we wanted all "Co" to be "CO" but also ensuring that "Common" didn't become "COmmon"? Is there any way we could look at the length of the replacement string and that it only replaces words of that length?
Hi simple solution for above problem is: In your first function, do this modification
SELECT @string = REPLACE(@string, ' '+ r.InternalWord + ' ', ' '+r.ExternalWord+ ' ') FROM ReplaceWord
-- Chandu |
 |
|
|
chrissyppppp
Starting Member
5 Posts |
Posted - 12/19/2012 : 10:05:12
|
quote: Originally posted by bandi
quote: So this works fine, however how can I ensure that i'm replacing what i want to replace. ie in the above example say we wanted all "Co" to be "CO" but also ensuring that "Common" didn't become "COmmon"? Is there any way we could look at the length of the replacement string and that it only replaces words of that length?
Hi simple solution for above problem is: In your first function, do this modification
SELECT @string = REPLACE(@string, ' '+ r.InternalWord + ' ', ' '+r.ExternalWord+ ' ') FROM ReplaceWord
-- Chandu
This is what i was thinking as a simple solution, but wanted to see if there was a more robust solution |
 |
|
|
ScottPletcher
Yak Posting Veteran
USA
79 Posts |
Posted - 12/19/2012 : 16:51:05
|
I think using mixed case could just cause headaches for searches.
Even though it takes additional storage, I would materialize a column that was all upper case for doing searches, even if you didn't display that column but instead displayed the mixed-case version of the column. |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
India
47071 Posts |
Posted - 12/19/2012 : 22:07:53
|
case difference would cause search issues only when database collation used is case sensitive. By default sql server collation is case insensitive so case difference doesnt make any issues.
------------------------------------------------------------------------------------------------------ SQL Server MVP http://visakhm.blogspot.com/
|
 |
|
|
ScottPletcher
Yak Posting Veteran
USA
79 Posts |
Posted - 12/21/2012 : 17:24:39
|
quote: Originally posted by visakh16
case difference would cause search issues only when database collation used is case sensitive. By default sql server collation is case insensitive so case difference doesnt make any issues.
------------------------------------------------------------------------------------------------------ SQL Server MVP http://visakhm.blogspot.com/
Perhaps. But each db, and even each column, can have a specific collation specified Or if the column value is used in XML and the XML is searched, case issues will occur again.
|
 |
|
| |
Topic  |
|
|
|