Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 Script Library
 Parse a string using CTE.

Author  Topic 

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2008-05-20 : 12:46:53
This method of parsing does not require a tally table or one of the traditional looping methods.

create function ParseString(@String varchar(500), @Delimiter char(1))
returns @Results table
(Ordinal int,
StringValue varchar(500))
as
begin
--function ParseString
--blindman, 5/20/2008
--Parses a string based upon a given single-character delimiter,
--without using loops or a tally table.

--test parameters
--declare @String varchar(500)
--declare @Delimiter char(1)
--set @String = 'rnbqkbnr/pp1ppppp/8/2p5/4P3/5N2/PPPP1PPP/RNBQKB1R'
--set @Delimiter = '/'

with Results as
(select 1 as Ordinal,
left(@String, charindex(@Delimiter, @String + @Delimiter)-1) as StringValue,
convert(varchar(500), right(@String + @Delimiter, len(@String) - charindex(@Delimiter, @String+@Delimiter) + 1)) as Remaining
UNION ALL
select Ordinal+1,
left(Remaining, charindex(@Delimiter, Remaining)-1),
right(Remaining, len(Remaining) - charindex(@Delimiter, Remaining))
from Results
where len(Remaining) > 0)
insert into @Results
(Ordinal,
StringValue)
select Ordinal,
StringValue
from Results
return
end


e4 d5 xd5 Nf6

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2008-05-20 : 13:02:59
I like the technique - it is similar to one Arnold F. posted in this thread:
best split functions
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=50648

I'm sure someone has posted this before but I'm too lazy to search for it, do we know performance comparisons for these techniques as well as CLR functions? Is there a clear winner? One post in the other thread seems to suggest that the recursive cte is the best performer but it didn't mention trying CLR.

Be One with the Optimizer
TG
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2008-05-20 : 13:47:50
Dang. I searched the forum and saw that thread, but did not notice Arnold's post.

e4 d5 xd5 Nf6
Go to Top of Page

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2008-05-20 : 14:37:58
The difference is that the function in this thread is a multi-line TVF whereas mine was inline. As I said at the time, it makes a big difference to the performance.

Edit: Hmm, perhaps "big" is overstating it somewhat. My test table had 155745 rows of total length 1 to 109 characters, between 1 and 18 pieces and piece lengths between 0 and 14 characters, for a total of 911870 pieces.

I got about 37 seconds for this query with your function as an inline TVF and 70 seconds in its original form as a multi-line TVF:

SELECT I.pk, PS.Ordinal, PS.StringValue
INTO #T
FROM dbo.TestInput AS I
CROSS APPLY dbo.ParseString(I.s, ',') AS PS


Your function passes the remaining part of the string through the recursive CTE whereas mine passes the character indexes. This didn't seem to make much difference to the times.
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2008-05-21 : 10:29:32
I'd suspect that your inline variation would be cleaner, and so more efficient. When I get a chance I'll switch mine to inline and then test both versions side by side.

e4 d5 xd5 Nf6
Go to Top of Page
   

- Advertisement -