Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 Script Library
 How to concatenate records without UDF

Author  Topic 

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-03-27 : 11:45:03
[code]-- Prepare sample data
DECLARE @Sample TABLE (ID INT, Code VARCHAR(3))

INSERT @Sample
SELECT 290780, 'LT' UNION ALL
SELECT 290780, 'AY' UNION ALL
SELECT 290781, 'ILS' UNION ALL
SELECT 290780, 'AY'

-- Show the expected output
SELECT DISTINCT s1.ID,
STUFF((SELECT DISTINCT TOP 100 PERCENT ',' + s2.CODE FROM @Sample AS s2 WHERE s2.ID = s1.ID ORDER BY ',' + s2.CODE FOR XML PATH('')), 1, 1, '') AS CODES
FROM @Sample AS s1
ORDER BY s1.ID

SELECT DISTINCT s1.ID,
STUFF((SELECT TOP 100 PERCENT ',' + s2.CODE FROM @Sample AS s2 WHERE s2.ID = s1.ID ORDER BY ',' + s2.CODE FOR XML PATH('')), 1, 1, '') AS CODES
FROM @Sample AS s1
ORDER BY s1.ID

SELECT DISTINCT s1.ID,
STUFF((SELECT ',' + s2.CODE FROM @Sample AS s2 WHERE s2.ID = s1.ID FOR XML PATH('')), 1, 1, '') AS CODES
FROM @Sample AS s1
ORDER BY s1.ID[/code]

Peter Larsson
Helsingborg, Sweden

snSQL
Master Smack Fu Yak Hacker

1837 Posts

Posted - 2007-03-27 : 12:59:38
Note that this will work on SQL Server 2005, but not 2000.
Go to Top of Page

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2007-04-13 : 04:29:29
Great STUFF

Did you compare this with 2000's UDF approach?
Which is faster?

Madhivanan

Failing to plan is Planning to fail
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-04-13 : 08:38:53
You tell me. Didn't you just volunteer to do the test and post back the result here? []


Peter Larsson
Helsingborg, Sweden
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2007-04-13 : 09:37:09
for xml path is the fastest way to do this yet.

if you have to contencate a lot (1k+) of items then CLR gets the upper hand.

_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2007-04-13 : 12:27:45
quote:
Originally posted by Peso

You tell me. Didn't you just volunteer to do the test and post back the result here? []


Peter Larsson
Helsingborg, Sweden

I dont use 2005

Madhivanan

Failing to plan is Planning to fail
Go to Top of Page

shedoks
Starting Member

10 Posts

Posted - 2007-09-14 : 10:22:33
Hi.
I know something of sql but this is too complex for me.
Would you be kind to explain this code little bit detailed to me.


Thanks
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-09-14 : 10:27:58
1) There is a subquery which returns the records I want. The example above has three cases.
2) The subquery is translated into a flat XML string.
3) I remove the first concatenation character.
4) This is returned for every "main" record, so I have to DISTINCT it.

But for your own sake, I think it would be better if you took your time to take the query apart and learn every step.




E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-09-19 : 05:20:30
"Did you compare this with 2000's UDF approach?
Which is faster?
"

The XML route is relatively slow.

My results are:

Elapsed
-------- -----------
Method 1 280 XML DISTINCT TOP 100 PERCENT
Method 2 266 XML TOP 100 PERCENT
Method 3 233 XML (No order)
Method 4 63 @Variable string concatenation
Method 5a 233 UDF with CASE (Single)
Method 5b 296 UDF with CASE (Multiple)
Method 6a 236 UDF with COALESCE (Single)
Method 6b 296 UDF with COALESCE (Multiple)
Method 7a 233 UDF with IsNull (Single)
Method 7b 280 UDF with IsNull (Multiple)

Test code:

SET NOCOUNT ON

CREATE TABLE dbo.TEMP_SAMPLE
(
[ID] INT,
[Code] VARCHAR(3)
)
GO

INSERT dbo.TEMP_SAMPLE
SELECT 290780, 'LT' UNION ALL
SELECT 290780, 'AY' UNION ALL
SELECT 290781, 'ILS' UNION ALL
SELECT 290780, 'AY'
GO

CREATE FUNCTION dbo.TEMP_ConcatSample_V1(@ID int)
RETURNS VARCHAR(8000)
AS
BEGIN
DECLARE @strOutput VARCHAR(8000)

SET @strOutput = ''
SELECT @strOutput = CASE @strOutput
WHEN '' THEN [Code]
ELSE @strOutput + ',' + [Code]
END
FROM dbo.TEMP_SAMPLE
WHERE [ID] = @ID
ORDER BY [Code]

RETURN @strOutput
END
GO

CREATE FUNCTION dbo.TEMP_ConcatSample_V2(@ID int)
RETURNS VARCHAR(8000)
AS
BEGIN
DECLARE @strOutput VARCHAR(8000)

SELECT @strOutput = COALESCE(@strOutput + ',', '') + [Code]
FROM dbo.TEMP_SAMPLE
WHERE [ID] = @ID
ORDER BY [Code]

RETURN @strOutput
END
GO

CREATE FUNCTION dbo.TEMP_ConcatSample_V3(@ID int)
RETURNS VARCHAR(8000)
AS
BEGIN
DECLARE @strOutput VARCHAR(8000)

SELECT @strOutput = IsNull(@strOutput + ',', '') + [Code]
FROM dbo.TEMP_SAMPLE
WHERE [ID] = @ID
ORDER BY [Code]

RETURN @strOutput
END
GO


DECLARE @dtStart datetime,
@intLoop int,
@strResult varchar(8000)

-- Show the expected output
SELECT @intLoop = 1000
SELECT @dtStart = GetDate()
WHILE @intLoop > 0
BEGIN
SELECT @strResult = ''
SELECT @strResult = @strResult + [CODES]
FROM
(
SELECT DISTINCT s1.ID,
STUFF((SELECT DISTINCT TOP 100 PERCENT ',' + s2.CODE FROM dbo.TEMP_SAMPLE AS s2 WHERE s2.ID = s1.ID ORDER BY ',' + s2.CODE FOR XML PATH('')), 1, 1, '') AS CODES
FROM dbo.TEMP_SAMPLE AS s1
) AS X
ORDER BY [ID]
SELECT @intLoop = @intLoop - 1
END
SELECT [Result] = @strResult
SELECT 'Method 1', [Elapsed] = DATEDIFF(Millisecond, @dtStart, GetDate())

SELECT @intLoop = 1000, @strResult = ''
SELECT @dtStart = GetDate()
WHILE @intLoop > 0
BEGIN
SELECT @strResult = ''
SELECT @strResult = @strResult + [CODES]
FROM
(
SELECT DISTINCT s1.ID,
STUFF((SELECT TOP 100 PERCENT ',' + s2.CODE FROM dbo.TEMP_SAMPLE AS s2 WHERE s2.ID = s1.ID ORDER BY ',' + s2.CODE FOR XML PATH('')), 1, 1, '') AS CODES
FROM dbo.TEMP_SAMPLE AS s1
) AS X
ORDER BY [ID]
SELECT @intLoop = @intLoop - 1
END
SELECT [Result] = @strResult
SELECT 'Method 2', [Elapsed] = DATEDIFF(Millisecond, @dtStart, GetDate())

SELECT @intLoop = 1000, @strResult = ''
SELECT @dtStart = GetDate()
WHILE @intLoop > 0
BEGIN
SELECT @strResult = ''
SELECT @strResult = @strResult + [CODES]
FROM
(
SELECT DISTINCT s1.ID,
STUFF((SELECT ',' + s2.CODE FROM dbo.TEMP_SAMPLE AS s2 WHERE s2.ID = s1.ID FOR XML PATH('')), 1, 1, '') AS CODES
FROM dbo.TEMP_SAMPLE AS s1
) AS X
ORDER BY [ID]
SELECT @intLoop = @intLoop - 1
END
SELECT [Result] = @strResult
SELECT 'Method 3', [Elapsed] = DATEDIFF(Millisecond, @dtStart, GetDate())


SELECT @intLoop = 1000, @strResult = ''
SELECT @dtStart = GetDate()
WHILE @intLoop > 0
BEGIN

SELECT @strResult = NULL
SELECT @strResult = COALESCE(@strResult + ',', '') + [Code]
FROM
(
SELECT [ID],
[Code]
FROM dbo.TEMP_SAMPLE
WHERE [ID] = 290780
) AS X
ORDER BY [Code]

SELECT @intLoop = @intLoop - 1
END
SELECT [Result] = @strResult
SELECT 'Method 4', [Elapsed] = DATEDIFF(Millisecond, @dtStart, GetDate())


SELECT @intLoop = 1000, @strResult = ''
SELECT @dtStart = GetDate()
WHILE @intLoop > 0
BEGIN
SELECT @strResult = ''
SELECT @strResult = @strResult + [CODES]
FROM
(
SELECT [ID],
[CODES] = dbo.TEMP_ConcatSample_V1([ID])
FROM dbo.TEMP_SAMPLE
WHERE [ID] = 290780
) AS X
ORDER BY [ID]
SELECT @intLoop = @intLoop - 1
END
SELECT [Result] = @strResult
SELECT 'Method 5a', [Elapsed] = DATEDIFF(Millisecond, @dtStart, GetDate())

SELECT @intLoop = 1000, @strResult = ''
SELECT @dtStart = GetDate()
WHILE @intLoop > 0
BEGIN
SELECT @strResult = ''
SELECT @strResult = @strResult + [CODES]
FROM
(
SELECT [ID],
[CODES] = dbo.TEMP_ConcatSample_V1([ID])
FROM dbo.TEMP_SAMPLE
-- WHERE [ID] = 290780
) AS X
ORDER BY [ID]
SELECT @intLoop = @intLoop - 1
END
SELECT [Result] = @strResult
SELECT 'Method 5b', [Elapsed] = DATEDIFF(Millisecond, @dtStart, GetDate())


SELECT @intLoop = 1000, @strResult = ''
SELECT @dtStart = GetDate()
WHILE @intLoop > 0
BEGIN
SELECT @strResult = ''
SELECT @strResult = @strResult + [CODES]
FROM
(
SELECT [ID],
[CODES] = dbo.TEMP_ConcatSample_V2([ID])
FROM dbo.TEMP_SAMPLE
WHERE [ID] = 290780
) AS X
ORDER BY [ID]
SELECT @intLoop = @intLoop - 1
END
SELECT [Result] = @strResult
SELECT 'Method 6a', [Elapsed] = DATEDIFF(Millisecond, @dtStart, GetDate())

SELECT @intLoop = 1000, @strResult = ''
SELECT @dtStart = GetDate()
WHILE @intLoop > 0
BEGIN
SELECT @strResult = ''
SELECT @strResult = @strResult + [CODES]
FROM
(
SELECT [ID],
[CODES] = dbo.TEMP_ConcatSample_V2([ID])
FROM dbo.TEMP_SAMPLE
-- WHERE [ID] = 290780
) AS X
ORDER BY [ID]
SELECT @intLoop = @intLoop - 1
END
SELECT [Result] = @strResult
SELECT 'Method 6b', [Elapsed] = DATEDIFF(Millisecond, @dtStart, GetDate())


SELECT @intLoop = 1000, @strResult = ''
SELECT @dtStart = GetDate()
WHILE @intLoop > 0
BEGIN
SELECT @strResult = ''
SELECT @strResult = @strResult + [CODES]
FROM
(
SELECT [ID],
[CODES] = dbo.TEMP_ConcatSample_V3([ID])
FROM dbo.TEMP_SAMPLE
WHERE [ID] = 290780
) AS X
ORDER BY [ID]
SELECT @intLoop = @intLoop - 1
END
SELECT [Result] = @strResult
SELECT 'Method 7a', [Elapsed] = DATEDIFF(Millisecond, @dtStart, GetDate())

SELECT @intLoop = 1000, @strResult = ''
SELECT @dtStart = GetDate()
WHILE @intLoop > 0
BEGIN
SELECT @strResult = ''
SELECT @strResult = @strResult + [CODES]
FROM
(
SELECT [ID],
[CODES] = dbo.TEMP_ConcatSample_V3([ID])
FROM dbo.TEMP_SAMPLE
-- WHERE [ID] = 290780
) AS X
ORDER BY [ID]
SELECT @intLoop = @intLoop - 1
END
SELECT [Result] = @strResult
SELECT 'Method 7b', [Elapsed] = DATEDIFF(Millisecond, @dtStart, GetDate())


GO

DROP TABLE dbo.TEMP_SAMPLE
GO

DROP FUNCTION dbo.TEMP_ConcatSample_V1
GO
DROP FUNCTION dbo.TEMP_ConcatSample_V2
GO
DROP FUNCTION dbo.TEMP_ConcatSample_V3
GO

SET NOCOUNT OFF

Kristen
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-09-19 : 05:22:54
Cross references, just for completeness:

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=56058
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=53885 http://sqljunkies.com/WebLog/amachanic/archive/2004/11/10/5065.aspx?Pending=true

Kristen
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-09-19 : 05:26:35
It seems UDF and XML approach are equally fast.
@Variable string concatenation is the fastest, but can only do one ID at a time.



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-09-19 : 05:27:29
Kristen, which results do you get from all above methods if you increase the sample to a few hundreds?



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-09-19 : 05:29:21
"It seems UDF and XML approach are equally fast."

Good point, sorry

I was mislead by the superior speed of direct @strVariable.

Seems like a lot of overhead calling a function ...

Kristen
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-09-19 : 05:33:12
And what about version 4 of the function?
CREATE FUNCTION dbo.TEMP_ConcatSample_V4(@ID int)
RETURNS VARCHAR(8000)
AS
BEGIN
DECLARE @strOutput VARCHAR(8000)

SET @strOutput = ''

SELECT @strOutput = @strOutput + ',' + [Code]
FROM dbo.TEMP_SAMPLE
WHERE [ID] = @ID
ORDER BY [Code]

RETURN STUFF(@strOutput, 1, 1, '')
END



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-09-19 : 05:59:35
I am testing right now. I use same sample data but insert so that there are 1024 records in the table and change @intloop to 2 instead of 1000.



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-09-19 : 06:02:40
Test is the test data I get (1,024 records and @intloop = 2)
Method		Elapsed
-------- -------
Method 1 63
Method 2 546
Method 3 530
Method 4 16
Method 5a 6763
Method 5b 7596
Method 6a 6623
Method 6b 6733
Method 7a 5266
Method 7b 5860
Method 7c 5030



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-09-19 : 06:09:29
Test is the test data I get (1,024 records and @intloop = 20)
Method		Elapsed
--------- -------
Method 1 186
Method 2 5640
Method 3 5453
Method 4 123
Method 5a 60220
Method 5b 64530
Method 6a 49096
Method 6b 53436
Method 7a 43766
Method 7b 48500
Method 7c 43360



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-09-19 : 06:15:46
And now I test again with a CLUSTERED INDEX over (ID, Code).

This is the test data I get (1,024 records and @intloop = 2)
Method		Elapsed
--------- -------
Method 1 30
Method 2 546
Method 3 546
Method 4 16
Method 5a 3733
Method 5b 3656
Method 6a 2343
Method 6b 2406
Method 7a 1673
Method 7b 1843
Method 7c 1763

This is the test data I get (1,024 records and @intloop = 20)
Method		Elapsed
--------- -------
Method 1 143
Method 2 5406
Method 3 5233
Method 4 46
Method 5a 33173
Method 5b 35670
Method 6a 22860
Method 6b 24843
Method 7a 16970
Method 7b 18546
Method 7c 16673
XML approach did not gain much with CI, but is still faster!


E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-09-19 : 07:39:30
"RETURN STUFF(@strOutput, 1, 1, '')"

Interesting. Do you think that is that preferable to

RETURN SubString(@strOutput, 2, 8000)

??

Kristen
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-09-19 : 08:03:23
The very same
And SUBSTRING is probably faster too...


E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-10-25 : 10:00:18
For more "formatting" stuff in the outout...
-- Prepare sample data
DECLARE @Sample TABLE (ID INT, Name VARCHAR(9))

INSERT @Sample
SELECT 3, 'Car' UNION ALL
SELECT 1, 'Lion' UNION ALL
SELECT 1, 'Rat' UNION ALL
SELECT 1, 'Mouse' UNION ALL
SELECT 2, 'Apple' UNION ALL
SELECT 2, 'Orange'

-- Show the expected output
SELECT DISTINCT s1.ID,
STUFF(
(
SELECT CASE
WHEN Item = 1 AND Items > 1 THEN ' and '
ELSE ', '
END + s2.Name
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY s2.ID ORDER BY s2.Name DESC) AS Item,
COUNT(*) OVER (PARTITION BY s2.ID) AS Items,
s2.Name
FROM @Sample AS s2
WHERE s2.ID = s1.ID
) AS s2
ORDER BY Item DESC
FOR XML PATH('')
), 1, 2, '') AS Items
FROM @Sample AS s1
ORDER BY s1.ID



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page
    Next Page

- Advertisement -