SQL Server Forums
Profile | Register | Active Topics | Members | Search | Forum FAQ
 
Register Now and get your question answered!
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 Old Forums
 CLOSED - General SQL Server
 Row Concatination
 Forum Locked
 Printer Friendly
Author Previous Topic Topic Next Topic  

AskSQLTeam
Ask SQLTeam Question

USA
0 Posts

Posted - 08/05/2002 :  09:30:56  Show Profile  Visit AskSQLTeam's Homepage
Mark writes "I have a 77,000 record table from a legacy (COOL:Gen) that has following structure

RecID RecSeq RecTxt
10 1 A
10 2 A
11 1 A
11 2 B
11 3 C

I need to concatenate it to this

10 AA
11 ABC

My current script based on a Read Only Cursor executing a stored procedure takes 20 min. (A entirely cursor-based script takes 55 min.)

I would like to do something similar to BULK INSERT like this...

EXEC ProcName (@v1, @v2) SELECT v1, v2 FROM Table WHERE Criteria ORDER BY RecSeq

(where the sp checks is rec exists and either inserts or updates RecTxt + @v2)

...but obviously it will not work. Any other idea how I may be able to improve the cursor approach?"

Page47
Flowing Fount of Yak Knowledge

USA
2878 Posts

Posted - 08/05/2002 :  10:00:19  Show Profile
I've posted this method so many times, I think it should be henceforth be referred to as 'The Page47 Method'.... ... admittedly, I think I got if from NR or somebody, so I guess I can't really claim it as my own....


select
RecID,
RecTxt,
space(8000) as RecList
into
#workingtable
from
mark
order by
RecID,
RecSeq

declare
@lastRecID int,
@list varchar(8000)

select
@lastRecID = -1,
@list = ''

update
#workingtable
set
@list = list = case
when @lastRecID <> RecID then RecTxt
else @list + RecTxt
end,
@lastRecID = RecID

select
RecID,
max(RecList)
from
#workingtable

 


Jay White
{0}
Go to Top of Page

Arnold Fribble
Yak-finder General

United Kingdom
1961 Posts

Posted - 08/05/2002 :  11:21:19  Show Profile
So why does this work?
I mean, if there's no TOP and no IDENTITY function, what is an ORDER BY on a SELECT...INTO supposed to be doing?
And how is the order that the UPDATE processes the rows supposed to be the same?
Is there some special logic in SQL Server that spots this construct?


Go to Top of Page

Page47
Flowing Fount of Yak Knowledge

USA
2878 Posts

Posted - 08/05/2002 :  11:28:53  Show Profile
quote:
So why does this work?


I think it works because SQL Server isn't a RDBMS, but rather a SQLDBMS, as Fabian Pascal would say. Here's my best guess.... By ordering the SELECT INTO, the #workingtable records are written to disk (buffer) in an ordered way and the update statement processes them in that order. Now if SQL Server where relational, a) there would be no concept of physical data order on the disk (buffer) and b) the update would not process rows in a particular order. However, that doesn't seem to be how its implemented. I've never been able to not make this work in SQL Server.

With my luck, the next Service Pack fix this ....

Jay White
{0}
Go to Top of Page

robvolk
Most Valuable Yak

USA
15668 Posts

Posted - 08/05/2002 :  11:51:16  Show Profile  Visit robvolk's Homepage
Jay's right. When you think about it, internally, SQL Server would have to DELIBERATELY put them out of order during the INSERT operation if they did not match the ORDER BY clause. That doesn't make a whole lot of sense IMHO.
quote:
Now if SQL Server where relational, a) there would be no concept of physical data order on the disk (buffer)
Ummmm, that's a pretty important "concept" to have nailed down if you ever want to have a WORKING database storing and retrieving real data!

Maybe I should clarify that: The physical order is a byproduct of its physical storage. While its physical order is not important, data that does not "physically" exist is mighty useless.
quote:
and b) the update would not process rows in a particular order.
I can't see how a "truly" relational database would be better if it deliberately stored data out of order, or accessed sequential data pages out of order.
quote:
However, that doesn't seem to be how its implemented.
And might explain why D hasn't been implemented AT ALL.

Go to Top of Page

Arnold Fribble
Yak-finder General

United Kingdom
1961 Posts

Posted - 08/05/2002 :  12:39:25  Show Profile
So does the fact that we're using an atomic INSERT INTO mean that the all the necessary extents for #worktable can be assigned before it is populated with rows? Otherwise, any fragmentation would result in the IAM pages having the extents listed in something other than the intended sort order. And since the IAM pages are the only things referencing all the extents in a heap, the update wouldn't work.


Go to Top of Page

robvolk
Most Valuable Yak

USA
15668 Posts

Posted - 08/05/2002 :  13:30:32  Show Profile  Visit robvolk's Homepage
quote:
So does the fact that we're using an atomic INSERT INTO mean that the all the necessary extents for #worktable can be assigned before it is populated with rows?
Books Online suggests that this indeed is the case:

quote:
Extents are the basic unit in which space is allocated to tables and indexes. An extent is 8 contiguous pages, or 64 KB. This means SQL Server 2000 databases have 16 extents per megabyte.

To make its space allocation efficient, SQL Server 2000 does not allocate entire extents to tables with small amounts of data. SQL Server 2000 has two types of extents:

-Uniform extents are owned by a single object; all eight pages in the extent can only be used by the owning object.

-Mixed extents are shared by up to eight objects.

A new table or index is usually allocated pages from mixed extents. When the table or index grows to the point that it has eight pages, it is switched to uniform extents. If you create an index on an existing table that has enough rows to generate eight pages in the index, all allocations to the index are in uniform extents.
I also remember reading somewhere that a table can only have one mixed extent; it's implied in the above but there was something else that stated it clearly. The worst that would happen is that the first extent of the temp table would be a mixed extent, all other after would be uniform and most likely contiguous.

If you look at tempdb, it makes very little sense to allocate non-contiguous extents unless you're at the limit and all other contiguous extents are in use, and that situation would ONLY occur after a great deal of gymnastics within tempdb. You'd have to see a lot of temp tables created and only certain ones dropped. Sure, this happens a lot anyway, but its so random in nature that it's hard to believe SQL Server would try to squeeze a new table into this available patchwork instead of creating a bunch of new extents.

Go to Top of Page

Arnold Fribble
Yak-finder General

United Kingdom
1961 Posts

Posted - 08/05/2002 :  14:17:18  Show Profile
What I meant was, if it allocates the extents it needs before populating, it doesn't matter if the table has fragmentation because the row inserts will presumably just fill up the extents in IAM order.

I suppose if you're really paranoid, you could instrument the update with some extra counters to check all transitions observered conform to the desired order and complain if they're not. It would only be a small extra scalar cost -- though it could be a bit tiresome if you've got a load of columns in the ORDER BY.


Edited by - Arnold Fribble on 08/05/2002 14:21:42
Go to Top of Page

Wanderer
Flowing Fount of Yak Knowledge

United Kingdom
1167 Posts

Posted - 05/14/2004 :  10:15:08  Show Profile
Hmmm,

I don't know why it is I read these, and everyone else seems to understand, while I go "clear as mud". Am I right in thinking that somehow this works on the unproven, yet always experienced, basis that select from a table that was inserted in order, results in a select being in order ?

If no, then see my "clear as mud" confusion :-)

If yes, the how does the "merry-go-around-scan" potentially impact this ? see http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=35068

From Muddy :-)

*#&#* *#&#* *#&#* *#&#*

Chaos, Disorder and Panic ... my work is done here!
Go to Top of Page

Page47
Flowing Fount of Yak Knowledge

USA
2878 Posts

Posted - 05/14/2004 :  11:32:43  Show Profile
The select has an ORDER BY ... it's two different discussions ...

Jay White
{0}
Go to Top of Page

Arnold Fribble
Yak-finder General

United Kingdom
1961 Posts

Posted - 05/14/2004 :  12:55:54  Show Profile
As far as I remember my reasoning was thus:

When you do an UPDATE, you don't get to say what order anything is done in. Usually that doesn't matter, but since this UPDATE is both using and changing a variable, it definitely makes a difference what order those rows are processed in.

I would expect the rows to be processed in the easiest order for the database. If SQL Server says, "Hey! I know what you're trying to do here," and puts an Ordered Forward on the UPDATE's index scan then you're ok. (assuming you wanted the things updated in clustered index order!)

If there's no clustered index, or it doesn't use Ordered Forward, then you're relying on the extents that contain pages for the table being in the desired order. I was expressing some dubiousness that this must be the case. Or something.
Go to Top of Page

Wanderer
Flowing Fount of Yak Knowledge

United Kingdom
1167 Posts

Posted - 05/17/2004 :  03:35:08  Show Profile
Hi Jay,

As Arnold noted, I was thinking along the lines of the Update, and not the select. thinking it through though, since the update will take a exclusive lock, I figure that it cannot "piggy-back" any other users table scan, since it requires no-one else to have locks. Hence I figure that Updates cannot do "merry-go-round scans".

I still would be interested to know for a fact why this works/ if it is guaranteed that it will always work ...

Cheers

*#&#* *#&#* *#&#* *#&#*

Chaos, Disorder and Panic ... my work is done here!
Go to Top of Page

byrmol
Shed Building SQL Farmer

Australia
1591 Posts

Posted - 05/17/2004 :  19:24:21  Show Profile
quote:

Now if SQL Server where relational, a) there would be no concept of physical data order on the disk (buffer)



I don't think that's right. RM "says" that order is irrelevant because of the identity principle. It does not care about the physical storage.. CSV, Excel, R-Tree, B-Tree, ordered, unordered.. what ever you like...

quote:

and b) the update would not process rows in a particular order.



True, but we must remember that a UPDATE is simply short hand for a DELETE (MINUS) then an INSERT. (UNION).. The most common distinct alogrithm uses a sort operation and I think it is because SQL (not just SQL Server) allows duplicates that this sort operation is so heavily relied on... Look at projection to see how SQL butchers it and that you must supply the DISTINCT operator to achieve something that looks like a relation. It does raise an interesting question though... If I receive a relation that is unordered, then apply the non-relational operator ORDER BY, is it still a relation after that operation?

quote:

And might explain why D hasn't been implemented AT ALL



It has Rob... The Alphora product uses it...Very well I might add....

DavidM

"Always pre-heat the oven"
Go to Top of Page

todhilton
Starting Member

USA
2 Posts

Posted - 05/12/2005 :  11:08:10  Show Profile  Visit todhilton's Homepage
I'm cross-posting this reply with this topic because my original searches resulted in both posts.

I realize this post is more than a year old, but I just recently had to do this and ended up using the temporary table method posted by Page47. It worked, but was clunky. A few days ago a co-worker showed me a really quick and easy way of doing this in SQL 2000:

declare @list varchar(8000)
select @list = Isnull(@list,'') +';'+ theColumnToBeConcatenated
from yourtable
where yoursearchparameter='something'
select @list

HTH!

~tod

Edited by - todhilton on 05/12/2005 11:19:37
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 05/12/2005 :  11:21:21  Show Profile
Try 3 years this August.....and the actual original question was concat for each key

http://weblogs.sqlteam.com/brettk/archive/2005/02/10/4153.aspx




Brett

8-)
Go to Top of Page

tom_bapps
Starting Member

1 Posts

Posted - 08/03/2005 :  06:39:27  Show Profile
quote:
Originally posted by Page47

I've posted this method so many times, I think it should be henceforth be referred to as 'The Page47 Method'.... ... admittedly, I think I got if from NR or somebody, so I guess I can't really claim it as my own....


select
    RecID,
    RecTxt,
    space(8000) as RecList
into
    #workingtable
from
    mark
order by
    RecID,
    RecSeq

declare
    @lastRecID int,
    @list varchar(8000)

select
    @lastRecID = -1,
    @list = ''

update
    #workingtable
set
    @list = list = case
                       when @lastRecID <> RecID then RecTxt
                       else @list + RecTxt
                   end,
    @lastRecID = RecID

select
    RecID,
    max(RecList)
from
    #workingtable

 
I have created a table called mark as follows I was just trying to use the following above query you had posted,but it throws the error "Invalid column name 'list'."

I have created a table mark as follows

create table mark (RecID int,RecSeq int,RecTxt varchar(3))

What would be the error,Please advice as i am in urgent need of this functionality

Thanking you in advance,
Thomas Alexander

Jay White



Jay White
{0}



If you can you can
if you can't then you are right
Go to Top of Page
  Previous Topic Topic Next Topic  
 Forum Locked
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.12 seconds. Powered By: Snitz Forums 2000