Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Cursor troubles

Author  Topic 

tommynz
Starting Member

9 Posts

Posted - 2004-08-16 : 06:21:42
Hi

I am having trouble with a cursor operation. If someone could help out or suggest a way to not use a cursor I would be very grateful.

I have a table containing 144 million records (SQL Server 7). I have added 5 new columns to this table. The new columns must be populated with data contained in an existing column. This is a one off operation and will not be performed again.

The current code I have written (below) uses a cursor and works fine up until about 10 million records. It starts off doing 1 million records every 2 minutes up until 10 million. The next 2 million take 20 minutes each, and the next million about 40 minutes. It then just seems to stop processing (after 13 million records). There are no errors and the query does not stop running(I am running it from Query Analyzer), there just does not seem to be any more progress (am using print after each million so I know where it is up to). Speed is not a major issue - I just need it to run right through.

The code below uses a stored procedure (sp_DecodeIncaData) that takes a string value (the data from the existing column) and returns the 5 values to be placed in the 5 new columns.

I also dump the transaction log after every million updates to avoid this growing obcenely large.

Many thanks
Tommy



SET NOCOUNT ON

DECLARE @incadata varchar(4000),
@ShoulderNo varchar(255),
@IncidentNo int,
@Callsign varchar(7),
@OCUID int,
@WarrantNo varchar(8),
@Counter int

DECLARE CoAVLSLog_Cursor CURSOR FOR
SELECT incadata
FROM CoAvlsLog

OPEN CoAVLSLog_Cursor

FETCH NEXT FROM CoAVLSLog_Cursor
INTO @incadata

SET @Counter = 1

WHILE @@FETCH_STATUS = 0
BEGIN
EXECUTE sp_DecodeIncaData @IncaData,
@ShoulderNo OUTPUT,
@IncidentNo OUTPUT,
@Callsign OUTPUT,
@OCUID OUTPUT,
@WarrantNo OUTPUT

UPDATE CoAVLSLog SET [Shoulder No] = @ShoulderNo,
Incident = @IncidentNo,
Callsign = @Callsign,
OCUID = @OCUID,
[Warrant No] = @WarrantNo
WHERE CURRENT OF CoAVLSLog_Cursor

SET @Counter = @Counter + 1
IF @Counter % 1000000 = 0
BEGIN
DUMP TRANSACTION IMDNLog WITH NO_LOG
PRINT CONVERT(VARCHAR(255), @Counter) + ' records processed - dumping transaction log'
END

FETCH NEXT FROM CoAVLSLog_Cursor
INTO @incadata
END

CLOSE CoAVLSLog_Cursor
DEALLOCATE CoAVLSLog_Cursor

PRINT @Counter

SET NOCOUNT OFF

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2004-08-16 : 07:58:03
it looks to me you simply want to move data from one col to another in the same row.
and why don't you do simply:

UPDATE CoAVLSLog
SET [Shoulder No] = OldColumn1
Incident = OldColumn2,
Callsign = OldColumn3,
OCUID = OldColumn4

or am i missing something here???

Go with the flow & have fun! Else fight the flow :)
Go to Top of Page

tommynz
Starting Member

9 Posts

Posted - 2004-08-16 : 08:11:02
There is only one old column. It is a "raw" dump of the data I now want to use populate the 5 new columns.

The old column (IncaData) data looks like 'VVVV|WWWW|XXXX|YYYY|ZZZZ'

where
VVVV is the data I want to put in the new ShoulderNo column
WWWW is the data I want to put in the new IncidentNo column
XXXX is the data I want to put in the new Callsign column
YYYY is the data I want to put in the new OCUID column
ZZZZ is the data I want to put in the new WarrantNo column

As I mentioned in my original post, the stored procedure sp_DecodeIncaData (used inside the cursor) takes the string from the IncaData column and returns the 5 new strings that will be placed in the new columns.

Go to Top of Page

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2004-08-16 : 08:52:25
Oh i know! Try this and see if it works for ya:


Declare @myTable table (myCol varchar(50))

Insert Into @myTable Values ('AAA1|BBB1|CCCC1|DDD1|EEEE1')
Insert Into @myTable Values ('AAA2|BBB2|CCCC2|DDD2|EEEE2')
Insert Into @myTable Values ('AAA3|BBB3|CCCC3|DDD3|EEEE3')
Insert Into @myTable Values ('AAA4|BBB4|CCCC4|DDD4|EEEE4')
Insert Into @myTable Values ('AAA5|BBB5|CCCC5|DDD5|EEEE5')
Insert Into @myTable Values ('AAA6|BBB6|CCCC6|DDD6|EEEE6')
Insert Into @myTable Values ('AAA7|BBB7|CCCC7|DDD7|EEEE7')
Insert Into @myTable Values ('AAA8|BBB8|CCCC8|DDD8|EEEE8')

Select Col1, Col2, COl3, Col4 = left(myCol,charindex('|',myCol)-1), col5 = Right(myCol,len(myCol)-charindex('|',myCol))
From
(
Select Col1, Col2, Col3 = left(myCol,charindex('|',myCol)-1), myCol = Right(myCol,len(myCol)-charindex('|',myCol))
From
(
Select Col1, Col2 = left(myCol,charindex('|',myCol)-1), myCol = Right(myCol,len(myCol)-charindex('|',myCol))
From
(
Select Col1 = left(myCol,charindex('|',myCol)-1), myCol = Right(myCol,len(myCol)-charindex('|',myCol)) From @myTable
) as A
) as B
) as C


Corey
Go to Top of Page

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2004-08-16 : 09:06:20
Cursors are typically very bad for very large rowsets. The solution to your problem will most likely be: Don't use a cursor.

The query I have supplied you is an example of how to replace your cursor with a set based method, that should not only be significantly faster, it most likely will finish the entire set.

Good luck!

Corey
Go to Top of Page

samsekar
Constraint Violating Yak Guru

437 Posts

Posted - 2004-08-16 : 09:22:47
USE BCP to do this.. BCP is much faster as it is non logged transaction.. Create a sperate table and bcp in the whole data then rename it.. I think it would be a better way. All the best.

HTH.

Create table tempdb.dbo.myTable (myCol varchar(50))

Insert Into tempdb.dbo.myTable Values ('AAA1|BBB1|CCCC1|DDD1|EEEE1')
Insert Into tempdb.dbo.myTable Values ('AAA2|BBB2|CCCC2|DDD2|EEEE2')
Insert Into tempdb.dbo.myTable Values ('AAA3|BBB3|CCCC3|DDD3|EEEE3')
Insert Into tempdb.dbo.myTable Values ('AAA4|BBB4|CCCC4|DDD4|EEEE4')
Insert Into tempdb.dbo.myTable Values ('AAA5|BBB5|CCCC5|DDD5|EEEE5')
Insert Into tempdb.dbo.myTable Values ('AAA6|BBB6|CCCC6|DDD6|EEEE6')
Insert Into tempdb.dbo.myTable Values ('AAA7|BBB7|CCCC7|DDD7|EEEE7')
Insert Into tempdb.dbo.myTable Values ('AAA8|BBB8|CCCC8|DDD8|EEEE8')

Select * from tempdb.dbo.myTable

drop table tempdb.dbo.test
create table tempdb.dbo.test
(ShoulderNo varchar(10), IncidentNo varchar(10),
Callsign varchar(10), OCUID varchar(10), WarrantNo varchar(10) )

master.dbo.xp_cmdshell 'bcp "Select myCol from tempdb.dbo.myTable" queryout C:\temp\temp.txt -T -S -c'
master.dbo.xp_cmdshell 'type c:\temp\temp.txt'

BULK INSERT tempdb.dbo.test FROM 'c:\temp\temp.txt'
WITH (DATAFILETYPE = 'char', FIELDTERMINATOR = '|', ROWTERMINATOR = '\n')

- Sekar
Go to Top of Page

tommynz
Starting Member

9 Posts

Posted - 2004-08-16 : 09:23:22
Thank you Seventhnight.

I can see how this works as a select statement, but can you give me a few pointers on how to convert the select statement in your example to an update statement.

ie the table structure would be

Declare @myTable table (myCol varchar(50), NewCol1 varchar(10), NewCol2 varchar(10), NewCol3 varchar(10), NewCol4 varchar(10), NewCol5 varchar(10))

and the data in mycol would be used to populate the 5 new columns.

Many thanks
Michael
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2004-08-16 : 09:41:49
quote:

The old column (IncaData) data looks like 'VVVV|WWWW|XXXX|YYYY|ZZZZ'

where
VVVV is the data I want to put in the new ShoulderNo column
WWWW is the data I want to put in the new IncidentNo column
XXXX is the data I want to put in the new Callsign column
YYYY is the data I want to put in the new OCUID column
ZZZZ is the data I want to put in the new WarrantNo column



if that format is a hard and fast rule (there will always be 4 characters per letter), then the update statement is easy:


UDPDATE YourTAble
SET ShoulderNo = Left(IncaData,4),
IncidentNo = Substring(IncaData,6,4),
Callsign = Substring(IncaData,11,4),
OCUID = SubString(IncaData,16,4),
WarrantNo = SubString(IncaData,21,4)


should be easy. Do you see why this would work? are you familiar with how SUBSTRING() and LEFT() work? Read up on books on-line if you need more help; these are basic T-SQL functions you should be familiar with.

PLEASE don't use BCP or a cursor for this! this is a classic example of what update statements are designed for. I would expect an UPDATE statement would run about 100 times faster than using a cursor; maybe more.

- Jeff
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2004-08-16 : 09:45:40
also: if the size of the update causes problems, run it in batches like this:



SET ROWCOUNT 100000

WHILE EXISTS (Select * from YOurTable where ShoulderNo is null)
UDPDATE YourTAble
SET ShoulderNo = Left(IncaData,4),
IncidentNo = Substring(IncaData,6,4),
Callsign = Substring(IncaData,11,4),
OCUID = SubString(IncaData,16,4),
WarrantNo = SubString(IncaData,21,4)
WHERE
ShoulderNo is null

SET ROWCOUNT 0


set the rowcount to a manageable size .. this will allow you to update a few hundred thousand rows at a time if the 144 million rowcount causes a problem (which it might -- that's a LOT of data)

- Jeff
Go to Top of Page

tommynz
Starting Member

9 Posts

Posted - 2004-08-16 : 09:47:23
Thanks Jeff but the data is not always the same length.

That was only an example and I did not make that clear, sorry.

Seventhnight has provided a SELECT statement that looks promising, but I do not know how to go about changing it to an UPDATE statement.

Thanks
Michael

Go to Top of Page

tommynz
Starting Member

9 Posts

Posted - 2004-08-16 : 09:56:48
Jeff,

If I was to use

SET ROWCOUNT 100000

and process only 100000 records, how do I then continue processing from that point forward?

Thanks
Michael
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2004-08-16 : 10:08:53
quote:
... and process only 100000 records, how do I then continue processing from that point forward?



notice the WHILE statement before the update -- that causes a loop. it will loop until no rows are left to update.

To handle the case where the position of the pipe | characters changes, you can do it like this:



-- this is a sample table w/ sample data:
Create table myTable (myCol varchar(50), Col1 varchar(10), Col2 varchar(10),
col3 varchar(10), col4 varchar(10), col5 varchar(10))

go

Insert Into myTable Values ('AA1|BBB1|CCCC1|DDD1|EEEE1', Null,Null,Null,Null,Null)
Insert Into myTable Values ('AAA2|BBB2|CC2|DDD2|EEEE2', Null,Null,Null,Null,Null)
Insert Into myTable Values ('AAAAA3|BBB3|CCCC3|DDD3|EEEE3', Null,Null,Null,Null,Null)
Insert Into myTable Values ('AAA4|BBB4|CCCC4|DD4|EEEE4', Null,Null,Null,Null,Null)
Insert Into myTable Values ('A5|BBB5|CCCC5|DDD5|E5', Null,Null,Null,Null,Null)
Insert Into myTable Values ('AAA6|BBB6|CCCC6|DDD6|EEEE6', Null,Null,Null,Null,Null)
Insert Into myTable Values ('AAA7|BBB7|CCCCC7|DDD7|EEEE7', Null,Null,Null,Null,Null)
Insert Into myTable Values ('AA8|BBBBBBB8|CCCC8|DDD8|EEEE8', Null,Null,Null,Null,Null)

-- you will need to declare these variables:

declare @i1 int
declare @i2 int
declare @i3 int
declare @i4 int

-- and here is your update statement:

update myTable set
@i1 = charindex('|', myCol),
col1 = left(myCol, @i1-1),
@i2 = charindex('|',myCol,@i1+1),
col2 = substring(myCol, @i1+1, @i2-@i1-1),
@i3 = charindex('|',myCol, @i2+1),
col3 = substring(myCol, @i2+1, @i3-@i2-1),
@i4 = charindex('|',myCol, @i3+1),
col4 = substring(myCol, @i3+1, @i4-@i3-1),
col5 = substring(myCol, @i4+1, 50)

select * from myTable


it is easy to do this operation in an update as opposed to a SELECT, because in an update you can set variables along the way which makes it pretty easy. In a SELECT, as corey demonstrates, the easiest way is to use nested subqueries.

anyway, run the sample, make sure it makes sense, and try it out.

- Jeff
Go to Top of Page

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2004-08-16 : 10:09:24
The while exists keeps it going:


this is my example expanded:


Declare @myTable table (id int identity(1,1) Not NUll, myCol varchar(50), Col1 varchar(10), Col2 varchar(10), Col3 varchar(10), Col4 varchar(10), Col5 varchar(10))

Insert Into @myTable Values ('AAA1|BBB1|CCCC1|DDD1|EEEE1',null,null,null,null,null)
Insert Into @myTable Values ('AAA2|BBB2|CCCC2|DDD2|EEEE2',null,null,null,null,null)
Insert Into @myTable Values ('AAA3|BBB3|CCCC3|DDD3|EEEE3',null,null,null,null,null)
Insert Into @myTable Values ('AAA4|BBB4|CCCC4|DDD4|EEEE4',null,null,null,null,null)
Insert Into @myTable Values ('AAA5|BBB5|CCCC5|DDD5|EEEE5',null,null,null,null,null)
Insert Into @myTable Values ('AAA6|BBB6|CCCC6|DDD6|EEEE6',null,null,null,null,null)
Insert Into @myTable Values ('AAA7|BBB7|CCCC7|DDD7|EEEE7',null,null,null,null,null)
Insert Into @myTable Values ('AAA8|BBB8|CCCC8|DDD8|EEEE8',null,null,null,null,null)


Set RowCount 1

While Exists(Select 1 From @myTable Where Col1 is null)
Begin
Update Z
Set
Col1 = Y.Col1,
Col2 = Y.Col2,
Col3 = Y.Col3,
Col4 = Y.Col4,
Col5 = Y.Col5
From @myTable as Z
Inner Join
(
Select id, Col1, Col2, COl3, Col4 = left(myCol,charindex('|',myCol)-1), col5 = Right(myCol,len(myCol)-charindex('|',myCol))
From
(
Select id, Col1, Col2, Col3 = left(myCol,charindex('|',myCol)-1), myCol = Right(myCol,len(myCol)-charindex('|',myCol))
From
(
Select id, Col1, Col2 = left(myCol,charindex('|',myCol)-1), myCol = Right(myCol,len(myCol)-charindex('|',myCol))
From
(
Select id, Col1 = left(myCol,charindex('|',myCol)-1), myCol = Right(myCol,len(myCol)-charindex('|',myCol)) From @myTable
) as A
) as B
) as C
) as Y
On Z.id = Y.id
Where Z.Col1 is null
End
Set RowCount 0

Select * From @myTable



You may not have an id field, so use whatever is your primary key in place of id.


Corey
Go to Top of Page

samsekar
Constraint Violating Yak Guru

437 Posts

Posted - 2004-08-16 : 10:10:12
quote:
Originally posted by jsmith8858
PLEASE don't use BCP or a cursor for this! this is a classic example of what update statements are designed for. I would expect an UPDATE statement would run about 100 times faster than using a cursor; maybe more.

- Jeff



Hi Jeff, AFAIK BCP is faster for doing large operations! Can you please be so kind to explain why you suggest update instead of BCP in this case.

- Sekar
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2004-08-16 : 10:10:41
don't do it by setting the rowcount to 1 and looping -- update a good number of rows at a time. if you set the rowcount to 1 and loop, then you might as well use a cursor.

- Jeff
Go to Top of Page

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2004-08-16 : 10:12:05
Now that is just freaking cool... i did not know you could set variables like that in an update. *sigh*

quote:
Originally posted by jsmith8858

quote:
... and process only 100000 records, how do I then continue processing from that point forward?



notice the WHILE statement before the update -- that causes a loop. it will loop until no rows are left to update.

To handle the case where the position of the pipe | characters changes, you can do it like this:



-- this is a sample table w/ sample data:
Create table myTable (myCol varchar(50), Col1 varchar(10), Col2 varchar(10),
col3 varchar(10), col4 varchar(10), col5 varchar(10))

go

Insert Into myTable Values ('AA1|BBB1|CCCC1|DDD1|EEEE1', Null,Null,Null,Null,Null)
Insert Into myTable Values ('AAA2|BBB2|CC2|DDD2|EEEE2', Null,Null,Null,Null,Null)
Insert Into myTable Values ('AAAAA3|BBB3|CCCC3|DDD3|EEEE3', Null,Null,Null,Null,Null)
Insert Into myTable Values ('AAA4|BBB4|CCCC4|DD4|EEEE4', Null,Null,Null,Null,Null)
Insert Into myTable Values ('A5|BBB5|CCCC5|DDD5|E5', Null,Null,Null,Null,Null)
Insert Into myTable Values ('AAA6|BBB6|CCCC6|DDD6|EEEE6', Null,Null,Null,Null,Null)
Insert Into myTable Values ('AAA7|BBB7|CCCCC7|DDD7|EEEE7', Null,Null,Null,Null,Null)
Insert Into myTable Values ('AA8|BBBBBBB8|CCCC8|DDD8|EEEE8', Null,Null,Null,Null,Null)

-- you will need to declare these variables:

declare @i1 int
declare @i2 int
declare @i3 int
declare @i4 int

-- and here is your update statement:

update myTable set
@i1 = charindex('|', myCol),
col1 = left(myCol, @i1-1),
@i2 = charindex('|',myCol,@i1+1),
col2 = substring(myCol, @i1+1, @i2-@i1-1),
@i3 = charindex('|',myCol, @i2+1),
col3 = substring(myCol, @i2+1, @i3-@i2-1),
@i4 = charindex('|',myCol, @i3+1),
col4 = substring(myCol, @i3+1, @i4-@i3-1),
col5 = substring(myCol, @i4+1, 50)

select * from myTable


it is easy to do this operation in an update as opposed to a SELECT, because in an update you can set variables along the way which makes it pretty easy. In a SELECT, as corey demonstrates, the easiest way is to use nested subqueries.

anyway, run the sample, make sure it makes sense, and try it out.

- Jeff



Corey
Go to Top of Page

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2004-08-16 : 10:13:52
I was only showing that it would loop through his set, but my set was only 8... I thought it was clear enough from earlier conversation that Rowcount should be 100000 or some such...

quote:
Originally posted by jsmith8858

don't do it by setting the rowcount to 1 and looping -- update a good number of rows at a time. if you set the rowcount to 1 and loop, then you might as well use a cursor.

- Jeff



Corey
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2004-08-16 : 10:14:26
quote:
Originally posted by samsekar

quote:
Originally posted by jsmith8858
PLEASE don't use BCP or a cursor for this! this is a classic example of what update statements are designed for. I would expect an UPDATE statement would run about 100 times faster than using a cursor; maybe more.

- Jeff



Hi Jeff, AFAIK BCP is faster for doing large operations! Can you please be so kind to explain why you suggest update instead of BCP in this case.

- Sekar



sure!

the goal is he has table X which in which he wants to update 5 columns. he doesn't want to create a new table, he wants to update the data in 5 columns.

option 1: update table X directly

option 2: BCP out the table with the new columns generated, then BCP the table into the database into a temp table, and then update the table X joining to the temp table.

Option 2 includes the same step that exists in option 1, plus the added steps of exporting and then importing 144 million rows. Or am I missing something?

- Jeff
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2004-08-16 : 10:16:21
it is pretty cool -- one the many things i learned here at SQLTeam. I wish you could do it in a standard SELECT, that would be REALLY cool. you can only do it in an UPDATE statement.

- Jeff
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2004-08-16 : 10:17:36
quote:
Originally posted by Seventhnight

I was only showing that it would loop through his set, but my set was only 8... I thought it was clear enough from earlier conversation that Rowcount should be 100000 or some such...

quote:
Originally posted by jsmith8858

don't do it by setting the rowcount to 1 and looping -- update a good number of rows at a time. if you set the rowcount to 1 and loop, then you might as well use a cursor.

- Jeff



Corey



just making sure !

- Jeff
Go to Top of Page

tommynz
Starting Member

9 Posts

Posted - 2004-08-16 : 10:21:54
Wow, replys coming thick and fast. Having trouble trying to keep up.

Thanks a lot for all your help guys.

I have tried Jeffs update statement on a small set of data and it looks good.

I will try it on the real table and let you know how it goes.

Again thank you very much

Regards
Michael

Go to Top of Page
    Next Page

- Advertisement -