Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
tommynz
Starting Member
9 Posts |
Posted - 2004-08-16 : 06:21:42
|
| HiI am having trouble with a cursor operation. If someone could help out or suggest a way to not use a cursor I would be very grateful.I have a table containing 144 million records (SQL Server 7). I have added 5 new columns to this table. The new columns must be populated with data contained in an existing column. This is a one off operation and will not be performed again. The current code I have written (below) uses a cursor and works fine up until about 10 million records. It starts off doing 1 million records every 2 minutes up until 10 million. The next 2 million take 20 minutes each, and the next million about 40 minutes. It then just seems to stop processing (after 13 million records). There are no errors and the query does not stop running(I am running it from Query Analyzer), there just does not seem to be any more progress (am using print after each million so I know where it is up to). Speed is not a major issue - I just need it to run right through. The code below uses a stored procedure (sp_DecodeIncaData) that takes a string value (the data from the existing column) and returns the 5 values to be placed in the 5 new columns. I also dump the transaction log after every million updates to avoid this growing obcenely large.Many thanksTommySET NOCOUNT ONDECLARE @incadata varchar(4000), @ShoulderNo varchar(255), @IncidentNo int, @Callsign varchar(7), @OCUID int, @WarrantNo varchar(8), @Counter intDECLARE CoAVLSLog_Cursor CURSOR FORSELECT incadataFROM CoAvlsLogOPEN CoAVLSLog_Cursor FETCH NEXT FROM CoAVLSLog_Cursor INTO @incadataSET @Counter = 1WHILE @@FETCH_STATUS = 0BEGIN EXECUTE sp_DecodeIncaData @IncaData, @ShoulderNo OUTPUT, @IncidentNo OUTPUT, @Callsign OUTPUT, @OCUID OUTPUT, @WarrantNo OUTPUT UPDATE CoAVLSLog SET [Shoulder No] = @ShoulderNo, Incident = @IncidentNo, Callsign = @Callsign, OCUID = @OCUID, [Warrant No] = @WarrantNo WHERE CURRENT OF CoAVLSLog_Cursor SET @Counter = @Counter + 1 IF @Counter % 1000000 = 0 BEGIN DUMP TRANSACTION IMDNLog WITH NO_LOG PRINT CONVERT(VARCHAR(255), @Counter) + ' records processed - dumping transaction log' END FETCH NEXT FROM CoAVLSLog_Cursor INTO @incadataENDCLOSE CoAVLSLog_CursorDEALLOCATE CoAVLSLog_CursorPRINT @CounterSET NOCOUNT OFF |
|
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2004-08-16 : 07:58:03
|
| it looks to me you simply want to move data from one col to another in the same row.and why don't you do simply:UPDATE CoAVLSLog SET [Shoulder No] = OldColumn1Incident = OldColumn2,Callsign = OldColumn3,OCUID = OldColumn4or am i missing something here???Go with the flow & have fun! Else fight the flow :) |
 |
|
|
tommynz
Starting Member
9 Posts |
Posted - 2004-08-16 : 08:11:02
|
| There is only one old column. It is a "raw" dump of the data I now want to use populate the 5 new columns.The old column (IncaData) data looks like 'VVVV|WWWW|XXXX|YYYY|ZZZZ'whereVVVV is the data I want to put in the new ShoulderNo columnWWWW is the data I want to put in the new IncidentNo columnXXXX is the data I want to put in the new Callsign columnYYYY is the data I want to put in the new OCUID columnZZZZ is the data I want to put in the new WarrantNo columnAs I mentioned in my original post, the stored procedure sp_DecodeIncaData (used inside the cursor) takes the string from the IncaData column and returns the 5 new strings that will be placed in the new columns. |
 |
|
|
Seventhnight
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2004-08-16 : 08:52:25
|
Oh i know! Try this and see if it works for ya:Declare @myTable table (myCol varchar(50))Insert Into @myTable Values ('AAA1|BBB1|CCCC1|DDD1|EEEE1')Insert Into @myTable Values ('AAA2|BBB2|CCCC2|DDD2|EEEE2')Insert Into @myTable Values ('AAA3|BBB3|CCCC3|DDD3|EEEE3')Insert Into @myTable Values ('AAA4|BBB4|CCCC4|DDD4|EEEE4')Insert Into @myTable Values ('AAA5|BBB5|CCCC5|DDD5|EEEE5')Insert Into @myTable Values ('AAA6|BBB6|CCCC6|DDD6|EEEE6')Insert Into @myTable Values ('AAA7|BBB7|CCCC7|DDD7|EEEE7')Insert Into @myTable Values ('AAA8|BBB8|CCCC8|DDD8|EEEE8')Select Col1, Col2, COl3, Col4 = left(myCol,charindex('|',myCol)-1), col5 = Right(myCol,len(myCol)-charindex('|',myCol)) From ( Select Col1, Col2, Col3 = left(myCol,charindex('|',myCol)-1), myCol = Right(myCol,len(myCol)-charindex('|',myCol)) From ( Select Col1, Col2 = left(myCol,charindex('|',myCol)-1), myCol = Right(myCol,len(myCol)-charindex('|',myCol)) From ( Select Col1 = left(myCol,charindex('|',myCol)-1), myCol = Right(myCol,len(myCol)-charindex('|',myCol)) From @myTable ) as A ) as B ) as CCorey |
 |
|
|
Seventhnight
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2004-08-16 : 09:06:20
|
| Cursors are typically very bad for very large rowsets. The solution to your problem will most likely be: Don't use a cursor.The query I have supplied you is an example of how to replace your cursor with a set based method, that should not only be significantly faster, it most likely will finish the entire set.Good luck!Corey |
 |
|
|
samsekar
Constraint Violating Yak Guru
437 Posts |
Posted - 2004-08-16 : 09:22:47
|
| USE BCP to do this.. BCP is much faster as it is non logged transaction.. Create a sperate table and bcp in the whole data then rename it.. I think it would be a better way. All the best.HTH.Create table tempdb.dbo.myTable (myCol varchar(50))Insert Into tempdb.dbo.myTable Values ('AAA1|BBB1|CCCC1|DDD1|EEEE1')Insert Into tempdb.dbo.myTable Values ('AAA2|BBB2|CCCC2|DDD2|EEEE2')Insert Into tempdb.dbo.myTable Values ('AAA3|BBB3|CCCC3|DDD3|EEEE3')Insert Into tempdb.dbo.myTable Values ('AAA4|BBB4|CCCC4|DDD4|EEEE4')Insert Into tempdb.dbo.myTable Values ('AAA5|BBB5|CCCC5|DDD5|EEEE5')Insert Into tempdb.dbo.myTable Values ('AAA6|BBB6|CCCC6|DDD6|EEEE6')Insert Into tempdb.dbo.myTable Values ('AAA7|BBB7|CCCC7|DDD7|EEEE7')Insert Into tempdb.dbo.myTable Values ('AAA8|BBB8|CCCC8|DDD8|EEEE8')Select * from tempdb.dbo.myTable drop table tempdb.dbo.testcreate table tempdb.dbo.test (ShoulderNo varchar(10), IncidentNo varchar(10), Callsign varchar(10), OCUID varchar(10), WarrantNo varchar(10) )master.dbo.xp_cmdshell 'bcp "Select myCol from tempdb.dbo.myTable" queryout C:\temp\temp.txt -T -S -c'master.dbo.xp_cmdshell 'type c:\temp\temp.txt'BULK INSERT tempdb.dbo.test FROM 'c:\temp\temp.txt' WITH (DATAFILETYPE = 'char', FIELDTERMINATOR = '|', ROWTERMINATOR = '\n')- Sekar |
 |
|
|
tommynz
Starting Member
9 Posts |
Posted - 2004-08-16 : 09:23:22
|
| Thank you Seventhnight. I can see how this works as a select statement, but can you give me a few pointers on how to convert the select statement in your example to an update statement.ie the table structure would beDeclare @myTable table (myCol varchar(50), NewCol1 varchar(10), NewCol2 varchar(10), NewCol3 varchar(10), NewCol4 varchar(10), NewCol5 varchar(10))and the data in mycol would be used to populate the 5 new columns.Many thanksMichael |
 |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2004-08-16 : 09:41:49
|
quote: The old column (IncaData) data looks like 'VVVV|WWWW|XXXX|YYYY|ZZZZ'whereVVVV is the data I want to put in the new ShoulderNo columnWWWW is the data I want to put in the new IncidentNo columnXXXX is the data I want to put in the new Callsign columnYYYY is the data I want to put in the new OCUID columnZZZZ is the data I want to put in the new WarrantNo column
if that format is a hard and fast rule (there will always be 4 characters per letter), then the update statement is easy:UDPDATE YourTAbleSET ShoulderNo = Left(IncaData,4), IncidentNo = Substring(IncaData,6,4), Callsign = Substring(IncaData,11,4), OCUID = SubString(IncaData,16,4), WarrantNo = SubString(IncaData,21,4) should be easy. Do you see why this would work? are you familiar with how SUBSTRING() and LEFT() work? Read up on books on-line if you need more help; these are basic T-SQL functions you should be familiar with.PLEASE don't use BCP or a cursor for this! this is a classic example of what update statements are designed for. I would expect an UPDATE statement would run about 100 times faster than using a cursor; maybe more.- Jeff |
 |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2004-08-16 : 09:45:40
|
also: if the size of the update causes problems, run it in batches like this:SET ROWCOUNT 100000WHILE EXISTS (Select * from YOurTable where ShoulderNo is null) UDPDATE YourTAble SET ShoulderNo = Left(IncaData,4), IncidentNo = Substring(IncaData,6,4), Callsign = Substring(IncaData,11,4), OCUID = SubString(IncaData,16,4), WarrantNo = SubString(IncaData,21,4) WHERE ShoulderNo is nullSET ROWCOUNT 0 set the rowcount to a manageable size .. this will allow you to update a few hundred thousand rows at a time if the 144 million rowcount causes a problem (which it might -- that's a LOT of data)- Jeff |
 |
|
|
tommynz
Starting Member
9 Posts |
Posted - 2004-08-16 : 09:47:23
|
| Thanks Jeff but the data is not always the same length. That was only an example and I did not make that clear, sorry. Seventhnight has provided a SELECT statement that looks promising, but I do not know how to go about changing it to an UPDATE statement.Thanks Michael |
 |
|
|
tommynz
Starting Member
9 Posts |
Posted - 2004-08-16 : 09:56:48
|
| Jeff,If I was to use SET ROWCOUNT 100000and process only 100000 records, how do I then continue processing from that point forward?ThanksMichael |
 |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2004-08-16 : 10:08:53
|
quote: ... and process only 100000 records, how do I then continue processing from that point forward?
notice the WHILE statement before the update -- that causes a loop. it will loop until no rows are left to update.To handle the case where the position of the pipe | characters changes, you can do it like this:-- this is a sample table w/ sample data:Create table myTable (myCol varchar(50), Col1 varchar(10), Col2 varchar(10), col3 varchar(10), col4 varchar(10), col5 varchar(10))goInsert Into myTable Values ('AA1|BBB1|CCCC1|DDD1|EEEE1', Null,Null,Null,Null,Null)Insert Into myTable Values ('AAA2|BBB2|CC2|DDD2|EEEE2', Null,Null,Null,Null,Null)Insert Into myTable Values ('AAAAA3|BBB3|CCCC3|DDD3|EEEE3', Null,Null,Null,Null,Null)Insert Into myTable Values ('AAA4|BBB4|CCCC4|DD4|EEEE4', Null,Null,Null,Null,Null)Insert Into myTable Values ('A5|BBB5|CCCC5|DDD5|E5', Null,Null,Null,Null,Null)Insert Into myTable Values ('AAA6|BBB6|CCCC6|DDD6|EEEE6', Null,Null,Null,Null,Null)Insert Into myTable Values ('AAA7|BBB7|CCCCC7|DDD7|EEEE7', Null,Null,Null,Null,Null)Insert Into myTable Values ('AA8|BBBBBBB8|CCCC8|DDD8|EEEE8', Null,Null,Null,Null,Null)-- you will need to declare these variables:declare @i1 intdeclare @i2 intdeclare @i3 intdeclare @i4 int-- and here is your update statement:update myTable set @i1 = charindex('|', myCol), col1 = left(myCol, @i1-1), @i2 = charindex('|',myCol,@i1+1), col2 = substring(myCol, @i1+1, @i2-@i1-1), @i3 = charindex('|',myCol, @i2+1), col3 = substring(myCol, @i2+1, @i3-@i2-1), @i4 = charindex('|',myCol, @i3+1), col4 = substring(myCol, @i3+1, @i4-@i3-1), col5 = substring(myCol, @i4+1, 50)select * from myTableit is easy to do this operation in an update as opposed to a SELECT, because in an update you can set variables along the way which makes it pretty easy. In a SELECT, as corey demonstrates, the easiest way is to use nested subqueries.anyway, run the sample, make sure it makes sense, and try it out.- Jeff |
 |
|
|
Seventhnight
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2004-08-16 : 10:09:24
|
The while exists keeps it going:this is my example expanded:Declare @myTable table (id int identity(1,1) Not NUll, myCol varchar(50), Col1 varchar(10), Col2 varchar(10), Col3 varchar(10), Col4 varchar(10), Col5 varchar(10))Insert Into @myTable Values ('AAA1|BBB1|CCCC1|DDD1|EEEE1',null,null,null,null,null)Insert Into @myTable Values ('AAA2|BBB2|CCCC2|DDD2|EEEE2',null,null,null,null,null)Insert Into @myTable Values ('AAA3|BBB3|CCCC3|DDD3|EEEE3',null,null,null,null,null)Insert Into @myTable Values ('AAA4|BBB4|CCCC4|DDD4|EEEE4',null,null,null,null,null)Insert Into @myTable Values ('AAA5|BBB5|CCCC5|DDD5|EEEE5',null,null,null,null,null)Insert Into @myTable Values ('AAA6|BBB6|CCCC6|DDD6|EEEE6',null,null,null,null,null)Insert Into @myTable Values ('AAA7|BBB7|CCCC7|DDD7|EEEE7',null,null,null,null,null)Insert Into @myTable Values ('AAA8|BBB8|CCCC8|DDD8|EEEE8',null,null,null,null,null)Set RowCount 1While Exists(Select 1 From @myTable Where Col1 is null)Begin Update Z Set Col1 = Y.Col1, Col2 = Y.Col2, Col3 = Y.Col3, Col4 = Y.Col4, Col5 = Y.Col5 From @myTable as Z Inner Join ( Select id, Col1, Col2, COl3, Col4 = left(myCol,charindex('|',myCol)-1), col5 = Right(myCol,len(myCol)-charindex('|',myCol)) From ( Select id, Col1, Col2, Col3 = left(myCol,charindex('|',myCol)-1), myCol = Right(myCol,len(myCol)-charindex('|',myCol)) From ( Select id, Col1, Col2 = left(myCol,charindex('|',myCol)-1), myCol = Right(myCol,len(myCol)-charindex('|',myCol)) From ( Select id, Col1 = left(myCol,charindex('|',myCol)-1), myCol = Right(myCol,len(myCol)-charindex('|',myCol)) From @myTable ) as A ) as B ) as C ) as Y On Z.id = Y.id Where Z.Col1 is nullEndSet RowCount 0Select * From @myTableYou may not have an id field, so use whatever is your primary key in place of id.Corey |
 |
|
|
samsekar
Constraint Violating Yak Guru
437 Posts |
Posted - 2004-08-16 : 10:10:12
|
quote: Originally posted by jsmith8858PLEASE don't use BCP or a cursor for this! this is a classic example of what update statements are designed for. I would expect an UPDATE statement would run about 100 times faster than using a cursor; maybe more.- Jeff
Hi Jeff, AFAIK BCP is faster for doing large operations! Can you please be so kind to explain why you suggest update instead of BCP in this case.- Sekar |
 |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2004-08-16 : 10:10:41
|
| don't do it by setting the rowcount to 1 and looping -- update a good number of rows at a time. if you set the rowcount to 1 and loop, then you might as well use a cursor.- Jeff |
 |
|
|
Seventhnight
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2004-08-16 : 10:12:05
|
Now that is just freaking cool... i did not know you could set variables like that in an update. *sigh*quote: Originally posted by jsmith8858
quote: ... and process only 100000 records, how do I then continue processing from that point forward?
notice the WHILE statement before the update -- that causes a loop. it will loop until no rows are left to update.To handle the case where the position of the pipe | characters changes, you can do it like this:-- this is a sample table w/ sample data:Create table myTable (myCol varchar(50), Col1 varchar(10), Col2 varchar(10), col3 varchar(10), col4 varchar(10), col5 varchar(10))goInsert Into myTable Values ('AA1|BBB1|CCCC1|DDD1|EEEE1', Null,Null,Null,Null,Null)Insert Into myTable Values ('AAA2|BBB2|CC2|DDD2|EEEE2', Null,Null,Null,Null,Null)Insert Into myTable Values ('AAAAA3|BBB3|CCCC3|DDD3|EEEE3', Null,Null,Null,Null,Null)Insert Into myTable Values ('AAA4|BBB4|CCCC4|DD4|EEEE4', Null,Null,Null,Null,Null)Insert Into myTable Values ('A5|BBB5|CCCC5|DDD5|E5', Null,Null,Null,Null,Null)Insert Into myTable Values ('AAA6|BBB6|CCCC6|DDD6|EEEE6', Null,Null,Null,Null,Null)Insert Into myTable Values ('AAA7|BBB7|CCCCC7|DDD7|EEEE7', Null,Null,Null,Null,Null)Insert Into myTable Values ('AA8|BBBBBBB8|CCCC8|DDD8|EEEE8', Null,Null,Null,Null,Null)-- you will need to declare these variables:declare @i1 intdeclare @i2 intdeclare @i3 intdeclare @i4 int-- and here is your update statement:update myTable set @i1 = charindex('|', myCol), col1 = left(myCol, @i1-1), @i2 = charindex('|',myCol,@i1+1), col2 = substring(myCol, @i1+1, @i2-@i1-1), @i3 = charindex('|',myCol, @i2+1), col3 = substring(myCol, @i2+1, @i3-@i2-1), @i4 = charindex('|',myCol, @i3+1), col4 = substring(myCol, @i3+1, @i4-@i3-1), col5 = substring(myCol, @i4+1, 50)select * from myTableit is easy to do this operation in an update as opposed to a SELECT, because in an update you can set variables along the way which makes it pretty easy. In a SELECT, as corey demonstrates, the easiest way is to use nested subqueries.anyway, run the sample, make sure it makes sense, and try it out.- Jeff
Corey |
 |
|
|
Seventhnight
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2004-08-16 : 10:13:52
|
I was only showing that it would loop through his set, but my set was only 8... I thought it was clear enough from earlier conversation that Rowcount should be 100000 or some such...quote: Originally posted by jsmith8858 don't do it by setting the rowcount to 1 and looping -- update a good number of rows at a time. if you set the rowcount to 1 and loop, then you might as well use a cursor.- Jeff
Corey |
 |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2004-08-16 : 10:14:26
|
quote: Originally posted by samsekar
quote: Originally posted by jsmith8858PLEASE don't use BCP or a cursor for this! this is a classic example of what update statements are designed for. I would expect an UPDATE statement would run about 100 times faster than using a cursor; maybe more.- Jeff
Hi Jeff, AFAIK BCP is faster for doing large operations! Can you please be so kind to explain why you suggest update instead of BCP in this case.- Sekar
sure!the goal is he has table X which in which he wants to update 5 columns. he doesn't want to create a new table, he wants to update the data in 5 columns.option 1: update table X directlyoption 2: BCP out the table with the new columns generated, then BCP the table into the database into a temp table, and then update the table X joining to the temp table.Option 2 includes the same step that exists in option 1, plus the added steps of exporting and then importing 144 million rows. Or am I missing something?- Jeff |
 |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2004-08-16 : 10:16:21
|
| it is pretty cool -- one the many things i learned here at SQLTeam. I wish you could do it in a standard SELECT, that would be REALLY cool. you can only do it in an UPDATE statement.- Jeff |
 |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2004-08-16 : 10:17:36
|
quote: Originally posted by Seventhnight I was only showing that it would loop through his set, but my set was only 8... I thought it was clear enough from earlier conversation that Rowcount should be 100000 or some such...quote: Originally posted by jsmith8858 don't do it by setting the rowcount to 1 and looping -- update a good number of rows at a time. if you set the rowcount to 1 and loop, then you might as well use a cursor.- Jeff
Corey
just making sure ! - Jeff |
 |
|
|
tommynz
Starting Member
9 Posts |
Posted - 2004-08-16 : 10:21:54
|
| Wow, replys coming thick and fast. Having trouble trying to keep up.Thanks a lot for all your help guys.I have tried Jeffs update statement on a small set of data and it looks good. I will try it on the real table and let you know how it goes.Again thank you very muchRegardsMichael |
 |
|
|
Next Page
|
|
|
|
|