Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Batch Insert

Author  Topic 

sccrsurfer
Starting Member

43 Posts

Posted - 2013-04-16 : 17:55:40
I have a query that is supposed to do batch inserts of TOP 5000 until @RowCount = 0. However, this loop just keeps going and going and going and...

Im trying to take data from a staging table and move only certain fields to a destination table, and each record needs to be inserted just one time. Fields may have duplicat values, an entire row must be unique (imagine if an entire row was concatenated).

Basically I want to go down the list of a source table, and batch insert certain fields into a destination table until @rowcount = 0.

The query is this:

DECLARE @RowCount int;

SET @RowCount = 1;

WHILE @RowCount > 0

BEGIN

INSERT INTO [DB].[dbo].[DB_DestName] (Field1, Field2, Field3)
SELECT TOP (5000) Field1, Field2, Field3
FROM [DB].[dbo].[DB_SourceName]

SET @RowCount = @@RowCount;

END;

The query only stops executing because it hits the database size limit! The source table has 7 million records and the destination table has 15.2 million records. This can't be the case unless the query is wrong and doing an INSERT for repeated data.

Any insight would help greatly. Thanks!

James K
Master Smack Fu Yak Hacker

3873 Posts

Posted - 2013-04-16 : 18:11:34
The query simply selects the top 5000 rows from the source table every time it goes through the loop. So you have to do something to make sure that the rows that have been visited do not get picked up again.

An efficient way to do this might be to use SSIS or Import/Export wizard, rather than trying to construct your own loop. You can control the batch size when you do that, and it is very efficient at moving data from one table to another. To use Import/Export wizard, right-click on the database name in SSMS object explorer and select Tasks -> Import Data and follow through the wizard's instructions.
Go to Top of Page

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2013-04-16 : 18:16:30
What you have there will just keep inserting the TOP 5000 random rows from DB_SourceName. You need some way to limit the rows. There are several ways you could do that. One might be to use a ID, assuming the table has one or to use the ROW_NUMBER() funciton to generate a number. Antoher way, is to use an OUTER JOIN or NOT EXITS on the destination so you won't insert duplicate rows.. etc..

Here are some links on how to do that kinda of thing:
http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/7d7c91a6-0484-42a5-b12f-fcbc6aed0c71/
http://stackoverflow.com/questions/4729697/break-up-a-sql-server-2008-query-into-batches
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-04-17 : 00:37:04
What you could do is to add an identity column to staging table and then loop based on that. i still didnt get reason for doing 5000 inserts per batch. Is it because of large data volume? At the end of each population you may truncate the table and prepare it for next data load so that identity will again get reset.

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page

sccrsurfer
Starting Member

43 Posts

Posted - 2013-04-17 : 11:52:49
I tried the SSIS route, and it was so easy to set up, but unfortunately it errored out after the first 1,071 rows and I can't figure out why. I even have a data conversion process between the source and destination (nvarchar to int), and the process stops there.

So no I'm running the process with the query above in batches of 5000, but using a
left join, ON source.XYZ = dest.XYZ where dest.XYZ IS NULL;

This query is running super slow. I'm on the 45 minute mark and it's not even half way through! Any suggestions? I have to do it in batches by the way because the DBA's got a little PO'd :D
Go to Top of Page

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2013-04-17 : 13:03:40
If the only conversion you are doing is a VARCHAR to INT, then I'd suggest you check your souce table for any VARCHAR values that don't/won't convert to INT.
Go to Top of Page

MIK_2008
Master Smack Fu Yak Hacker

1054 Posts

Posted - 2013-04-17 : 13:11:03
quote:
Originally posted by sccrsurfer

INSERT INTO [DB].[dbo].[DB_DestName] (Field1, Field2, Field3)
SELECT TOP (5000) Field1, Field2, Field3
FROM [DB].[dbo].[DB_SourceName]




Any of Field1, Field2 and Field3 is unique? or a combination of any two? or all three together makes it unique for a single record? If none, then is there any unique key/field in the DB_SourceName table? if source table does .. then use its unique key for looping purposes


Cheers
MIK
Go to Top of Page
   

- Advertisement -