amsimms
Starting Member
2 Posts

Posted - 2009-04-26 : 11:08:16
Other methods of bulk insertion allow the specification of an ORDER so that rows can be inserted efficiently into a table with a clustered index. Is there a way to do this with SqlBulkCopy? Perhaps it is implied if the source is a properly ordered DataTable, or an IDataReader that returns rows in the correct order, but I'm not seeing this in BOL.

Creating the key afterwards is how I'm doing this now. However, these are extremely large tables (1.1 billion rows), so I'd rather pay a little more during the insert than wait for a big index build.

Thanks!

--Andrew
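[Editor's note: the ORDER hint the poster refers to is available in T-SQL's BULK INSERT. A minimal sketch, with hypothetical table, column, and file names:]

```sql
-- Hypothetical names throughout. The ORDER hint declares that the data
-- file is already sorted on the clustered index key, so the engine can
-- skip its own sort step; TABLOCK enables minimal logging on an empty
-- target under the right recovery model.
BULK INSERT dbo.Coord
FROM 'D:\data\coord.dat'
WITH (
    DATAFILETYPE = 'native',
    ORDER (CoordID ASC),   -- must match the clustered index key exactly
    TABLOCK
);
```

This server-side statement is what SqlBulkCopy has no equivalent for, as the reply below notes.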
jezemine
Master Smack Fu Yak Hacker
2886 Posts

Posted - 2009-04-28 : 23:28:56
For tables this size, I find that it's usually better to build the index afterward, unless the data you are sending to bcp is already ordered. I don't think SqlBulkCopy supports bulk load hints the way bcp or BULK INSERT do, though. So if you can't use bcp/BULK INSERT and have to use SqlBulkCopy, you are better off importing rows into a heap and then building the index after the load is complete.

Also, you don't want to use a DataTable for this: if you do, you'll be bringing all 1.1 billion rows into memory on the client, which is painful and will likely cause an out-of-memory error on the client. IDataReader is the way to go. Better yet, just use bcp and the ORDER hint. :)

elsasoft.org
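[Editor's note: the IDataReader approach suggested above can be sketched as follows. Connection strings, table, and column names are placeholders; this is a sketch, not the poster's actual code.]

```csharp
// Sketch only: streaming an IDataReader keeps client memory flat
// regardless of row count, unlike a DataTable, which materializes
// every row on the client before the copy begins.
using System.Data.SqlClient;

using (var src = new SqlConnection("...source connection string..."))
using (var dst = new SqlConnection("...destination connection string..."))
{
    src.Open();
    dst.Open();

    // ORDER BY the clustered key of the target so rows arrive presorted,
    // though SqlBulkCopy itself cannot pass an ORDER hint to the engine.
    var cmd = new SqlCommand(
        "SELECT CoordID, X, Y FROM dbo.CoordSource ORDER BY CoordID", src);
    cmd.CommandTimeout = 0;

    using (var reader = cmd.ExecuteReader())
    using (var bulk = new SqlBulkCopy(dst))
    {
        bulk.DestinationTableName = "dbo.Coord";
        bulk.BulkCopyTimeout = 0;     // no timeout for very large loads
        bulk.BatchSize = 2500000;     // commit in chunks, as in this thread
        bulk.WriteToServer(reader);   // streams rows; nothing buffered client-side
    }
}
```

WriteToServer(IDataReader) pulls rows from the reader one batch at a time, so the client footprint stays roughly one batch in size.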
amsimms
Starting Member
2 Posts

Posted - 2009-04-29 : 12:07:08
name        rows       reserved     data         index_size  unused
Coord_4683  537500000  24685040 KB  24431824 KB  190384 KB   62832 KB

Using SqlBulkCopy with a DataTable (chunked into 2,500,000 rows), I'm here after about 12 hours. My previous benchmark was to get around 1,000,000,000 per minute into a heap-mode table with slightly smaller data.

It appears that using an ordered DataTable and SqlBulkCopy does have a slight penalty (~20-25%) but otherwise seems to work very well. I do the data in chunks, about 250,000,000 rows at a time, to allow for easy restarts as well as to keep the client program relatively small. It can take 6-8 hours to build an index from scratch on one of these tables, so 25% is quite acceptable. However, I still wonder if the ORDER / UNIQUE hints, if available, could squeeze out any more.

An interesting comparison will be to try bcp (as jezemine suggests) using native format.
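[Editor's note: the bcp comparison the poster plans could look like this. Server, database, table, column, and file names are hypothetical; running it requires a reachable SQL Server instance.]

```shell
# Hypothetical names throughout. Native format (-n) avoids character
# conversion on the wire; the -h ORDER hint declares the file presorted
# on the clustered key so the engine can skip its sort, and TABLOCK
# enables minimal logging where the recovery model allows it.
bcp MyDb.dbo.Coord in coord_chunk.bcp -n -S myserver -T \
    -h "ORDER(CoordID ASC), TABLOCK" -b 2500000
```

The -b batch size mirrors the 2,500,000-row chunks used with SqlBulkCopy above, which keeps the comparison like-for-like.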
robvolk
Most Valuable Yak
15732 Posts

Posted - 2009-04-29 : 12:23:05
quote: my previous benchmark was to get around 1,000,000,000 per minute

1 billion rows per minute? That's nice, 16 million rows a second. Might want to let Microsoft know, I'm sure that's a record.