amsimms
Starting Member
2 Posts

Posted - 2009-04-26 : 11:08:16
Other methods of bulk insertion allow the specification of an ORDER so that rows can be inserted efficiently into a table with a clustered index. Is there a way to do this with SqlBulkCopy? Perhaps it is implied if the source is a properly ordered DataTable, or an IDataReader that returns rows in the correct order, but I'm not seeing this in BOL.

Creating the key afterwards is how I'm doing this now. However, these are extremely large tables (1.1 billion rows), so I'd rather pay a little more during the insert than wait for a big index build.

Thanks!

--Andrew
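[Editor's note: the ORDER hint the poster refers to is available in T-SQL's BULK INSERT. A minimal sketch, with hypothetical table, column, and file names:]

```sql
-- Hypothetical names throughout. The ORDER hint declares that the data
-- file is already sorted on the clustered index key, so the engine can
-- skip its own sort step; TABLOCK enables minimal logging on an empty
-- target under the right recovery model.
BULK INSERT dbo.Coord
FROM 'D:\data\coord.dat'
WITH (
    DATAFILETYPE = 'native',
    ORDER (CoordID ASC),   -- must match the clustered index key exactly
    TABLOCK
);
```

This server-side statement is what SqlBulkCopy has no equivalent for, as the reply below notes.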
jezemine
Master Smack Fu Yak Hacker
2886 Posts

Posted - 2009-04-28 : 23:28:56
For tables this size, I find that it's usually better to build the index afterward, unless the data you are sending to bcp is already ordered. I don't think SqlBulkCopy supports bulk load hints the way bcp or BULK INSERT do, though. So if you can't use bcp/BULK INSERT and have to use SqlBulkCopy, you are better off importing rows into a heap and then building the index after the load is complete.

Also, you don't want to use a DataTable for this: if you do, you'll be bringing all 1.1 billion rows into memory on the client, which is painful and will likely cause an out-of-memory error on the client. IDataReader is the way to go. Better yet, just use bcp and the ORDER hint. :)

elsasoft.org
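[Editor's note: the IDataReader approach suggested above can be sketched as follows. Connection strings, table, and column names are placeholders; this is a sketch, not the poster's actual code.]

```csharp
// Sketch only: streaming an IDataReader keeps client memory flat
// regardless of row count, unlike a DataTable, which materializes
// every row on the client before the copy begins.
using System.Data.SqlClient;

using (var src = new SqlConnection("...source connection string..."))
using (var dst = new SqlConnection("...destination connection string..."))
{
    src.Open();
    dst.Open();

    // ORDER BY the clustered key of the target so rows arrive presorted,
    // though SqlBulkCopy itself cannot pass an ORDER hint to the engine.
    var cmd = new SqlCommand(
        "SELECT CoordID, X, Y FROM dbo.CoordSource ORDER BY CoordID", src);
    cmd.CommandTimeout = 0;

    using (var reader = cmd.ExecuteReader())
    using (var bulk = new SqlBulkCopy(dst))
    {
        bulk.DestinationTableName = "dbo.Coord";
        bulk.BulkCopyTimeout = 0;     // no timeout for very large loads
        bulk.BatchSize = 2500000;     // commit in chunks, as in this thread
        bulk.WriteToServer(reader);   // streams rows; nothing buffered client-side
    }
}
```

WriteToServer(IDataReader) pulls rows from the reader one batch at a time, so the client footprint stays roughly one batch in size.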
amsimms
Starting Member
2 Posts

Posted - 2009-04-29 : 12:07:08
name        rows       reserved     data         index_size  unused
Coord_4683  537500000  24685040 KB  24431824 KB  190384 KB   62832 KB

Using SqlBulkCopy with a DataTable (chunked into 2,500,000 rows), I'm here after about 12 hours. My previous benchmark was to get around 1,000,000,000 per minute into a heap-mode table with slightly smaller data.

It appears that using an ordered DataTable and SqlBulkCopy does have a slight penalty (~20-25%) but otherwise seems to work very well. I do the data in chunks, about 250,000,000 rows at a time, to allow for easy restarts as well as to keep the client program relatively small. It can take 6-8 hours to build an index from scratch on one of these tables, so 25% is quite acceptable. However, I still wonder if the ORDER / UNIQUE hints, if available, could squeeze out any more.

An interesting comparison will be to try bcp (as jezemine suggests) using native format.
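[Editor's note: the bcp comparison the poster plans could look like this. Server, database, table, column, and file names are hypothetical; running it requires a reachable SQL Server instance.]

```shell
# Hypothetical names throughout. Native format (-n) avoids character
# conversion on the wire; the -h ORDER hint declares the file presorted
# on the clustered key so the engine can skip its sort, and TABLOCK
# enables minimal logging where the recovery model allows it.
bcp MyDb.dbo.Coord in coord_chunk.bcp -n -S myserver -T \
    -h "ORDER(CoordID ASC), TABLOCK" -b 2500000
```

The -b batch size mirrors the 2,500,000-row chunks used with SqlBulkCopy above, which keeps the comparison like-for-like.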
robvolk
Most Valuable Yak
15732 Posts

Posted - 2009-04-29 : 12:23:05
quote: my previous benchmark was to get around 1,000,000,000 per minute

1 billion rows per minute? That's nice, 16 million rows a second. Might want to let Microsoft know, I'm sure that's a record.