alasse130
Starting Member
21 Posts |
Posted - 2011-07-06 : 05:19:55
We are currently implementing export/import functionality in our application. We export data to a CSV file. I'm wondering whether there is a faster alternative to BULK INSERT for importing data from a CSV file.
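For reference, the kind of import the question is about is a plain BULK INSERT from a CSV file; a minimal sketch (the table name and file path are hypothetical, since the thread doesn't give them):

```sql
-- Hypothetical names throughout: the thread does not give the real
-- table name or file path.
BULK INSERT dbo.Import
FROM 'C:\exports\appointments.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',
    FIRSTROW        = 2,   -- skip the CSV header row
    TABLOCK             -- allows minimal logging under the right recovery model
);
```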
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2011-07-06 : 06:59:18
You could use SSIS (SQL Server Integration Services) or, depending on your needs, the Import/Export Wizard to build an SSIS package. I don't know if it would be any faster, but it is simple enough that it may be worth trying. In SSMS Object Explorer, right-click a database, select Tasks -> Export Data or Import Data, and follow the wizard.
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2011-07-06 : 07:52:43
There is bcp, OPENROWSET(BULK ...), BULK INSERT, SSIS packages... many ways to achieve what you want, and they are all more or less equally fast (though poor configuration or misuse will always make things slow). It's more a question of preference and of where the process is initiated from.

- Lumbago
My blog -> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/
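Of the options listed, OPENROWSET with the BULK provider is the closest drop-in alternative to BULK INSERT, since it can feed an ordinary INSERT ... SELECT. A sketch with hypothetical file and format-file paths (a format file is required on the SQL Server versions current when this thread was written):

```sql
-- Hypothetical paths; column names are taken from later posts in the thread.
INSERT INTO dbo.Import (AppointmentID, InScope, ScheduledDate)
SELECT t.AppointmentID, t.InScope, t.ScheduledDate
FROM OPENROWSET(
         BULK 'C:\exports\appointments.csv',
         FORMATFILE = 'C:\exports\appointments.fmt',
         FIRSTROW = 2
     ) AS t;
```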
alasse130
Starting Member
21 Posts |
Posted - 2011-07-06 : 08:55:17
Hmm, we actually need it to be initiated from our web application. I'm wondering if there is an alternative SQL command to BULK INSERT.
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2011-07-06 : 09:17:26
Are you having performance problems, or some other kind of problem? What kind of files are you importing/exporting, and how big are they?

- Lumbago
My blog -> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/
alasse130
Starting Member
21 Posts |
Posted - 2011-07-06 : 09:39:33
Yes, the stored proc we're using is kinda slow. First, we have an import button that does a BULK INSERT from a specified CSV file into an Import table. In the same sp I also validate some fields, specifically the foreign key fields: they arrive as text, and I convert them to their corresponding IDs. Rows with invalid values are flagged by writing to an ErrorMsg column; other rows are marked as to-be-inserted.

After this operation completes, the user is presented with the option to commit the changes. In the commit sp, I take the valid rows from the Import table and update the actual table the data was originally exported from.

So far we have only tested this with a CSV file of fewer than 50 rows, of which fewer than 10 actually contain changes. I'm wondering if there's a different, more efficient way of doing this.
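The code-to-ID validation step described above can be done set-based rather than row by row; a sketch where the Import column names follow the thread, but the Contact lookup table and the PrimaryContactCode column are hypothetical stand-ins:

```sql
-- Resolve the text code to its ID in one set-based pass
-- (dbo.Contact and PrimaryContactCode are hypothetical names).
UPDATE Imp
SET    PrimaryContactId = C.ID
FROM   dbo.Import AS Imp
LEFT JOIN dbo.Contact AS C
       ON C.Code = Imp.PrimaryContactCode
WHERE  Imp.BatchID = @batchID;

-- Flag rows whose code did not resolve to any ID.
UPDATE dbo.Import
SET    ErrorMsg = 'Unknown contact code: ' + PrimaryContactCode
WHERE  BatchID = @batchID
  AND  PrimaryContactId IS NULL
  AND  PrimaryContactCode IS NOT NULL;
```

If the same pattern is repeated per lookup in a cursor or scalar function, replacing it with joins like this is usually the single biggest win.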
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2011-07-07 : 02:01:05
Well... I have to say that if you're having performance problems bulk inserting a file with 50 rows in it, it has nothing to do with the bulk insert itself. With plain standard inserts I could probably do close to a million rows per minute on my laptop. In other words, I think the bottleneck is elsewhere (my guess would be your data manipulation).

What you should probably do is debug your procedures: run things in a controlled manner, one step at a time, to see which part takes the time. If you post the parts that are slow here, we'll be glad to help you performance tune them.

- Lumbago
My blog -> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/
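One simple way to do the step-by-step timing suggested above is to bracket each stage with timestamps (SET STATISTICS TIME ON works too, per statement). The procedure names here are hypothetical:

```sql
-- Time each stage of the import/commit pipeline separately.
DECLARE @t0 DATETIME2 = SYSDATETIME();

EXEC dbo.usp_ImportBatch @batchID = @batchID;   -- hypothetical proc name
PRINT 'import: ' + CONVERT(VARCHAR(20), DATEDIFF(MILLISECOND, @t0, SYSDATETIME())) + ' ms';

SET @t0 = SYSDATETIME();
EXEC dbo.usp_CommitBatch @batchID = @batchID;   -- hypothetical proc name
PRINT 'commit: ' + CONVERT(VARCHAR(20), DATEDIFF(MILLISECOND, @t0, SYSDATETIME())) + ' ms';
```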
alasse130
Starting Member
21 Posts |
Posted - 2011-07-07 : 03:45:33
Thinking about it, my performance problem probably lies more in committing the changes from the imported data. I have an Import table with the following columns:

ImportID - PK
BatchID - differentiates batches of imported data
ToUpdate - 1 or NULL; 1 means the row has a change that must be applied to the actual table
ErrorMsg - where error messages are written
Code - string representation of a foreign key
Id - the corresponding ID of the code

I have a lot of Code/Id field pairs, meaning the data imported from the CSV are codes, which the import stored procedure translates into their corresponding IDs. Here's the update statement that commits the changes from the imported data:

UPDATE A
SET InScope = CASE (Imp.InScope) WHEN 'FALSE' THEN 0 ELSE 1 END
  , ScheduledDate = Imp.ScheduledDate
  , PrimaryContact_ID = Imp.PrimaryContactId
  , Plan_ID = CASE WHEN Imp.InScope = 'TRUE' THEN Imp.PlanId ELSE Plan_ID END
FROM Import AS Imp
INNER JOIN Appointment AS A ON Imp.AppointmentID = A.ID
WHERE Imp.[BatchID] = @batchID
  AND Imp.[ToUpdate] = 1

Note that I just didn't show the other columns. What further optimization could you suggest? I've tried adding separate nonclustered indexes on the BatchID column (DESC) and on AppointmentID, but there was not much effect.
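One index shape that matches this UPDATE's predicates is a filtered nonclustered index on the Import table (the filtered-index idea is a suggestion, not something the thread prescribes; the INCLUDE list uses the columns visible in the posted statement):

```sql
-- Seeks on BatchID, only over rows flagged for update, covering the
-- columns the UPDATE reads from Import.
CREATE NONCLUSTERED INDEX IX_Import_Batch_ToUpdate
ON dbo.Import (BatchID)
INCLUDE (AppointmentID, InScope, ScheduledDate, PrimaryContactId, PlanId)
WHERE ToUpdate = 1;
```

On the other side of the join, Appointment.ID should already be indexed if it is the primary key.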
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2011-07-07 : 04:14:32
I don't mean to be rude or anything, but instead of "Thinking about it, I think my performance problem lies more on committing the changes from the data imported", why don't you do some testing so you know for sure? Random troubleshooting probably isn't going to do you any good. The query you posted looks OK, and unless the Appointment and Import tables are huge with no index on ID/AppointmentID, it should be pretty fast. Do you run this in a loop (cursor) or something? Find out for sure where the bottleneck is... only then will I be able to provide some real help.

- Lumbago
My blog -> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/
alasse130
Starting Member
21 Posts |
Posted - 2011-07-07 : 04:38:41
I mean, yes, the commit does take a long time to execute. My initial question was about an alternative to BULK INSERT because, if there is one, I could modify the overall implementation of the importing and exporting. I'm also not an expert at reading execution plans. I'm confused now, because another suggestion I got was to put an index on ID and AppointmentID.
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2011-07-07 : 04:51:14
quote: I'm confused now because other suggestion I got was to put index on id and appointmentid.

That is the same thing I was trying to say.

- Lumbago
My blog -> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/
alasse130
Starting Member
21 Posts |
Posted - 2011-07-07 : 05:00:50
Right, I got confused. It doesn't add much effect, though. Here's the current statement:

UPDATE A
SET InScope = CASE (Imp.InScope) WHEN 'FALSE' THEN 0 ELSE 1 END
  , ScheduledDate = Imp.ScheduledDate
  , PrimaryContact_ID = Imp.PrimaryContactId
  , Plan_ID = CASE WHEN Imp.InScope = 'TRUE' THEN Imp.PlanId ELSE Plan_ID END
FROM Import AS Imp
INNER JOIN Appointment AS A ON Imp.AppointmentID = A.ID
WHERE Imp.[BatchID] = @batchID
  AND Imp.[ToUpdate] = 1
  AND Obj_ErrorMsg IS NULL

Additional info: the Appointment table is huge.

Would changing the INNER JOIN to an EXISTS test against the Appointment table improve performance? Something like:

WHERE EXISTS (SELECT 1 FROM Appointment AS A WHERE A.ID = Imp.AppointmentID)

Any other columns I need to index?
HaYen
Starting Member
6 Posts |
Posted - 2011-07-08 : 00:31:17
SSIS can watch a folder, so when the CSV file is dropped into that folder it can be imported automatically. As a side note: if you ever do a large import of data into a table that has a number of nonclustered indexes, drop the indexes, perform the import, then rebuild the indexes.
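The drop-then-rebuild pattern above can be done with DISABLE/REBUILD so the index definitions don't have to be scripted and recreated; a sketch with hypothetical index, table, and file names (note that only nonclustered indexes should be disabled, since disabling a clustered index makes the table inaccessible):

```sql
-- Disable a nonclustered index, do the bulk load, then rebuild.
ALTER INDEX IX_Import_Batch ON dbo.Import DISABLE;

BULK INSERT dbo.Import
FROM 'C:\exports\appointments.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);

ALTER INDEX IX_Import_Batch ON dbo.Import REBUILD;
```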