
 Alternative to Bulk Insert


alasse130
Starting Member

21 Posts

Posted - 2011-07-06 : 05:19:55
We are currently implementing export/import functionality in our application, exporting data to a CSV file. I'm just wondering if there's an alternative, faster way than BULK INSERT to import data from a CSV file.

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-07-06 : 06:59:18
You could use SSIS (SQL Server Integration Services) or, depending on your needs, the Import/Export Wizard to create an SSIS package. I don't know if it would be any faster, but it is simple enough to do, so it may be worth trying. In the SSMS Object Explorer, right-click a database, select Tasks -> Export Data or Import Data, and follow the wizard's directions.

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-07-06 : 07:52:43
There's bcp, OPENROWSET(BULK ...), BULK INSERT, SSIS packages...many ways to achieve what you want, but they're more or less equally fast (though poor configuration or use will always make things slow). It's more a question of preference and where the process is initiated from...
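For reference, the file-import variants listed above look roughly like this (the file path, table name, and format options are all illustrative, not from the thread):

```sql
-- BULK INSERT: runs inside SQL Server; the path is read from the server's file system
BULK INSERT dbo.Import
FROM 'C:\exports\data.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);

-- OPENROWSET(BULK ...): the same engine, but usable as a rowset inside a SELECT,
-- which lets you filter or transform while loading (requires a format file)
INSERT INTO dbo.Import (Code, ScheduledDate)
SELECT src.Code, src.ScheduledDate
FROM OPENROWSET(BULK 'C:\exports\data.csv',
                FORMATFILE = 'C:\exports\data.fmt') AS src;

-- bcp: the same bulk engine driven from the command line, e.g.
--   bcp dbo.Import in C:\exports\data.csv -c -t, -S myserver -T
```

All three go through the same bulk-load path internally, which is why the performance difference between them is usually small.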

- Lumbago
My blog-> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/

alasse130
Starting Member

21 Posts

Posted - 2011-07-06 : 08:55:17
Hmm, we actually need it to be initiated from our web application. I'm wondering if there's an alternative SQL command to bulk insert.

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-07-06 : 09:17:26
Are you having performance problems or any other kinds of problems? What kind of files are you importing/exporting and how big are they?


alasse130
Starting Member

21 Posts

Posted - 2011-07-06 : 09:39:33
Yes, the stored proc we're using is kind of slow. First, we have an import button which does a BULK INSERT from a specified CSV file into an Import table. Inside the same sp I'm also validating some fields, specifically the foreign key fields: they come in as text, and I'm converting them to their corresponding IDs. I mark rows with invalid values as invalid by noting it in an ErrorMsg column. There are also rows which get marked as to-be-inserted.

After this operation has completed, the user is presented with the option to commit the changes. In the commit sp, I take the valid rows from the Import table and update the actual table the data was originally exported from. So far we have only tested this with a CSV file containing fewer than 50 rows, of which fewer than 10 actually have changes.

I'm wondering if there's a different, more efficient way of doing this.

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-07-07 : 02:01:05
Well...I have to say that if you're having performance problems bulk inserting a file with 50 rows in it, it has nothing to do with the bulk insert itself. If it were just a matter of standard inserts, I could probably do close to a million rows per minute on my laptop. In other words: I think the bottleneck is elsewhere (my guess is in your data manipulation).

What you should probably do is some debugging of your procedures, running things in a controlled manner one step at a time to see which part takes the time. If you post the slow parts here, we'll be glad to help you performance tune them.




alasse130
Starting Member

21 Posts

Posted - 2011-07-07 : 03:45:33
Thinking about it, I believe my performance problem lies more in committing the changes from the imported data.

I have an Import table with the following columns:
ImportID - PK
BatchID - differentiates batches of imported data
ToUpdate - 1 or NULL; 1 means there's a change in the data that must be applied to the actual table
ErrorMsg - where error messages are written
Code - string representation of a foreign key
Id - corresponding ID of the code

I have a lot of Code and Id field pairs, meaning the data imported from the CSV contains codes, and inside the import stored procedure they are translated to their corresponding IDs.
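A code-to-ID translation step like the one described might be sketched as follows (Contact is a hypothetical lookup table, and the code/ID column names are illustrative, not my actual schema):

```sql
-- Resolve the imported text code to its ID; a failed lookup marks the row invalid
UPDATE Imp
SET PrimaryContactId = C.ID
  , ErrorMsg = CASE WHEN C.ID IS NULL
                    THEN 'Unknown contact code: ' + Imp.PrimaryContactCode
                    ELSE Imp.ErrorMsg END
FROM Import AS Imp
LEFT JOIN Contact AS C ON C.Code = Imp.PrimaryContactCode
WHERE Imp.BatchID = @batchID;
```

Doing the translation set-based like this, one UPDATE per lookup table, avoids any row-by-row processing.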
Here's my update statement which commits the changes from the data imported:

UPDATE A
SET InScope = CASE Imp.InScope WHEN 'FALSE' THEN 0 ELSE 1 END
  , ScheduledDate = Imp.ScheduledDate
  , PrimaryContact_ID = Imp.PrimaryContactId
  , Plan_ID = CASE WHEN Imp.InScope = 'TRUE' THEN Imp.PlanId ELSE Plan_ID END
FROM Import AS Imp
INNER JOIN Appointment AS A ON Imp.AppointmentID = A.ID
WHERE Imp.[BatchID] = @batchID
  AND Imp.[ToUpdate] = 1

Note that I just didn't show the other columns. What further optimization could you suggest? I've tried adding separate nonclustered indexes on the BatchID column (DESC) and on AppointmentID, but there's not much effect after applying them.
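One thing that might help more than two separate single-column indexes is a single composite index matching the commit query's WHERE clause and join key (a sketch; the index name is made up and the INCLUDE list would need to match the columns actually read by the UPDATE):

```sql
-- Composite, filtered index: seeks straight to the batch's rows to update
CREATE NONCLUSTERED INDEX IX_Import_Commit
ON dbo.Import (BatchID, AppointmentID)
INCLUDE (InScope, ScheduledDate, PrimaryContactId, PlanId)
WHERE ToUpdate = 1;
```

Since the query filters on the literal ToUpdate = 1, the filtered index can qualify even with a parameterized @batchID; the Appointment side of the join should already be covered if ID is its clustered primary key.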

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-07-07 : 04:14:32
I don't mean to be rude or anything, but instead of "thinking" your performance problem lies in committing the changes, why don't you do some testing so you'll know for sure? Doing random troubleshooting probably isn't gonna do you any good.

The query you posted looks OK, and unless the Appointment and Import tables are huge with no index on ID/AppointmentID, it should be pretty fast. Do you run this in a loop (cursor) or something? Find out for sure where the bottleneck is...only then will I be able to provide some real help.
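A simple way to time each step separately (the procedure names here are illustrative, standing in for whatever the import and commit sps are actually called):

```sql
-- Report elapsed/CPU time and logical reads for each statement executed
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

DECLARE @batchID INT = 42;          -- an existing batch to test with
EXEC dbo.usp_ImportBatch @batchID;  -- step 1: BULK INSERT + validation
EXEC dbo.usp_CommitBatch @batchID;  -- step 2: the commit UPDATE

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

The Messages tab in SSMS will then show per-statement timings and reads, which pins down whether the import or the commit is the slow part.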


alasse130
Starting Member

21 Posts

Posted - 2011-07-07 : 04:38:41
I mean, yes, the commit does take a long time to execute. My initial question was about an alternative to BULK INSERT because if there is one, I could modify the overall implementation of the importing and exporting. I'm also not an expert at reading execution plans. I'm confused now because another suggestion I got was to put an index on ID and AppointmentID.

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-07-07 : 04:51:14
quote:
I'm confused now because other suggestion I got was to put index on id and appointmentid.
That is the same as I was trying to say.


alasse130
Starting Member

21 Posts

Posted - 2011-07-07 : 05:00:50
Right, I got confused. It doesn't help much, though.

UPDATE A
SET InScope = CASE Imp.InScope WHEN 'FALSE' THEN 0 ELSE 1 END
  , ScheduledDate = Imp.ScheduledDate
  , PrimaryContact_ID = Imp.PrimaryContactId
  , Plan_ID = CASE WHEN Imp.InScope = 'TRUE' THEN Imp.PlanId ELSE Plan_ID END
FROM Import AS Imp
INNER JOIN Appointment AS A ON Imp.AppointmentID = A.ID
WHERE Imp.[BatchID] = @batchID
  AND Imp.[ToUpdate] = 1
  AND Obj_ErrorMsg IS NULL

Additional info: Appointment table is huge.

Will changing the inner join to an EXISTS check against the Appointment table improve performance? Something like:
WHERE EXISTS (SELECT 1 FROM Appointment AS A WHERE A.ID = Imp.AppointmentID)

Any other columns I need to index?

HaYen
Starting Member

6 Posts

Posted - 2011-07-08 : 00:31:17
SSIS can watch a folder, so when the CSV file is dropped off in the folder it can be imported automatically. As a side note: if you ever do a large import of data into a table that has a number of nonclustered indexes, drop the indexes, perform the import, then rebuild the indexes.
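The drop-and-rebuild pattern can be done without scripting out the index definitions by disabling instead (a sketch; the index, table, and file names are illustrative):

```sql
-- Disable nonclustered indexes before a big load (never disable the clustered
-- index or PK this way, since that makes the whole table inaccessible)
ALTER INDEX IX_Appointment_Scheduled ON dbo.Appointment DISABLE;

BULK INSERT dbo.Appointment
FROM 'C:\exports\big.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

-- REBUILD re-enables a disabled index and rebuilds it in one step
ALTER INDEX IX_Appointment_Scheduled ON dbo.Appointment REBUILD;
```

This way the index definition stays in the catalog, so nothing has to be re-created by hand after the load.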
   
