Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
sqlserver_newbee
Starting Member
11 Posts |
Posted - 2007-08-02 : 16:19:03
|
Hi Folks,I am working on very large files [in order of 3 GB/file] with millions of records. We receive 20 such files each month that need to be entered in the data base. I have created a DTS Package that does following -1. Reading these fixed length files and loading them into the staging tables with primary key. 2. Execute SQL task that does upserting as - Update Actual_Table Set ()From Actual_TableInner Join Staging_TableONActual_Table.pk1=Staging_Table.pk1 ANDActual_Table.pk2=Staging_Table.pk2..INSERT INTO Actual_Table ()SELECT ()FROM Staging_TableWHERE NOT EXISTS(SELECT NULL FROM Actual_Table WHERE Actual_Table.pk1=Staging_Table.pk1 ANDActual_Table.pk2=Staging_Table.pk2..)Because of the file size and number of records, execution times have been in excess of 3 hours/file. I have 20 such files, so it takes 3-4 days to get them in the tables!!!Can someone please suggest me a way to optimize the steps I am taking? I do understand that with the given file size, the query processing is bound to have exceeding run times. But a small change in the query approach might make a big overall difference here. Also, one thing that needs to be factored is that the execution times will get worse each month because the actual table size will also grow with file dumps each month.Looking forward to suggestions and improvisations to my implementation.Thanks a lot for all your efforts! |
|
Lamprey
Master Smack Fu Yak Hacker
4614 Posts |
Posted - 2007-08-02 : 16:41:57
|
I'd suggest trying a LEFT OUTER JOIN instead of the NOT EXISTS. Depending on how your data is set up you might be able to do some work to minimize the amount of joining. For example, maybe you can delete from the Staaging table before trying to insert or something..? |
 |
|
sqlserver_newbee
Starting Member
11 Posts |
Posted - 2007-08-02 : 16:58:16
|
quote: Originally posted by Lamprey I'd suggest trying a LEFT OUTER JOIN instead of the NOT EXISTS. Depending on how your data is set up you might be able to do some work to minimize the amount of joining. For example, maybe you can delete from the Staaging table before trying to insert or something..?
I was doing the Delete earlier so basically it was like -update common recordsdelete common recordsinsert remaining recordsSomeone here suggested me to combine 2-3 and that saved a lot of time over my method [and also the one you suggest] of deleting the common records and then inserting the rest. I also understand why it worked better in my case.I'd try left outter join as a test and see if it saved me any time.Thanks! |
 |
|
|
|
|