Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Query Optimization

Author  Topic 

sqlserver_newbee
Starting Member

11 Posts

Posted - 2007-08-02 : 16:19:03
Hi Folks,

I am working on very large files [in order of 3 GB/file] with millions of records. We receive 20 such files each month that need to be entered in the data base.

I have created a DTS Package that does following -
1. Reading these fixed length files and loading them into the staging tables with primary key.

2. Execute SQL task that does upserting as -

Update Actual_Table
Set ()
From Actual_Table
Inner Join Staging_Table
ON
Actual_Table.pk1=Staging_Table.pk1 AND
Actual_Table.pk2=Staging_Table.pk2
..

INSERT INTO Actual_Table ()
SELECT ()
FROM Staging_Table
WHERE NOT EXISTS(SELECT NULL FROM Actual_Table WHERE
Actual_Table.pk1=Staging_Table.pk1 AND
Actual_Table.pk2=Staging_Table.pk2
..)


Because of the file size and number of records, execution times have been in excess of 3 hours/file. I have 20 such files, so it takes 3-4 days to get them in the tables!!!

Can someone please suggest me a way to optimize the steps I am taking? I do understand that with the given file size, the query processing is bound to have exceeding run times. But a small change in the query approach might make a big overall difference here.

Also, one thing that needs to be factored is that the execution times will get worse each month because the actual table size will also grow with file dumps each month.

Looking forward to suggestions and improvisations to my implementation.

Thanks a lot for all your efforts!

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2007-08-02 : 16:41:57
I'd suggest trying a LEFT OUTER JOIN instead of the NOT EXISTS. Depending on how your data is set up you might be able to do some work to minimize the amount of joining. For example, maybe you can delete from the Staaging table before trying to insert or something..?
Go to Top of Page

sqlserver_newbee
Starting Member

11 Posts

Posted - 2007-08-02 : 16:58:16
quote:
Originally posted by Lamprey

I'd suggest trying a LEFT OUTER JOIN instead of the NOT EXISTS. Depending on how your data is set up you might be able to do some work to minimize the amount of joining. For example, maybe you can delete from the Staaging table before trying to insert or something..?



I was doing the Delete earlier so basically it was like -
update common records
delete common records
insert remaining records

Someone here suggested me to combine 2-3 and that saved a lot of time over my method [and also the one you suggest] of deleting the common records and then inserting the rest. I also understand why it worked better in my case.

I'd try left outter join as a test and see if it saved me any time.

Thanks!
Go to Top of Page
   

- Advertisement -