Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Optimizing load jobs

Author  Topic 

photond
Starting Member

20 Posts

Posted - 2013-10-18 : 17:40:39
Hi, so I'm new to SQL Server and I recently started at a new job. We currently have about 50 jobs that run throughout the night that pull from a linked server and insert into our data warehouse.

Most of the jobs take a few minutes, but there are some that take several hours. The jobs are set up through SQL Agent as T-SQL jobs. We truncate the tables in our data warehouse daily and then load the records. Is there a more efficient way to do these loads in SSIS or a way to only load the changes that occur from the previous day?

Here's an example:
insert into dbo.claimdetail fields
select fields
from LinkedServer.prod1.dbo.clmdet
where exists (select * from LinkedServer.prod1.dbo.ClaimsToPull
where blah blah)

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2013-10-20 : 03:50:58
problem might be related to logic you've implemented. Also linked server connection can also increase query response times.
We might need to check queries for that.
What you could so is to analyze execution plans and understand bottlenecks.
a way to only load the changes that occur from the previous day
This is possible if you've an audit column (like datecreated,datemodified) in table. You just need to retrieve latest audit column value from your destination table each day and use a query like

SELECT * FROM SourceTable WHERE AuditColumn > (Max Value retrieved)


to retrieve only new records after last job execution.

Or if you've a Unique sequential primary key (IDENTITY column) Then also you can capture MAX(ID) from destination and use logic ID > MaxValue


------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2013-10-21 : 16:38:15
Truncating and loading is probably the most inefficient way to do things depending on the amount of change that is happening in the source. Also, if you are using linked servers, hopefully, you are not doing any sort of join between the two servers, as that can lead to some major performance issues.

As you mentioned SSIS can be used and it may offer better performance. Can Visakh has shown, if you have some value or date in your destination table that you can use to filter the source to get the changes rows, then that can be a decent option to only move new/changes rows instead of all the rows.

My only real advice here is to test out several scenarios and see what shakes out based on your needs and environment.
Go to Top of Page
   

- Advertisement -