 ssis package takes long to run


basicconfiguration
Constraint Violating Yak Guru

358 Posts

Posted - 2010-02-11 : 06:27:25
I want to replicate 50 tables from Oracle to SQL Server 2005 and then do a daily incremental extract. These tables are really large; some of them have 500M records. I'm using SSIS, but it takes very long to pull all the data from just one table: 3-4 hours for the 500M records. My question is, how can I speed it up? It's really a pain to wait for the SSIS package to finish, and sometimes I can't even tell whether the package has frozen because it takes so long.

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2010-02-11 : 06:30:11
Are you using the Oracle Attunity connector? It provides much better performance:

http://www.microsoft.com/downloads/details.aspx?FamilyId=D9CB21FE-32E9-4D34-A381-6F9231D84F1E&displaylang=en

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/


Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2010-02-11 : 11:27:12
There are probably whole books that could be written on this topic alone, but there are several things you can do:
  • Set the Rows per Batch and the Maximum insert commit size so that you are not committing an entire table in one transaction. You'll have to experiment on your system to see which settings work best; try values in the range from 10,000 to 1,000,000 and see what happens.
  • If you can do delta processing instead of pumping the entire table across each time, that could gain you quite a bit.
  • If the destination tables are heavily indexed, you might see some gains by dropping the indexes, loading the data, and then re-creating the indexes.
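A minimal sketch of the drop/load/re-create pattern from the last bullet (the table and index names here are hypothetical; substitute your own):

```sql
-- Hypothetical table/index names; adjust to your schema.
DROP INDEX IX_OrderDetail_ProductID ON dbo.OrderDetail;

-- ... run the SSIS data flow here (OLE DB Destination with fast load,
--     Rows per Batch and Maximum insert commit size set) ...

-- Re-create the index once the bulk load completes.
CREATE INDEX IX_OrderDetail_ProductID
    ON dbo.OrderDetail (ProductID)
    WITH (SORT_IN_TEMPDB = ON);
```

Leave the clustered index in place if the data arrives in clustered-key order; it is usually only the nonclustered indexes that are worth dropping.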



rds207
Posting Yak Master

198 Posts

Posted - 2010-02-11 : 14:52:02
I'm not sure I'm at the stage to answer questions in this forum yet, because I'm still a beginner, but I want to share an idea, as I went through the same thing recently. For the initial load, instead of creating an SSIS package, I did a direct import of the table, which is much faster than running a package. Once the table is imported, create an SSIS package that loads only the new records into the table (you need to apply some logic here to identify the new records). This actually worked pretty fast for me.
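One common way to implement the "get new records" logic mentioned above, assuming the source rows carry an ever-increasing key (all object names below are hypothetical):

```sql
-- Hypothetical: Orders has an ever-increasing OrderID.
-- Find the highest key we have already loaded.
DECLARE @MaxLoadedID int;
SELECT @MaxLoadedID = ISNULL(MAX(OrderID), 0) FROM dbo.Orders;

-- Pull only the rows newer than that.
INSERT INTO dbo.Orders (OrderID, CustomerID, OrderDate)
SELECT s.OrderID, s.CustomerID, s.OrderDate
FROM   SourceServer.SourceDb.dbo.Orders AS s  -- e.g. a linked server or staging copy
WHERE  s.OrderID > @MaxLoadedID;
```

A modified-date column works the same way; just be careful with rows that change after they were first extracted, since a pure "greater than" filter on an insert key will miss updates.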

quote:
Originally posted by basicconfiguration

I want to replicate 50 tables from Oracle to SQL Server 2005 and then do a daily incremental extract. These tables are really large; some of them have 500M records. I'm using SSIS, but it takes very long to pull all the data from just one table: 3-4 hours for the 500M records. My question is, how can I speed it up? It's really a pain to wait for the SSIS package to finish, and sometimes I can't even tell whether the package has frozen because it takes so long.


Kristen
Test

22859 Posts

Posted - 2010-02-11 : 15:10:13
We use staging tables at both ends: on a SQL box close to the source (Oracle in your case) and on the target server.

Where the Oracle tables have a change-date/time column, we use it to query for records "newer than last time we asked"; we store them in a #TEMP table and update the staging table based on a full WHERE clause that compares columns to see which rows are different, but only in the columns we are interested in.

(Where a change-date/time column does not exist, we query the whole Oracle table [but because the SQL box is nearby, the performance is OK] and then do the selective update from #TEMP as above.)

Rows that are different update the staging table. New rows are inserted (again, we only store the columns we are interested in).

Each row in the staging table has a ChangeDate/Time column.

At the target end we get the MAX(ChangeDate) for the staging table (at that end) and SELECT all rows from the source server's staging table with a newer date INTO a #TEMP table. We pre-delete any rows that already match, then INSERT all the rows from #TEMP.

At that point, on the target server, we have a staging table in which every newer row has a ChangeDate greater than the previous MAX(ChangeDate).

We then "upsert" those rows into the actual tables (which usually requires some manipulation of the data, as the Oracle data is structured for a different purpose).
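A rough sketch of the target-end steps described above. SQL Server 2005 has no MERGE, so the upsert is an UPDATE followed by an INSERT; all table and column names here are hypothetical:

```sql
-- 1. Pull rows newer than what the target already has into #TEMP.
DECLARE @LastChange datetime;
SELECT @LastChange = MAX(ChangeDate) FROM dbo.Staging_Customer;

SELECT *
INTO   #TEMP
FROM   SourceServer.StagingDb.dbo.Staging_Customer
WHERE  ChangeDate > @LastChange;

-- 2. Pre-delete matching rows, then insert everything from #TEMP.
DELETE s
FROM   dbo.Staging_Customer AS s
JOIN   #TEMP AS t ON t.CustomerID = s.CustomerID;

INSERT INTO dbo.Staging_Customer
SELECT * FROM #TEMP;

-- 3. Upsert into the real table: UPDATE existing rows, INSERT new ones.
UPDATE c
SET    c.Name = t.Name,
       c.ChangeDate = t.ChangeDate
FROM   dbo.Customer AS c
JOIN   #TEMP AS t ON t.CustomerID = c.CustomerID;

INSERT INTO dbo.Customer (CustomerID, Name, ChangeDate)
SELECT t.CustomerID, t.Name, t.ChangeDate
FROM   #TEMP AS t
WHERE  NOT EXISTS (SELECT 1 FROM dbo.Customer AS c
                   WHERE c.CustomerID = t.CustomerID);
```

Step 3 is where any reshaping of the Oracle data for its new purpose would go.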