Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 SSIS and Import/Export (2005)
 Import flat excel structure into hierarchy.

Author  Topic 

DaveChapman
Starting Member

7 Posts

Posted - 2009-04-27 : 10:41:14
I need to import an excel spreadsheet into a SQL database.

The spreadsheet is a "flat" organisation structure e.g.
Branch No, Branch Name , Area No, Area Name, Region No, Region Name, Division No, Division Name.

The resulting ORG table needs to be hierarchical e.g.
No, Name, Level, Parent, Status
(where level is Branch, Area etc)

With a second table ORG_HIERARCHY holding the relationship e.g.
ChildNo, ParentNo, Status.


I have done this succesfully (quickly and easily) using ADO datasets in C# for a console application, but our DBA's are now dragging their heels about putting an exe on their server, so I want to try to move this into an SSIS pacackage.


However, I've hit a few problems:
Because the spreadhseet changes every month, rows may no longer be present. If they have been removed then I need to mark the matching row in the database Inactive. If the row is new, then a new row is inserted, if the row is already present then it is left Active.
In addition, a row's data may change, so that a branch belongs to a different area. In that case, the hierarchy table sets a row Inactive (as we need this for ref integrity) and a new Active row is added.
In C#, this is so fast for an in-memory recordset that I can loop through every spreadsheet row and split it out into Branches, Areas etc...and then loop through every Area and see if it already exists...even though I know the vast majority will already be present from the flat structure.


I can't work out in SSIS how to do this kind of logic: e.g.

Set all rows in ORG Inactive.
If branch exists in spreadsheet and in ORG then set row Active.
If branch exists in spreadsheet and not in ORG then add new Active row, but only take Branch No and Branch Name.
Otherwise leave Inactive.

Repeating this for the heirarchy, noting that all the rows are "flat" in the spreadsheet.

I've tried recordsets in SSIS, but they don't have an output and I can't use Raw files because I can't write to the servers hdd.

I can conditionally split the data...but since the rows are flat, one row only gets processed one time...unless I read in the spreadsheet multiple times, which is slow, or create temporary tables in the db...which I can't do :(

Even if I did this, I still can't see how to write in SSIS: Do x unless y already exists in which case do z.


Any ideas very gratefully accepted.

Qualis
Posting Yak Master

145 Posts

Posted - 2009-04-27 : 14:32:02
You can take your entire code and put it into a Script task. This isn't the BEST solution, but it may be the easiest and is still compiled code.
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2009-04-27 : 15:07:37
I would break this into two tasks.

1. Figure out to import the data into a staging area the best way
2. Write a stored procedure which deals with the hierarchial thingies.



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

DaveChapman
Starting Member

7 Posts

Posted - 2009-04-28 : 03:29:06
Cheers, both good ideas except that I have very little VB expererience for the Script option, and I'm struggling to find a way of staging the data that doesn't require writing to the hdd or creating temp tables.
Go to Top of Page
   

- Advertisement -