Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 Database Design and Application Architecture
 Best Practice for mart availability during loading

Author  Topic 

mrimmele
Starting Member

1 Post

Posted - 2013-11-01 : 07:08:20
Hi, This is my first post ever so if I do something silly, please forgive me (and let me know so I don't do it next time).

Here's our situation: we have a data mart in SQL Server 2008 that because of scheduling dependencies (that we can't control) is getting loaded in the middle of the business day. The users are already getting 1 day latency on their data and don't really want to incur another day (if we can help it). So we're trying to design a process that would allow us to load the data mart within the current time schedule but minimize the downtime that the users experience. Wondering if other shops have run into this and how the have designed their solution.

Was thinking of doing something like this:

Have 2 schemas set up to support the process. One schema could carry an exact structural copy of the "live" datamart tables. Load process does a truncate and load, so these tables would be empty. The second schema would be empty.

Could then run the load process on the empty tables (users would be able to query "live" tables during this process). Once that load is complete could then Switch schemas (move the "live" tables to the empty schema, then move the newly loaded tables into the "live schema). The users would potentially only experience an interruption during the switching of the schemas.

Is this a completely absurd solution for this issue? Have we made it way too complicated?? Don't particularly like having to maintain the data structure for both the live tables and the tables used for loading.

Comments are much appreciated. Thanks!

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2013-11-01 : 11:28:49
you first need to define/identify the business goals in terms of availability, latency, recovery, etc and then bring the business folks back to reality for what is possible with the resources they are willing to budget. You need work out these specs before deciding on a design.

For us we've gone mostly home-grown and ended up doing this type of thing a couple of different ways:

1. (This is NOT the industry standard but it has been successful and working for years - initially implemented about 15 years ago!)
We needed to completely rebuild a set of (moderately sized) reporting tables. The tables included ranking columns for all possible sort options so it wasn't practical to incrementally update the data. We created a set of 3 databases (DB, DB_0, DB_1). DB just had Views and functions and procedures and one meta table to indicate which 0/1 DB was "live". That is the DB that the frontend applications are aware of. We reload the the non-live DB. Then we would flip live/non-live DBs by, within a transaction, ALTER the views to point to the non-live DB and update the meta table making it "live". Now we would have the option to use synonyms rather than views - but basically same result.

2.
We have more recently created very large datasets that have new data (and some changes) coming in all the time. We keep replicated "rollup" versions of these tables which are maintained on rollup servers. (commit-to server -> rollup servers -> frontend servers) The reporting structure is somewhat different from the commit-to structures. We have procedures to update any changed entities to staging tables and then only update the actual changed data on the rollup server tables. That way we minimize the replication transactions to only what actually changed. These incremental rollup procedures are constantly looking for new commit-to data so the frontend is (almost) real time and never unavailable (theoretically).

Be One with the Optimizer
TG
Go to Top of Page

jeffw8713
Aged Yak Warrior

819 Posts

Posted - 2013-11-05 : 13:32:10
Think about using synonyms to swap access between the two schemas. Once the load process is finished, you would then drop the synonyms and recreate them on the opposite tables.

Go to Top of Page
   

- Advertisement -