Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 DataWarehouse design

Author  Topic 

Cralis
Starting Member

11 Posts

Posted - 2009-03-18 : 19:55:11
I'm not that new to SQL, but I am really new to Data warehousing, and am looking for some assistance in my understanding.

If I am to build a datawarehouse that will be used for reporting purposes, where cubing (Another thing I am new to) would be useful, am I right in saying that the DDL for defining the Data Warehouse is exactly the same as when I design my OLTP, except, using a different rules (Analasys based, as opposed to Transaction based)?

I have a current database, which is pretty well normalised. I'd like to start off using one fact table, and then multiple dimension tables. Using the online examples I have found, basically a sales table, with the usual, product, time and store idea.

So, for the same, I will use this syntax:

CREATE TABLE dbo.[factSales]
(
id INT NOT NULL IDENTITY(1,1),
productid INT NOT NULL,
salesamount FLOAT NOT NULL,
dateid INT NOT NULL,
departmentid INT NOT NULL,
ordernumber VARCHAR(10) NOT NULL
)

I'd make 'id' the primary key, with dateid, productid and departmentid as foregin keys pointing to their dimension tables. I'd also add an index to the foreign keys, with departmentid being clustered.

The dimension tables (I'll only go into the Product one...) would be defined like this:

CREATE TABLE dbo.[dimProduct]
(
id INT NOT NULL IDENTITY(1,1), -- DO I need this? I assume so, to keep uniqueness.
importedId INT NOT NULL, -- Do we maybe keep the ID from the source data?
state VARCHAR(30), -- Eg, Queensland
region VARCHAR(50), -- Eg, South East
city VARCHAR(40), -- Eg, Brisbane
cityarea VARCHAR(20), -- Eg, Central
storename VARCHAR(30), -- Eg, Bracken Ridge
)

id would be the primary key (Linked back to the Fact table), with Indexes on all the other Varchar fields? Is that correct? As we could be grouping on these, and filtering on these, would a key be handy? I understand it will slow down the inserts, and increase the table size, but as this is a dimension table, we need fast access for lookups, and it won't be huge compared to the fact tables.

Is that how I'd create my datawarehouse? There is really nothing different from doing that, and designing an OLTP database. Except the mindset in understanding that we're not bothered about Inserts and Updates, and we're looking for lookup performance.

Is there anything else I need to do? Maybe, create the database in a different way? It seems really easy. Well, this is a basic example, but I'm trying to understand the basics here. The DB will be created using Management Studio... and will sit alongside the normal databases. I'm just concerned that it seems so similar to normal databases - except for the de-normalisation.

Any hints and critisisms would be fantastic.

Thanks guys!

slimt_slimt
Aged Yak Warrior

746 Posts

Posted - 2009-03-19 : 01:21:10
Hi,

when building DW you should be aware of the needs for your DW. If it's for reporting purposes you normally don't need any specific prepared DW structure in order for your reporting to run normally (except for the indexation and table normalization - for faster work). But if you are building OLAP Cubes, your DW should (must) be prepared in such way that is specialized for all cubes you are going to build.

In this case you will be looking at star scheme (which is to believe the simplest DW schema) of your tables. it this case you should design your fact table (in connection to dimension tables (usually code lists or end tables for supporting dimensions when builiding cube)) to support OLTP on daily basis as well as encompass all the relevant data you want to have in cube (usually it's max 10 dimensions and cca max 20 measures still for okey performance).

So build your DW based on star scheme, bringing a fact table (with all information you want to ETL from any other server/instance) as well as dimensions which are tables prepared for cube dimensions. and you should also bring information for measures on fact table.

once you have such DW build, you can easily use it also on reporting.
Go to Top of Page
   

- Advertisement -