Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Other SQL Server Topics (2005)
 Coming to grips with Data Warehouses

Author  Topic 

coolerbob
Aged Yak Warrior

841 Posts

Posted - 2007-03-18 : 16:29:06
Hi again,

How's everyone doing with 2005? I still feel like my heals are dragging a bit. The "knowledge" out there still seems a bit sparse. Am I right?

I'm about 1/5 the way through a BI book and decided the course will probably be necessary as well:
http://www.kimballgroup.com/html/kucourseMDWD.html
There seems to be a lot to come to grips with.
So that's all paid for.

It's been years since I've been on a course of any kind. Too busy working! What's the etiquette? Trousers and T-shirt? Tie? Sandles & three quarter lengths?

This particular course seems to cover the "general overall approach" - which I am pretty sure is a necesary part of what you need to be able to do. The book that is related to the course (which I have already bought) seems really helpful and valuable.

But I don't think it will cover the nitty gritty of SSIS - where most of the grunt work happens. They have another course for that. But I can't wait that long. You can spend your whole life just reading books!

Are all the new little things in 2005 SSIS just a little to specialised? A little too clever? Does it limit us? Or empower us? Can we use it to get the job done? Or will it be a case again of "clever workarounds" when we travel far down a road that we find out (when it's too late and we are committed) can't provide us with what we want. Is it really saving us time? Are we better off writing the scripts ourselves?


------------------------
Me: What do you want to know from your data warehouse?
Client: Err...Emm...Everything
Me: OK, that's great. That's all I need to know. I'll see you when it's done.

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2007-03-18 : 16:59:54
Are we better off writing the scripts ourselves?

Yes!

That is my opinion of course.

rockmoose
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2007-03-19 : 13:11:18
Kimbal will show you how to create data marts and star schemas. If you want to create data warehouses, read books by Inmon.

e4 d5 xd5 Nf6
Go to Top of Page

coolerbob
Aged Yak Warrior

841 Posts

Posted - 2007-03-20 : 08:16:57
So do you agree with this comment then Blindman?
http://www.1keydata.com/datawarehousing/inmon-kimball.html

Man I had no idea about all of this!
Just read the Wikipedia entry for Bill:
http://en.wikipedia.org/wiki/Bill_Inmon

Looks like I will be doing a lot of reading for a long time to come...
Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2007-03-20 : 08:42:50
I've read a couple of Kimball books - very recently the microsoft toolkit.
It seems to be about the problems they found when learning to use the tools and how to convince clients that you are doing a good job as a consultant and how their process helps you to include the business people in the implementation.

The last two are good (and that's what their methodology is really all about) - as to the technical bits - I wouldn't rely on it, expecially when talking about performance.
It's mainly propoganda which is what you would expect - and maybe isn't a bad thing. Managers tend to treat it like xml. Mention the keyword and you must be doing the right thing.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

coolerbob
Aged Yak Warrior

841 Posts

Posted - 2007-03-20 : 09:55:54
Well, up until when you said that, I was quite confident in my selection of SQL Server 2005 as a technical provider because of what I read here:
http://www.1keydata.com/datawarehousing/olap-market-share.html
http://www.olapreport.com/market.htm

and quite confident about Kimball as a methodology provider because of reading things like these:

"The two approaches, Inmon and Kimball are complimentary. Kimball is not "BS" and "stupid" as stated by one reviewer. In fact, it is the PRACTICAL techniques and elegant SIMPLICITY that makes Kimball's work so valuable. Inmon is strong on theory, while Kimball is strong on the day-to-day tools required to build a useful and understandable decision support environment. This reviewer has observed many huge "pure Inmon" projects fail (and none observed succeeded) while all pure Kimball projects observed have thrived and exceeded expectations. Strong words, to be sure, but accurate to the reality observed by this reviewer. While the failures might be attributed to the particular technical teams and management groups, many of those same groups did succeed with a dimensional approach.
If you want to debate academics, read both authors and have at it. I have to actually deliver... on time or early and on budget or below. Thus, my rating reflects that reality. Give me Kimball. The techniques have worked for me (and teams of all sizes) since 1986 at many Fortune 100 and 500 clients.

Inmon provides a great perspective on the issues and on how to think about information management for decision support and this book provides some of his latest thinking on the topic. Kimball provides an excellent and proven way to get there in the real world."

here
http://www.amazon.com/Building-the-Data-Warehouse/dp/B000FQJ88E

If you have real concerns like the type you outlined, where would you say I can go for reliable advice on "technical bits" and issues relating to performance?
Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2007-03-20 : 10:58:07
There's no substitute for experience.
You can follow Kimball and you will probably end up with a workable system.
It probably won't be as efficient as it could be but it will probably be easily understood which counts for a lot.
This system is built around usability rather than performance so compromises are made in data representation so that the users can access the data.
That's not a bad thing because the users will be able to tell when things are wrong and they will be able to see things progressing rather than just being presented with a final result.
It's a lot easier to go to meetings and say you have found problems with the data (it always happens) if the users can view and understand that data.

From a design point of view Kimball makes me cringe - from an implementation process point of view it has a lot to offer.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

coolerbob
Aged Yak Warrior

841 Posts

Posted - 2007-03-20 : 11:03:19
And Inmon doesn't make you cringe?
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2007-03-20 : 14:13:02
Yes, I agree with that statement. Kimball sees a data warehouse as a collection of disparate datamarts. Inmon sees a data warehouse as a cohesive set of data from which datamarts can be spawned.
I'd go further to say that while Bill Inmon promotes data warehouses, Ralph Kimbal promotes books and seminars.
Problem is, Kimbal sells his books and seminars by claiming that data warehouseing can be accomplished using simple star schemas. The idea is appealing to executive decision makers who are not familiar with database design because it appears easy and cheap. Unfortunately, it is not scalable to a true data warehouse without incurring massive administrative and data integrity issues. When the datamart is finally rolled out there is inevitable disappointment in its limited functionality, and the corporate data warehousing initiative fades away from disinterest and shifting priorities.
Truth is, Inmon's approach requires a higher level of database skills (which you should have in your organization anyway), but his iterative approach is not expensive to implement.

e4 d5 xd5 Nf6
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2007-03-20 : 14:16:38
By the way, statements like:
"This reviewer has observed many huge "pure Inmon" projects fail" would surely piss off Inmon. If your DW project starts out as huge, you are NOT following Inmon's method. You should start off with just a few tables and implement policies and methods that allow you to prune, enhance, and modify the design over time (such as restricting access to views and sprocs). It is these policies that determine the success of your DW project, and not the database design itself.
The reviewer you quoted does not know what he is talking about.

e4 d5 xd5 Nf6
Go to Top of Page

coolerbob
Aged Yak Warrior

841 Posts

Posted - 2007-03-20 : 17:46:55
I'm all for "higher level of database skills".

I see statements like these quite often "Kimball sees a data warehouse as a collection of disparate datamarts. Inmon sees a data warehouse as a cohesive set of data from which datamarts can be spawned."
But I don't quite know what it means it real terms to be quite honest. I'll have to read up on that a bit more to understand that statement properly.

And if what you say is accurate:
"I'd go further to say that while Bill Inmon promotes data warehouses, Ralph Kimbal promotes books and seminars."
I find that quite troubling. I've already invested some time and money down the Kimball route, and I'm going to be damn annoyed if I've been wasting my time.


Go to Top of Page

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2007-03-20 : 20:08:30
A data wharehouse is the result of integrating distinct/disparate data sources via redundancy or architecture in the hope of extracting further information.

The common "disappointment" with data wharehouse projects is that people assume that the information value of the data warehouse vastly exceeds the sum of its parts.

DavidM

Production is just another testing cycle
Go to Top of Page

coolerbob
Aged Yak Warrior

841 Posts

Posted - 2007-03-21 : 05:05:57
I can see how that can happen. Thankfully, I have not been "overpromising". I have begun this project with the basic aims of:


  • offloading reporting from the OLTP server

  • to enable reports to run faster

  • to replace "overnight data summarizing" which currently takes place in VBA and gets displayed in Excel (you'll be amazed at what they did here)

  • to enhance the way we show how the client is meeting up against it's targets/goals



That's to start with anyway. I've not spread a "dreamy idea" that somehow when it's all done the DataWarehouse will just out of the blue say "you'll never guess what I found out for you today". Managers have to have a theory. I will then enable them to look at the data in order for them to prove or disprove that theory. On each occasion, if the DataWarehouse can't do that, we will decide whether or not to start collecting and collating that data.

It's not a magic crystal ball.
Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2007-03-21 : 06:45:03
Sounds to me like you are building a reporting server rather than a datawarehouse.
For your company they could be the same thing but it would be usual to call it a reporting server.

Hope you aren't building star schemas and suchlike.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2007-03-21 : 06:49:41
The first 2 points are by-products of a datawarehouse. (availablity)

The third point is best described as controlled redundancy (pre aggregated storage).

The fourth point is the end game (analysis).

Storage + Analysis = Data Warehouse -> Availablity

DavidM

Production is just another testing cycle
Go to Top of Page

coolerbob
Aged Yak Warrior

841 Posts

Posted - 2007-03-21 : 07:56:21
NR, I'm not snowflaking if that's what you're worried about. But I do partial snowflaking when required. I am currently following the Kimball approach. If that means "star schemas and suchlike" then my answer is yes - until someone shows me a better way.

Sounds like NR & Byrmol are not in agreement here. One says I'm building a Reporting Server. The other agrees I'm building a Data Warehouse.

NR, if what I am doing is a Reporting Server, how would you define a Data Warehouse? How does a Data Warehouse's purpose differ from the things I need to accomplish as listed earlier?

I listed these things as examples. It's not an exausitive list. I'm not building the Data Warehouse around reports. I've built the Logical Model first based on business requirements. That then turned into my Data Model which I am currently turning into my Physical Model.
Go to Top of Page

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2007-03-21 : 08:14:23
>>Sounds like NR & Byrmol are not in agreement here.

I don't think so. Without multiple data sources, it's not really a data warehouse.

A reporting server is usually not built around OLAP schemas, but against replicas of the OLTP schemas. Primarily to reduce read stress against high transaction system.

DavidM

Production is just another testing cycle
Go to Top of Page

coolerbob
Aged Yak Warrior

841 Posts

Posted - 2007-03-21 : 08:19:28
I am pulling data from 6 different databases. 4 are 3rd party. But while pulling data from multiple data sources is a symptom of Data Warehouses, I don't think it's proof in itself of a Data Warehouse existing...

I will be building a Reporting Server in the future as well. But that wont get populated by a nightly ETL. I'll be using disk-level replication using a product called DoubleTake.

Also, why have thousands of companies stuck with the Kimball route for so long if it is such a terrible idea? I'm not saying it's not, I just want to understand this issue properly.
Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2007-03-21 : 10:19:47
I consider a datawarehouse something that is built to contain enterprise data. You sounded like you were building something to support a reporting requirement - that could be the whole of your enterprise which is why I said they could be the same thing for your company.

>> Also, why have thousands of companies stuck with the Kimball route for so long if it is such a terrible idea?
Why did MS take over the world when their software was so poor.
Why did VHS beat Betamax.

Just because a product is not the best doesn't mean it won't be successful.
Especially in the software dev industry where most people have little experience or aptitude. Most methodologies attempt to give a set of steps for people to follow blindly - the purpose being to get some value out of inexperienced staff. If that methodolgy also comes with a set of buzwords and (possibly incorrect) statements for people quote and sound as though they know what they are doing then so much the better.
A lot of people assume that what they read in a book must be correct and so never bother to check it (even if they are capable of it).

I can remember a system built around a select statements holding locks just because that was what a book said. They couldn't understand why they were losing transactions and wouldn't believe me even when I demonstrated the book was wrong.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2007-03-21 : 13:46:54
quote:
Originally posted by coolerbob
  • to replace "overnight data summarizing" which currently takes place in VBA and gets displayed in Excel (you'll be amazed at what they did here)

  • No. We would not. Most of us have seen it all, and worse. Your situation is so common it has a nickname: "Excel Hell".

    "Why have thousands of companies stuck with the Kimball route for so long if it is such a terrible idea?" Because it is easy to implement and misleads people into thinking they have created something of great value to the company.

    e4 d5 xd5 Nf6
    Go to Top of Page

    coolerbob
    Aged Yak Warrior

    841 Posts

    Posted - 2007-03-21 : 15:38:32
    Blindman/NR,

    What do I stand to gain from going on this Kimball course next week?
    Which of the 50 Inmon books would you recommend I start with?
    Go to Top of Page
        Next Page

    - Advertisement -