Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
SportsDave
Starting Member
1 Post |
Posted - 2006-08-18 : 04:35:22
|
Hi there,I have a large amount of sales data (approx 1 billion rows across about 10 fields) that I plan to build a data warehouse from. The data is currently split across 16 tables, one for each calendar quarter. My problem is that building the fact table currently takes in the region of 20 hours. I'm sure there must be some way to speed this up considerably but i'm not sure on the best approach.I have a compound index on the 6 fields I am grouping by (the dimension keys in the data warehouse) in each of the 16 tables. I have created a view from these by unioning them all together.I am concerned that the indexes are not being used in the view, but have no idea how to verify this.I have also looked at creating an index on the view but apparently I am unable to do this on views made from unioned tables. Can anybody give me any advice on this please. Is partitioning something I need to look at?Many thanks,Dave |
|
druer
Constraint Violating Yak Guru
314 Posts |
Posted - 2006-08-18 : 13:03:23
|
To check what the execution plan is for any query, simply use Query Analyzer and tell it to show the Execution Plan. Just go to Query/Show Execution Plan to make that happen. Once you turn that on, when you run any query it will show you exactly what is being done behind the scenes. Given your command takes 20 hours, you can do 1 of 2 things:1. Instead of saying "Showing Execution Plan" which occurs after the fact, you can do Query/Show Estimated Execution Plan which will take the best guess as to what the actual execution plan would be without having to actually run the query.2. You can just take your first select (rather than UNION ALL for all 16 tables) and see what the execution plan is for it. UNION ALL won't affect your execution plan for each of the individual queries. So if all 16 tables are identical, have the same indexes, and the command is the same all plans will likely be identical unless there are HUGE differences in the data itself. Without seeing the plan: I can tell you that unless your index covers ALL of the fields for the WHERE clause and the SELECT clause, if you are selecting a large portion of the data, then likely it is ignoring the indexes and just doing a table scan. If large portions of your table are going to get pulled anyway, the system generally doesn't like reading things in for an index, and then randomly going back to your data table to read the rest of the information (Bookmark Lookup.) Since I don't know what portion of your raw tables you are pulling, or whether or not your index covers all of the fields I can't tell you. But lets just say your index has 6 fields and they are all used for the WHERE clause stuff. In your select clause you ask for those 6 fields plus 2 more. You could dramatically speed things up then by adding those 2 additional fields to your index so that it can use the index to cover all of the fields.Hope it helps,DaltonBlessings aren't so much a matter of "if they come" but "are you noticing them." |
 |
|
|
|
|
|
|