Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Clustered indices Rule!

Author  Topic 

TimChenAllen
Starting Member

45 Posts

Posted - 2003-11-26 : 12:46:09
I'm working an OLAP project where a lot of tables have been read over from their original source to a data warehouse. These tables in general have been read over without their primary keys. No tables were created with clustered indices (a table with a primary key gets a clustered index on that PK by default if no other clustered index is created).

Analysis Services shows you the SQL statements it uses when it creates cubes. I took each SQL statement and ran it through the "Display Estimated Execution Plan" tool in SQL Query Analyzer. I saw that all of the SQL created caused table scans.

I created clustered indices on the tables involved with each cube process. Clustered indices seemed to be the answer, because the SQL for a cube process is generally like:

select (lots of columns)
from fact1, dim1, dim2, dim3
where fact1.d1 = dim1.d1 and fact2.d2 = dim2.d2 and fact3.d3 = dim3.d3

This results in selects which are low-selectivity. So non-clustered indices (which are used at 10% selectivity or lower, or 10%-15% sometimes) won't ever be used. I had to remember that a table can only have one clustered index (because a clustered index is really a re-ordering of the rows in the table-- can't do that more than once).

I used this syntax:

-- First, I tested the tables for uniqueness of keys
select d1, count(*) from dim1 group by d1 having count(*) > 1
-- This should come back with no rows

-- Then I created the index
create unique clustered index ix_dim1_d1 on dim1(d1)

The result? No table scans at all. And a drop in cube processing time, in one case, from 20 minutes to 3. Pretty good! I recently read an article that says that all tables should have a clustered index. I am now a believer!

--
TimothyAllen
[url]http://www.timallen.org[/url]

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2003-11-26 : 12:50:34
Otherwise it's a heap.

The data rows are not stored in any particular order, and there is no particular order to the sequence of the data pages. The data pages are not linked in a linked list.

Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2003-11-26 : 14:21:40
>> all tables should have a clustered index
Not true as it can slow down inserts/updates.

Closer to being true is that all tables should have a unique index to preserve integrity - but again this could be maintained by the interface if you dare.

Not having any indexes on any tables (as you started out with) is very unlikely to be a best practise and shows the problem with developing on small amounts of data. Had you had a reasonable amount the processes taking a few days would have made the problem apparent.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

TimChenAllen
Starting Member

45 Posts

Posted - 2003-11-26 : 18:10:09
quote:
Originally posted by nr

>> all tables should have a clustered index
Not true as it can slow down inserts/updates.

Closer to being true is that all tables should have a unique index to preserve integrity - but again this could be maintained by the interface if you dare.


True, though in my case all the tables get written to only once, with no index in place, then the indices are created after, so no inserts/updates happen with indices in place. Good point, though.

--
TimothyAllen
[url]http://www.timallen.org[/url]
Go to Top of Page
   

- Advertisement -