That Index Thing - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

SQL Server Development (2000)

That Index Thing

Author

Topic

SamC
White Water Yakist

3467 Posts

Posted - 2003-10-27 : 07:09:11

IN BOL:

quote:
CLUSTERED

Creates an object where the physical order of rows is the same as the indexed order of the rows, and the bottom (leaf) level of the clustered index contains the actual data rows. A table or view is allowed one clustered index at a time.

and

quote:
NONCLUSTERED

Creates an object that specifies the logical ordering of a table. With a nonclustered index, the physical order of the rows is independent of their indexed order. The leaf level of a nonclustered index contains index rows. Each index row contains the nonclustered key value and one or more row locators that point to the row that contains the value. If the table does not have a clustered index, the row locator is the row's disk address. If the table does have a clustered index, the row locator is the clustered index key for the row.

CREATE TABLE MyList (
UserID INT NOT NULL ,
CourseID INT NOT NULL )

A simple table that is used to assign a Course to a User. It'll be used in lots of joins. INSERTS are infrequent, references frequent.

Typical use might be

SELECT COUNT(*) as TotalAssigned
FROM Users U
INNER JOIN MyList M ON M.UserID = U.UserID
WHERE U.ClientID = @ClientID
AND M.CourseID = @CourseID

This seems like a classic case of CLUSTERED INDEX being best, but how much better is a clustered INDEX over a NONCLUSTERED INDEX? Are we talking small differences?

Sam

X002548
Not Just a Number

15586 Posts

Posted - 2003-10-27 : 11:25:44

I would think (and I should try that more often) that it's a matter of having to go to the data page or not....in your case..there's no need...it'll do an index seek....

Kinda like an index only table in Oracle...

MOO

Brett

8-)

SamC
White Water Yakist

3467 Posts

Posted - 2003-10-27 : 11:32:05

Speaken zee English? You're saying it don't matter CLUSTERED or NONCLUSTERED? Either way doesn't matter?

Here's another question from the INDEX category:

CREATE INDEX MYINDEX1 ON dbo.MyTable (UserID)
CREATE INDEX MYINDEX2 ON dbo.MyTable (CourseID)

versus

CREATE INDEX MYINDEX ON dbo.MyTable (UserID, CourseID)

Would either of these create statements perform better for my application and why?

Sam

X002548
Not Just a Number

15586 Posts

Posted - 2003-10-27 : 11:41:27

I would have a composite index...

Which has a higher cardinality?

Brett

8-)

SamC
White Water Yakist

3467 Posts

Posted - 2003-10-27 : 12:45:52

UserID has a higher cardinality.

Sam

X002548
Not Just a Number

15586 Posts

Posted - 2003-10-27 : 12:53:30

Then go with your composite you already wrote...

Brett

8-)

SamC
White Water Yakist

3467 Posts

Posted - 2003-10-27 : 13:10:16

Ok, but using a composite like

create index myindex on mytable (UserID, CourseID)

Does a composite index have any drawbacks such as referencing only the CourseID in a Join eg:

SELECT CourseID, Count(*) as [Total Users Assigned To Course]

From MyList

GROUP BY CourseID

Sam

MichaelP
Jedi Yak

2489 Posts

Posted - 2003-10-27 : 13:16:21

If you only use CourseID, it should do a Index scan. Maybe you could put your Clustered Index on (UserID, CourseID) and then put a index on CourseID as needed.

Michael

<Yoda>Use the Search page you must. Find the answer you will.</Yoda>

X002548
Not Just a Number

15586 Posts

Posted - 2003-10-27 : 13:26:37

You do know that you're not limited to 1 Index....just 1 clustered index....right?

Brett

8-)

SamC
White Water Yakist

3467 Posts

Posted - 2003-10-27 : 13:33:03

Yes, but I'm still having a hard time understanding how to pick and choose. CLUSTERED, NONCLUSTERED, COMPOSITE.

So how much benefit is there in using CLUSTERED INDEX (UserID, CourseID) over COMPOSITE INDEX (UserID, CourseID)? Can the benefit be quantified in some unit that I understand like volts? </end humor>

Sam

MichaelP
Jedi Yak

2489 Posts

Posted - 2003-10-27 : 13:39:00

Sam
How many rows will be in that table? If you are only going to have a few hundred, you'll probably never see the difference. If you have thousands, I think you'll see a big speed increase on selects. I think the more data you have, the more speed differece you'll see over a non-clustered index.

Michael

<Yoda>Use the Search page you must. Find the answer you will.</Yoda>

SamC
White Water Yakist

3467 Posts

Posted - 2003-10-27 : 13:59:40

This table is starting at 40,000 rows. It may grow to 400,000 rows within a year.

I re-read your post, you didn't explicity say CLUSTERED is faster than NON CLUSTERED, but that seems to be your implication.

How about COMPOSITE versus NON-COMPOSITE (NON-CLUSTERED) indicies?

Sam

homam
Starting Member

31 Posts

Posted - 2003-10-27 : 19:08:48

MichaelP:

The suggestion to create a clustered index on (UserID, CourseID) and then another index on CourseID isn't good because you'll have redundant information. Remember that if you have a clustered index, then all the other indexes on the table use the clustering keys, NOT the row IDs, to locate a record. For example, if you have a clustered index on (UserID, CourseID), and then create another index on CourseID, then all the entries of the CourseID non-clustered index will be mapped to records as follows:

NonclusteredIndex(CourseID)->ClusteringKey(UserID, CourseID)->ClusteredIndex(UserID, CourseID)->RowID

Therefore there's no point in having the CourseID in the clustered index because the non-clustered index will map as follows:

NonClusteredIndex(CourseID)->ClusteringKey(UserID)->ClusteredIndex(UserID)->RowID.

SamC:

Clustered indexes are most efficient for range queries and sorts based on the clustering keys because the data pages are sorted based on the clustering keys. However, keep in mind that clusterd indexes are more expensive when it comes updates/inserts/deletes. Also, keep the clustered index keys small because all the other nonclustered indexes will use the clustring keys to locate rows in the tabels. When you don't have a clustered index, non-clustered indexes will have row IDs in them to map directly to the table rather than using the clustered index as an intermediary to locate rows.

Whether to use composite indexes depends on whether you want to inforce primary key or unique constraints on the table based more than one column and for performance when the criteria utilize all the the columns in the composite index.

For example, if you have a composite index on columns A and B, and your queries only restrict on B, then you won't get much performance boost. But when your queries restrict on both A and B, then you'll get the maximum boost.

Composite indexes are also very efficient for Covered Queries. For example, if all the columns mentioned in your query (select list, where clause, join criteria, etc) are covered by one or more indexes, then SQL server can use index intersection to get the result set without touching the table proper. For example, if you have this query:

select A, B
from SomeTable
where C = 5 and D between 1 and 2

If you have two composite indexes, one on (A, B, D), and the other on C, then the above query is covered and SQL server uses index intersection to get the results very quickly without touching the table.

mr_mist
Grunnio

1870 Posts

Posted - 2003-10-28 : 03:52:44

Sam, we have some fairly large tables here, ranging from a few million to hundreds of millions of rows. The difference between a nonclustered and clustered index on these is fairly marked.

For a start, the optimizer will sometimes ignore the non-clustered indexes completely, even if from my eye it might look like the non clustered is a more useful index for the job. (For example in the case described where maybe there is a composite clustered index on user, course and a "normal" index on course, then a query that looks for a specific course would still scan the clustered index than seek on the nonclustered).

I tend to put the clustered index on whatever I find most logical or that will be most frequently referenced to extract data. Tbh it is also often the primary key.

-------
Moo. :)

SamC
White Water Yakist

3467 Posts

Posted - 2003-10-28 : 07:49:23

quote:
Originally posted by homam

NonclusteredIndex(CourseID)->ClusteringKey(UserID, CourseID)->ClusteredIndex(UserID, CourseID)->RowID

Therefore there's no point in having the CourseID in the clustered index because the non-clustered index will map as follows:

NonClusteredIndex(CourseID)->ClusteringKey(UserID)->ClusteredIndex(UserID)->RowID.

The first example above makes sense and helps me understand the relationship between clustered and nonclustered indexes. I have no idea how the second example follows since CourseID could not locate a row using UserID.

quote:
Clustered indexes are most efficient for range queries and sorts based on the clustering keys because the data pages are sorted based on the clustering keys. However, keep in mind that clusterd indexes are more expensive when it comes updates/inserts/deletes.

This guideline suggests using clustered indexes on high cardinality data (datetime would be a good candidate) when it'll be used in a query, but too much in a clustered index (like including CourseID) can botch benefit of using

New question: If I drop an existing clustered table index, and replace it with a new nonclustered scheme, will the stored procedures referencing the table be recompiled automatically on next use?

Sam

AndrewMurphy
Master Smack Fu Yak Hacker

2916 Posts

Posted - 2003-10-28 : 07:59:10

Sam...is there a conflict between your 2 statements?

"INSERTS are infrequent, references frequent."
and
"This table is starting at 40,000 rows. It may grow to 400,000 rows within a year."

This growth rate...seems to suggest a fairly aggressive INSERT action which may have implications for the clustering of the data....and the performance of the database while this is happenning...(page splits, etc)

SamC
White Water Yakist

3467 Posts

Posted - 2003-10-28 : 08:06:52

No, it makes sense (to me). Courses are assigned to users in "one fell swoop", usually a one time insert when a course goes active, or when a new client is populated in the database. The remainder of the time, the table is extensively looking up what users are assigned to a course to build various course completion reports.

Sam

MakeYourDaddyProud

184 Posts

Posted - 2003-10-28 : 09:23:39

quote:

For example, if you have a composite index on columns A and B, and your queries only restrict on B, then you won't get much performance boost. But when your queries restrict on both A and B, then you'll get the maximum boost.

Bit confused...

if index(a,b) and selecting on b wouldn't you lose the complete benefit of that index?? I take it you meant

quote:

For example, if you have a composite index on columns A and B, and your queries only restrict on A, then you won't get much performance boost. But when your queries restrict on both A and B, then you'll get the maximum boost.

Good writeup, but with that bit, I thought i was going silently insane, LOL

Daniel Small MIAP
www.danielsmall.com IT Factoring

SamC
White Water Yakist

3467 Posts

Posted - 2003-10-28 : 10:00:45

Wanted to draw attention to my last question:

quote:
New question: If I drop an existing clustered table index, and replace it with a new nonclustered scheme, will the stored procedures referencing the table be recompiled automatically on next use?

Sam

Or do I have a bigger problem on my hands to get the new indexes and statistics properly functioning in the database?

Sam

X002548
Not Just a Number

15586 Posts

Posted - 2003-10-28 : 11:53:30

Gotta get a good book on this...The following takes a few to run, but I was suprised that in the last SELECT, the optimizer chooses the non clustered index...why would that be?


USE Northwind
GO

CREATE TABLE myTable99 (UserId int, CourseId int)
GO

SET NOCOUNT ON

DECLARE @x int, @y int, @z int
SELECT @x = 0, @y = 0, @z = 0
WHILE @x < 4000
	BEGIN
	WHILE @z < 5000
		BEGIN
		WHILE @y < 11
			BEGIN
				INSERT INTO myTable99 (UserId, CourseId)
				SELECT @z, @y
				SELECT @y = @y + 1
			END 				
			SELECT @y = 0, @z = @z + 1
		END
		SELECT @x = @x + 1
	END
GO

SET NOCOUNT OFF

CREATE UNIQUE CLUSTERED INDEX IX1 ON myTable99 (UserId, CourseId)
GO
CREATE INDEX IX2 ON myTable99 (CourseId, UserId)
GO

SELECT * FROM myTable99 WHERE UserId = 255
SELECT * FROM myTable99 WHERE CourseId = 5
SELECT * FROM myTable99 WHERE CourseId = 5 AND UserId = 255
SELECT * FROM myTable99 WHERE UserId = 255 AND CourseId = 5
GO

DROP TABLE myTable99
GO

Brett

8-)

SamC
White Water Yakist

3467 Posts

Posted - 2003-10-28 : 12:08:38

Ya know, ya gotta admire running a test to cut through the <explicatave>.

I really hope somone can explain these results. Pretty clearly looks like NONCLUSTERED INDEX is taking priority over CLUSTERED in the last query.

quote:
Remember that if you have a clustered index, then all the other indexes on the table use the clustering keys, NOT the row IDs, to locate a record.

That quote made a lot of sense to me, but I can't tell from the execution plan if it's true or not.

Sam

Next Page

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources