Trying to understand covering index

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2005 Forums

Transact-SQL (2005)

Trying to understand covering index

Author

Topic

mtl777
Yak Posting Veteran

63 Posts

Posted - 2009-05-27 : 12:44:24

Let's say I have a non-clustered index on ColA, ColB, ColC, ColD, and ColE which is used to cover the following query:

SELECT ColA, ColB, ColC, ColD, ColE FROM SomeTable
WHERE ColA = @ValA AND ColB = @ValB AND ColC = @ValC

Can this same index be used to cover the following queries so that I don't have to create additional indexes for ColB and ColD to speed up these queries?

(1) SELECT ColB, ColE FROM SomeTable WHERE ColB = @ValB

(2) SELECT ColD, ColE FROM SomeTable WHERE ColD = @ValD

And BTW, if the index on ColA, ColB, ColC, ColD, and ColE is unique and is the only index I need, can I make it a clustered index? I'm wondering if a covering index can also be created as a clustered index instead of non-clustered which is the normal practice.

Thanks!

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2009-05-27 : 13:00:58

this will do

create nonclustered index ix_sometable on sometable (cola, colb, colc)
include (cold, cole)

for best performance of a covering index (which includes all used columns in the query), use the columns used for filtering (where and JOIN) as key columns, and keep other columns included.

E 12°55'05.63"
N 56°04'39.26"

mtl777
Yak Posting Veteran

63 Posts

Posted - 2009-05-27 : 13:55:18

Can that index ix_sometable also be used to cover the query, SELECT ColB, ColE FROM SomeTable WHERE ColB = @ValB, or do I have to create another covering index for this query? I just want to avoid creating unnecessary indexes.

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2009-05-27 : 13:58:14

Yes. All these columns exists (and in the right places) of the existing index.
And that's what it all is about, to cover all column, aka covering index.

E 12°55'05.63"
N 56°04'39.26"

mtl777
Yak Posting Veteran

63 Posts

Posted - 2009-05-27 : 17:46:41

How is a compound index such as ColA, ColB, ColC structured? Does it have (a) 3 separate subindexes, one for each column, or (b) only one index based on the 3 columns combined? If (a), I can see how it could quickly search any of the columns, e.g., WHERE ColB = @ValB, since each column is sorted individually. But if (b), I can't see how it could do a quick binary search on any of the columns except the first one. For example, let's say we have the following values for ColA, ColB, and ColC in the compound sorted key:

A1 B1 C1
A1 B1 C2
A1 B2 C1
A1 B2 C2
A2 B1 C1
A2 B1 C2
A2 B2 C1
A2 B2 C2

As you can see, only the "A" values are sorted, the "B" and "C" values are not. Therefore, a binary search could only be done on the "A" values. For the "B" and "C" values, a sequential search will have to be done.

Any thoughts?

mtl777
Yak Posting Veteran

63 Posts

Posted - 2009-05-28 : 17:04:14

Bump. Did I fail to present my question clearly or is it just delving too deep into the innards of SQL Server that no one knows the answer?

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2009-05-28 : 18:00:15

Nah, not that hard. The answer to your question is (b)
http://sqlinthewild.co.za/index.php/2009/01/19/index-columns-selectivity-and-equality-predicates/

So, with the index on A,B,C any of the following queries will bet seek operations
WHERE ColA = @A and Colb = @B and ColC = @C
WHERE ColA = @A and Colb = @B
WHERE ColA = @A

Where ColB = @B will get an index scan, not a seek. That said, the index is still technically covering. (it has in it all the columns used by the query).

If you still have questions afterward reading that blog post, please ask.

--
Gail Shaw
SQL Server MVP

mtl777
Yak Posting Veteran

63 Posts

Posted - 2009-05-28 : 20:03:04

Great reply, exactly what I needed. Thank you so much! I knew it wasn't that hard.

I just have a question regarding Scenario 4 in the blog (Equality match on the Date and String Columns). In that scenario, SQL is forced to do a scan since the leading column of the index is not used in the WHERE clause. You said, "In fact, SQL decides to do a table scan and evaluate each of the rows in the table against the values specified for the date and string columns." Is that a typo? Shouldn't SQL do an index scan instead since the Date and String columns are available in the index and indexes are generally much smaller than their associated tables (therefore faster to scan)?

Oh, and one more question. Suppose for the ColA, ColB, ColC index example that the WHERE clause used specifies the columns in a different order, e.g., WHERE ColB = @ValB AND ColC = @ValC AND ColA = @ValA. Will SQL be intelligent enough to know that this is equivalent to WHERE ColA = @ValA AND ColB = @ValB AND ColC = @ValC, and therefore do a seek instead of a scan? Does the order of the columns in the WHERE clause matter?

Thanks again!

snSQL
Master Smack Fu Yak Hacker

1837 Posts

Posted - 2009-05-28 : 21:41:21

quote:
I just have a question regarding Scenario 4 in the blog (Equality match on the Date and String Columns). In that scenario, SQL is forced to do a scan since the leading column of the index is not used in the WHERE clause. You said, "In fact, SQL decides to do a table scan and evaluate each of the rows in the table against the values specified for the date and string columns." Is that a typo? Shouldn't SQL do an index scan instead since the Date and String columns are available in the index and indexes are generally much smaller than their associated tables (therefore faster to scan)?

The execution plan specifically states that SQL Server is doing a table scan and in this case the index isn't any simpler than the table because the table and the index both have three columns. If the index is a covering index and the table has more columns that are not in the index then an index scan might be used instead.

quote:
Oh, and one more question. Suppose for the ColA, ColB, ColC index example that the WHERE clause used specifies the columns in a different order, e.g., WHERE ColB = @ValB AND ColC = @ValC AND ColA = @ValA. Will SQL be intelligent enough to know that this is equivalent to WHERE ColA = @ValA AND ColB = @ValB AND ColC = @ValC, and therefore do a seek instead of a scan? Does the order of the columns in the WHERE clause matter?

The order in the WHERE clause is not important, in fact the query optimizer is very smart about it and would even get it right if one of the columns appears in the WHERE clause and another in a JOIN for example. The expressions could also be written as @ValA = ColA and it will get it right.

What is important is way you state the expressions, because some expressions can be stated in a way that prevents index use.
There's a discussion about this here
http://weblogs.sqlteam.com/dang/archive/2009/03/07/Low-Hanging-Fruit-of-Sargable-Expressions.aspx
I would also recommend Kalen Delaney's Inside SQL Server (Query Tuning and Optimization) http://www.amazon.com/Inside-Microsoft%C2%AE-SQL-Server-2005/dp/0735621969/ref=pd_bxgy_b_text_b

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2009-05-29 : 04:26:48

quote:
Originally posted by mtl777 You said, "In fact, SQL decides to do a table scan and evaluate each of the rows in the table against the values specified for the date and string columns." Is that a typo? Shouldn't SQL do an index scan instead since the Date and String columns are available in the index and indexes are generally much smaller than their associated tables (therefore faster to scan)?

If you look at the screen shot right below where I said that, you'll see that the operation is indeed Table Scan, not Index Scan

quote:
Does the order of the columns in the WHERE clause matter?

It does not.
http://sqlinthewild.co.za/index.php/2007/07/29/order-of-execution/

--
Gail Shaw
SQL Server MVP

mtl777
Yak Posting Veteran

63 Posts

Posted - 2009-05-29 : 13:02:20

Thank you so much, snSQL and Gail!

Those articles on sargable expressions and order of execution are very informative and greatly appreciated.

I have more questions. For the following index,

CREATE NONCLUSTERED INDEX Ix_Covering ON SomeTable (ColA, ColB, ColC)
INCLUDE (ColD, ColE)

if you have a query using WHERE ColE = @ValE, would this query be covered by this index even though ColE is an included column only and not part of the key? I'm guessing yes, but it will have to do an index scan, right?

Lastly, let's say that the compound index key of Ix_Covering (ColA, ColB, ColC) is unique and can be made as the primary key. Suppose further that Ix_Covering is the only index that I need. Is it alright to create Ix_Covering as a CLUSTERED index? I'm wondering if a covering index can also be created as a clustered index instead of non-clustered which is the normal practice.

Thanks again!

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2009-05-29 : 14:10:03

quote:
Originally posted by mtl777
if you have a query using WHERE ColE = @ValE, would this query be covered by this index even though ColE is an included column only and not part of the key? I'm guessing yes, but it will have to do an index scan, right?

That would be my guess too, haven't tested it

quote:
I'm wondering if a covering index can also be created as a clustered index instead of non-clustered which is the normal practice.

The clustered index has at it's leaf level the actual data pages of the table. That's what makes a clustered index different from a nonclustered. That's why there can only be one per table. Hence, a clustered index implicitly includes all of the columns in the table and you cannot explicitly add include columns to it

--
Gail Shaw
SQL Server MVP

mtl777
Yak Posting Veteran

63 Posts

Posted - 2009-05-29 : 18:44:10

quote:
The clustered index has at it's leaf level the actual data pages of the table. That's what makes a clustered index different from a nonclustered. That's why there can only be one per table. Hence, a clustered index implicitly includes all of the columns in the table and you cannot explicitly add include columns to it

Oops, I forgot about that.

I've learned a lot from this thread. I really appreciate you and everyone taking the time to answer my questions. Thanks very much!

snSQL
Master Smack Fu Yak Hacker

1837 Posts

Posted - 2009-05-30 : 00:49:42

quote:
Is it alright to create Ix_Covering as a CLUSTERED index?

Given that the clustered index orders the physical rows in the table and that you only get one you should choose it carefully. Here are some things that can help you decide which index should be the clustered index:
1. It should have a small number of columns of a small type - a single integer column is often a good idea. This is because the clustered index key is used to look up rows for the nonclustered indexes, so you don't want to bloat the nonclustered indexes with big keys.
2. If you make the clustered index a incrementing number (using an identity column) you ensure that new rows are always added at the end of the table which prevents fragmentation and minimizes locking in other parts of the table.
3. If you need to optimize a set of range queries, for example querying rows by date range, using the clustered index on the range column will optimize those queries (probably only want to do this in specific cases).

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2009-05-30 : 01:53:51

I think Itzik showed that the clustered index does not "order" the records in the table according to the CI definition, due to an offset map in the CI page.

E 12°55'05.63"
N 56°04'39.26"

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2009-05-30 : 03:43:15

The clustered index defines the logical storage order of the index leaf pages. If there's no fragmentation, the physical storage order of the pages is the same as the logical order.
Rows on the page may be stored out of order, but that's what the slot index is there for.

--
Gail Shaw
SQL Server MVP

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2009-05-30 : 03:46:01

quote:
Originally posted by snSQL

quote:
Is it alright to create Ix_Covering as a CLUSTERED index?

Given that the clustered index orders the physical rows in the table and that you only get one you should choose it carefully. Here are some things that can help you decide which index should be the clustered index:
1. It should have a small number of columns of a small type - a single integer column is often a good idea. This is because the clustered index key is used to look up rows for the nonclustered indexes, so you don't want to bloat the nonclustered indexes with big keys.
2. If you make the clustered index a incrementing number (using an identity column) you ensure that new rows are always added at the end of the table which prevents fragmentation and minimizes locking in other parts of the table.
3. If you need to optimize a set of range queries, for example querying rows by date range, using the clustered index on the range column will optimize those queries (probably only want to do this in specific cases).

To add to that...
4) It's a good idea to make the clustered index unique. If it's not, SQL will add a hidden counter to make the index unique. Not essential, but something to keep in mind
5) Clustered index keys should be on columns that don't change. If the value of the clustering key changes for a row, that row has to be moved to a different page in the index, and all nonclustered indexes changed. That can be expensive.

--
Gail Shaw
SQL Server MVP

mtl777
Yak Posting Veteran

63 Posts

Posted - 2009-06-01 : 17:01:36

All good advice! Thanks!

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources