Using Included Columns in SQL Server 2005

By Kathi Kellenberger on 14 January 2008 | Tags: Indexes

One of the new features found in SQL Server 2005 is the ability to add additional columns, called Included Columns, to a non-clustered index. This article will explain the advantages of using included columns and what impact they will have on your database.

Last month I wrote an article that explained, from a very basic level, how indexes work. I briefly mentioned using Included Columns to allow an index to exceed key size limitations. I decided to learn more about included columns and pass along what I learned to you.

Key Column Size Limitations

The key columns of an index may not exceed 16 columns or a key length of 900 bytes. Using more than 16 columns in an index seems excessive to me, but maybe there are some business or performance reasons to do this based on the application. The 900 byte limitation means that the sum of the maximum possible width of each of the columns in the index must not exceed 900 bytes. For example, an int column takes up 4 bytes. The number of bytes in a char or varchar column is equal to the number of characters the column can hold. The number of bytes in an nchar or nvarchar column is twice the number of characters. Just check Books Online if you are not sure about a particular data type and add up the widths of all the columns.

When using variable length columns, such as varchar and nvarchar, SQL Server doesn't always prevent you from creating an index that exceeds 900 bytes. Run this script to create a table and an index composed of two columns that could potentially contain a total of 1000 bytes:

CREATE TABLE IndexTest
    (Column1 varchar(500) NULL,
    Column2 varchar(500) NULL,
    Column3 char(1000) NULL,
    Column4 varchar(MAX) NULL)
GO

CREATE NONCLUSTERED INDEX [Index1] ON [dbo].[IndexTest] 
(
    [Column1],[Column2] 
)
GO

The index is created, but you will receive this warning:

Warning! The maximum key length is 900 bytes. The index 'Index1' has maximum length of 1000 bytes. For some combination of large values, the insert/update operation will fail.

At this point, the index has been created. Let's see what happens when we try to insert some data:

DECLARE @value varchar(500)

SET @value = replicate('-',500)

INSERT INTO IndexTest (Column1,Column2) VALUES (@Value,@Value)

This is the error message:

Msg 1946, Level 16, State 3, Line 5 Operation failed. The index entry of length 1000 bytes for the index 'Index1' exceeds the maximum length of 900 bytes.

A row can be successfully inserted if the data does not exceed 900 bytes.

DECLARE @value varchar(500)

SET @value = replicate('-',100)

INSERT INTO IndexTest (Column1,Column2) VALUES (@Value,@Value)

Now, what happens if we attempt to create an index when any insert will exceed the limit? This example attempts to create an index on a char(1000) column.

CREATE NONCLUSTERED INDEX [Index2] ON [dbo].[IndexTest] 
(
    [Column3]
)
GO

Msg 1944, Level 16, State 1, Line 1 Index 'Index2' was not created. This index has a key length of at least 1000 bytes. The maximum permissible key length is 900 bytes.

Large data types, text and varchar(max) for example, can not be used as index key columns at all.

CREATE NONCLUSTERED INDEX [Index3] ON [dbo].[IndexTest] 
(
    [Column4]
)
GO

Msg 1919, Level 16, State 1, Line 1 Column 'Column4' in table 'dbo.IndexTest' is of a type that is invalid for use as a key column in an index.

When a non-indexed column is required in the SELECT list, the actual table row must be accessed to retrieve the column. This is called a bookmark lookup, and it can be a very expensive operation on a large table. I have often seen bookmark lookups as the bottleneck on a poor performing query. So, depending on the queries in the application, it could be beneficial to have quite a few columns in the index to avoid bookmark lookups.

Avoiding bookmark lookups and table scans are two reasons you may have for wanting extra columns as part of the index. Another thing to consider is that indexes take up substantial disk space and must be updated whenever data changes are made to the table. Indexes must be finely tuned and not just arbitrarily created to avoid a negative impact on performance.

Using Included Columns

By adding columns that can not be added to the index key as Included Columns, you can overcome the index key size limitations. Included columns are not counted in the 16 column or 900 byte limits. You can also add columns of the new large value data types, such as varchar(max), as an Included Column. The Included Columns are not stored at the intermediate levels of the index, only at the leaf level. Even though they are not stored at the intermediate levels, Included Columns still improved the performance of several queries I tested when the column was used in the WHERE clause along with the first column in the index. Bookmark lookups are avoided when Included Columns are part of the SELECT list.

A column of any data type except for text, ntext and image can be an Included Columns. Computed columns can also be used. The column can not be dropped while it is an Included Column.

When I first heard about Included Columns, I thought that they would only be useful to create an index that exceeded the size limits. After learning more, I found out that they can be added to an index even if the size limits will not be exceeded. What is the advantage, then, of using Included Columns instead of just adding the column to the index key? Theoretically, you should end up with an index that takes up less space because the Included Columns are only stored at the leaf level and not at the intermediate levels. After experimenting with one table, I found that the decrease in size was not that great.

Just like adding an additional column to the index key, adding an Included Column causes the index to be rebuilt. The following example adds a column to the Sales.SalesOrderDetail table and then drops and recreates one of the indexes three different ways. By using the "sp_spaceused" stored procedure, we can see the size differences when the column is added as an Included Column or to the index key.

USE AdventureWorks
GO
--Add a new large column to the table
ALTER TABLE Sales.SalesOrderDetail
    ADD TestIndex char(500) NULL
GO

--Update the column with some data
UPDATE Sales.SalesOrderDetail
SET TestIndex = replicate(cast(SalesOrderID as varchar) +
    cast(SalesOrderDetailID as varchar) + 
    coalesce(CarrierTrackingNumber,''),5)
GO

--Drop the index on the ProductID
IF  EXISTS (SELECT * FROM sys.indexes WHERE object_id = 
OBJECT_ID(N'[Sales].[SalesOrderDetail]') AND name = N'IX_SalesOrderDetail_ProductID')
DROP INDEX [IX_SalesOrderDetail_ProductID] 
    ON [Sales].[SalesOrderDetail] WITH ( ONLINE = OFF )
GO
exec sp_spaceused 'Sales.SalesOrderDetail'
--Recreate the index
CREATE NONCLUSTERED INDEX [IX_SalesOrderDetail_ProductID] ON [Sales].[SalesOrderDetail] 
(
    [ProductID] 
)
GO
--Before the new column is added to the index
exec sp_spaceused 'Sales.SalesOrderDetail'
GO

--Drop the index
IF  EXISTS (SELECT * FROM sys.indexes WHERE object_id = 
OBJECT_ID(N'[Sales].[SalesOrderDetail]') AND name = N'IX_SalesOrderDetail_ProductID')
DROP INDEX [IX_SalesOrderDetail_ProductID] 
    ON [Sales].[SalesOrderDetail] WITH ( ONLINE = OFF )

GO
--Add the new column to the index
CREATE NONCLUSTERED INDEX [IX_SalesOrderDetail_ProductID] ON [Sales].[SalesOrderDetail] 
(
    [ProductID],
    [TestIndex] 
)
GO

exec sp_spaceused 'Sales.SalesOrderDetail'

--Drop the index
IF  EXISTS (SELECT * FROM sys.indexes WHERE object_id = 
OBJECT_ID(N'[Sales].[SalesOrderDetail]') AND name = N'IX_SalesOrderDetail_ProductID')
DROP INDEX [IX_SalesOrderDetail_ProductID] 
    ON [Sales].[SalesOrderDetail] WITH ( ONLINE = OFF )

GO

--Add the column as an included colum
CREATE NONCLUSTERED INDEX [IX_SalesOrderDetail_ProductID] ON [Sales].[SalesOrderDetail] 
(
    [ProductID] 
)INCLUDE ( [TestIndex])


exec sp_spaceused 'Sales.SalesOrderDetail'

GO
--Reverse all changes
IF  EXISTS (SELECT * FROM sys.indexes WHERE object_id = 
OBJECT_ID(N'[Sales].[SalesOrderDetail]') AND name = N'IX_SalesOrderDetail_ProductID')
DROP INDEX [IX_SalesOrderDetail_ProductID] 
    ON [Sales].[SalesOrderDetail] WITH ( ONLINE = OFF )

GO
ALTER TABLE Sales.SalesOrderDetail DROP COLUMN TestIndex
GO
CREATE NONCLUSTERED INDEX [IX_SalesOrderDetail_ProductID] ON [Sales].[SalesOrderDetail] 
(
    [ProductID] 
)

Description	Size of Indexes
Without IX_SalesOrderDetail_ProductID	6,288 KB
With IX_SalesOrderDetail_ProductID	8,112 KB
Adding 500 char column to Index	75,640 KB
Adding 500 char column to Included Columns	71,184 KB

The index size increases by 67,528 KB when adding the new column to the index key. By adding the column as an Included Column instead, the index increases by 63,072 KB. As you can see, there is not much difference when looking at just one index. If we are talking about hundreds of tables, you might be able to save quite a bit of space. Books Online recommends that the columns used in the WHERE clause should be part of the index key. Additional columns used to "cover the query", in other words, including the columns used in the SELECT list that are missing from the index, should be added as Included Columns. Again, randomly adding columns to indexes can negatively affect performance.

Conclusion

Included Columns is a new feature of SQL Server 2005 that allows you to overcome size limits on key columns in a non-clustered index. This is helpful to avoid bookmark lookups that can be very bad on performance when querying large tables. This is another tool to you can use to fine-tune indexes to make sure that your enterprise applications perform as well as possible.