Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Which index approach is best?

Author  Topic 

R
Constraint Violating Yak Guru

328 Posts

Posted - 2010-09-09 : 11:38:35
I have a table with the following columns:

[ID] int Identity PK,
[uID] int FK,
[sID] int FK,
[gID] int FK,
[info1] nvarchar(max),
[info2] nvarchar(max),
[info3] nvarchar(150),
[info4] nvarchar(50),
[date1] datetime,
[date2] datetime

This table currently has about 16,000 rows, and in time will surpass ten million rows (at least). I need to perform lots of SELECT queries using the foreign key (integer) columns. I have created an index that contains all three columns, but wondered is this is the best approach? Basically I want to retrieve rows in the fastest possible time.

Should I have an index that contains all three columns, or three indexes each containing one column? I'm new to indexing but want to adopt best-practice from the start...!

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2010-09-09 : 11:48:46
It depends.
Can you post some examples of the queries that will run against this table?
Is the primary key the clustered index?

This series (3 parts) may be worth a read: http://www.sqlservercentral.com/articles/Indexing/68439/

--
Gail Shaw
SQL Server MVP
Go to Top of Page

R
Constraint Violating Yak Guru

328 Posts

Posted - 2010-09-10 : 03:31:59
Hi Gail

Yes, the ID column is the clustered index.

I believe all the queries executed on this table contain the WHERE clause...


WHERE
(uID = @uID) AND
(sID = @sID) AND
(gID = @gID)


Some stored procedures search for additional columns as well, e.g.


WHERE
(uID = @uID) AND
(sID = @sID) AND
(gID = @gID) AND
(info1 LIKE '%'+@info1+'%' OR @info1 IS NULL) AND
(date1 < @date1 OR @date1 IS NULL)
Go to Top of Page

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2010-09-10 : 07:35:37
Then at minimum you need an index with the 3 foreign key columns as the first three key columns. If there's another column that's frequently included in the where clause then put that as the 4th column/

--
Gail Shaw
SQL Server MVP
Go to Top of Page

R
Constraint Violating Yak Guru

328 Posts

Posted - 2010-09-10 : 08:51:39
Okay thanks Gail.

Please tell me, does it make a difference whether there's one index containing all three columns, or having three indexes each containing one column?
Go to Top of Page

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2010-09-10 : 10:46:14
A lot. Typically (not always but usually) SQL uses one index per table in a query. So an index with 3 columns it can use for a 3-column seek and get straight to the required records. Three single column indexes means SQL typically uses one of them, fetches rows that match one of the 3 conditions, then goes and looks up the rest of the columns and does a secondary filter to get rid of rows that don't match the other two conditions.

Did you read those articles I referenced? If not, please do, and these two as well
http://sqlinthewild.co.za/index.php/2009/01/19/index-columns-selectivity-and-equality-predicates/
http://sqlinthewild.co.za/index.php/2009/02/06/index-columns-selectivity-and-inequality-predicates/

--
Gail Shaw
SQL Server MVP
Go to Top of Page

R
Constraint Violating Yak Guru

328 Posts

Posted - 2010-09-10 : 11:56:18
Okay great. Leave those links with me...
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2010-09-10 : 12:57:00
So..are you saying it dosn't use INDEX INTERSECTION anymore?



Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx

Add yourself!
http://www.frappr.com/sqlteam



Go to Top of Page

namman
Constraint Violating Yak Guru

285 Posts

Posted - 2010-09-10 : 14:19:12

Yes, the ID column is the clustered index.

I believe all the queries executed on this table contain the WHERE clause...

WHERE
(uID = @uID) AND
(sID = @sID) AND
(gID = @gID)
Some stored procedures search for additional columns as well, e.g.

WHERE
(uID = @uID) AND
(sID = @sID) AND
(gID = @gID) AND
(info1 LIKE '%'+@info1+'%' OR @info1 IS NULL) AND
(date1 < @date1 OR @date1 IS NULL)



Then at minimum you need an index with the 3 foreign key columns as the first three key columns. If there's another column that's frequently included in the where clause then put that as the 4th column


For queries with condition as the red part, the index will be useless, doen't matter columns in the index. Index is not that simple, I think .
Go to Top of Page

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2010-09-10 : 16:04:14
quote:
Originally posted by X002548

So..are you saying it dosn't use INDEX INTERSECTION anymore?



It can. Doesn't often but can. That's why I said 'typically' and 'usually'.

--
Gail Shaw
SQL Server MVP
Go to Top of Page

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2010-09-10 : 16:05:16
quote:
Originally posted by namman

For queries with condition as the red part, the index will be useless, doen't matter columns in the index. Index is not that simple, I think .


Um, no. The index won't be optimal, but it certainly won't be useless. Why do you feel that any index will be useless for that where clause?

--
Gail Shaw
SQL Server MVP
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2010-09-10 : 20:43:23
quote:
Originally posted by GilaMonster

quote:
Originally posted by X002548

So..are you saying it dosn't use INDEX INTERSECTION anymore?



It can. Doesn't often but can. That's why I said 'typically' and 'usually'.

--
Gail Shaw
SQL Server MVP




Ummm...really...not for nothing..but if that's true..I am a heart broken crushed 15 YO girl in a dress looking for somebody asking me to dance

and if it is true....welcome to DB2

So Merry-go-round is A real MYTH



Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx

Add yourself!
http://www.frappr.com/sqlteam



Go to Top of Page

R
Constraint Violating Yak Guru

328 Posts

Posted - 2010-09-24 : 08:53:08
Thank you very much for posting those links. Although it might take a while for it all to sink through my thick skull, the articles were very insightful. I have a few questions about them if I may...?

quote:

Typically (not always but usually) SQL uses one index per table in a query.



What's the reason for this please? This obviously has a huge impact on the effectiveness of my indexes!


quote:

A lookup occurs when a nonclustered index was used to locate rows for a query, but the nonclustered index did not contain all of the columns required for the query.



Would an example of this be if an INNER JOIN was used, but the joined table column was not part of the index?


In another post on SQLTeam, I read that I shouldn't use query code that contains...

WHERE someCol = @val OR @val IS NULL

...even if someCol is contained within an index because SQL reverts to performing a scan. Is this correct? If so (and I apologise for my ignorance here), why hasn't SQL been developed to recognise if @val is null, and then execute the query more efficiently using a seek?


quote:

...there’s a single nonclustered index on all three columns, in the order ID, SomeDate, SomeString. If there’s an inequality predicate, then then the index is only fully seekable for the following queries

… WHERE ID = @ID AND SomeDate = @dt AND SomeString > @str
… WHERE ID = @ID AND SomeDate > @dt
… WHERE ID > @ID




If the first example above was written as

WHERE ID = @ID AND SomeString > @str AND SomeDate = @dt

(with second and third predicates swapped around) would SQL be smart enough to execute the query using a seek only, or is column order essential in WHERE predicates?
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-09-24 : 09:42:18
"In another post on SQLTeam, I read that I shouldn't use query code that contains...

WHERE someCol = @val OR @val IS NULL

...even if someCol is contained within an index because SQL reverts to performing a scan. Is this correct? If so (and I apologise for my ignorance here), why hasn't SQL been developed to recognise if @val is null, and then execute the query more efficiently using a seek?
"

I don't see why that is bad (needs parenthesis around it though, assuming it is followed by an AND (Col2 = @Param2 OR @Param2 IS NULL). I have never bothered too much about the consequences, although for huge queries we do work around it.

What I mean is you haven't really got a lot of choice if you have "optional" parameters from the user, so not much help someone telling you "Don't do it like that"

You could use separate queries / sprocs for each combination of parameters provided - but that is not usually realistic, although it IS a good approach if you can streamline the query by cherry-picking from the parameters you are provided with. e.g. if a Parameter is provided for a column that is unique, or has very high selectivity, much better to shoot for that first rather than some other parameter that uses MyColumn LIKE '%' + @Param2 + '%'.

We do things like:

IF @Param1 IS NOT NULL
INSERT INTO #TEMP (T_MyPK) SELECT MyPK FROM MyTable WHERE Col1 = @Param1
ELSE
IF @Param2 IS NOT NULL
INSERT INTO #TEMP (T_MyPK) SELECT MyPK FROM MyTable WHERE Col2 LIKE '%' + @Param2 + '%'
...
SELECT Col1, Col2, Col3, ...
FROM #TEMP
JOIN MyTable
ON T_MyPK = MyPK
WHERE (@Param1 IS NULL OR Col1 = @Param1)
AND (@Param2 IS NULL OR Col2 LIKE '%' + @Param2 + '%')

Note that the @Parm1 test in the final query is redundant, so only included here by way of demonstration.

The only other approach is dynamic SQL - which is probably a better bet because Query plans will be created for each variation of Parameters that are actually provided in real life, and popular ones will remain in the cache.

With the advent of Certificates (available in SQL2008, maybe SQl2005 too I've forgotten) users can still just have Execute permission on the Sproc, even though the Sproc is doing dynamic SQL.

DEFINE @strSQL nvarchar(4000)

SELECT @strSQL = 'SELECT Col1, Col2, Col3, ...
FROM MyTable
WHERE 1=1'

IF @Param1 IS NOT NULL
SELECT @strSQL = @strSQL + ' AND Col1 = @Param1'

IF @Param2 IS NOT NULL
SELECT @strSQL = @strSQL + ' AND Col2 LIKE ''%'' + @Param2 + ''%'''

EXEC sp_ExecuteSQL @strSQL,
N'@Param1 int, @Param2 varchar(100)',
@Param1 = @Param1,
@Param2 = @Param2

NOTE: @Param1 & @Param2 are provided to sp_ExecuteSQL regardless of whether they are included in the query, or not, sp_ExecuteSQL doesn't care (although you can leave them out if you can easily accommodate that in your logic / can be bothered!

sp_ExecuteSQL will create separate Query plans where @Param1, @Param2 and @Param1 + @Param2 are used - i.e. potentially 3 different query plans.

Note that @strSQL only includes named @ Parameters - it does NOT concatenate the contents of the actual @Parameter - which would introduce the risk of SQL Injection.
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-09-24 : 09:43:43
"(with second and third predicates swapped around) would SQL be smart enough to execute the query using a seek only, or is column order essential in WHERE predicates?"

Yes the Query optimiser is smart enough. You don't need to worry about the order of your tests, or the order of criteria in a JOIN, etc.
Go to Top of Page

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2010-09-24 : 12:10:22
quote:
Originally posted by R
quote:

Typically (not always but usually) SQL uses one index per table in a query.



What's the reason for this please? This obviously has a huge impact on the effectiveness of my indexes!


Another link: (inspired by this thread)
http://sqlinthewild.co.za/index.php/2010/09/14/one-wide-index-or-multiple-narrow-indexes/

It's most efficient to use one index, far less so to use multiple indexes and try to later combine them.

I didn't do a whole lot of tests for that blog post (short on time), but I did notice that it didn't take much for SQL to stop intersecting three indexes. A little wider row and it dropped to two indexes and a key lookup. Doubt it would have taken much more for it to go for a single index and key lookup or just a table scan

quote:

In another post on SQLTeam, I read that I shouldn't use query code that contains...

WHERE someCol = @val OR @val IS NULL

...even if someCol is contained within an index because SQL reverts to performing a scan. Is this correct?


Pretty much (and that would likely have been me saying so). http://sqlinthewild.co.za/index.php/2009/03/19/catch-all-queries/

quote:
If so (and I apologise for my ignorance here), why hasn't SQL been developed to recognise if @val is null, and then execute the query more efficiently using a seek?


Just because the parameter is null on this execution doesn't mean it'll be null on the next one.

SQL caches its plans for reuse. Optimisation is expensive and it's something that you want to do as little as possible. SQL has one rule fundamental for plans - they MUST be safe for reuse. That is, if the plan is reused with a different set of parameters it must return the correct results.

You can see, in the case of the @var IS NULL or col = @var, SQL cannot do a seek on col = @var if @var is not null on the first execution because, if it does, the results will be wrong on a subsequent execution where @var IS NULL.

quote:

If the first example above was written as

WHERE ID = @ID AND SomeString > @str AND SomeDate = @dt

(with second and third predicates swapped around) would SQL be smart enough to execute the query using a seek only, or is column order essential in WHERE predicates?



http://sqlinthewild.co.za/index.php/2007/07/29/order-of-execution/ (old post and not up to my current standards, but it'll do)


--
Gail Shaw
SQL Server MVP
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2010-09-24 : 13:03:58
Maybe I missed it...BUT..

if you ever do...

Col LIKE '%'+@x+'%'

you will ALWAYS incur a scan, hence the index will be meaningless...unless of course it happens to do an index scan



Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx


Want to help yourself?

http://msdn.microsoft.com/en-us/library/ms130214.aspx





Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-09-24 : 14:34:14
Yeah, but there might be a test on another, indexed, column that narrows the effort.
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2010-09-24 : 14:37:29
quote:
Originally posted by Kristen

Yeah, but there might be a test on another, indexed, column that narrows the effort.



...ummmm...I don't think so..got an example that I can test that will show me that?

Because that would be a miracle



Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx


Want to help yourself?

http://msdn.microsoft.com/en-us/library/ms130214.aspx





Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-09-25 : 03:28:35
Interesting. OK I'll try knocking up a test example.
Go to Top of Page
   

- Advertisement -