Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 Help improve my ugly query

Author  Topic 

grantm
Starting Member

2 Posts

Posted - 2007-10-13 : 10:26:39
Hi,

I have the following query which i'm certain there's a better way of doing.
Any suggestions would be greatly appreciated.
This query has grown as more functionality was needed from the system.

What this query does, is allow me to filter my table for 2 fields.
If i specify type 0, all records are returned. If i specify type 1, all records of type 1 are returned.
The same logic is then nested inside if I specify a platform as 0 or a value.


IF(@Type = 0)
BEGIN
IF(@Platform = 0)
BEGIN
SELECT * FROM Tournament T WHERE T.IsActive = 1
END
ELSE
BEGIN
SELECT * FROM Tournament T WHERE T.IsActive = 1 AND T.Platform = @Platform
END
END
ELSE
BEGIN
IF(@Platform = 0)
BEGIN
SELECT * FROM Tournament T WHERE T.IsActive = 1 AND T.Type = @Type
END
ELSE
BEGIN
SELECT * FROM Tournament T WHERE T.IsActive = 1 AND T.Type = @Type AND T.Platform = @Platform
END
END


Idealy i'd like to build this query up in a manner like follows:


SELECT * FROM TOurnament T WHERE T.IsActive = 1
if(@Platform <> 0)
BEGIN
AND T.Platform = @Platform
END
if(@Type <> 0)
BEGIN
AND T.Type = @Type
END


Is this possible?
Or is there an alternative way i should be setting this up?

khtan
In (Som, Ni, Yak)

17689 Posts

Posted - 2007-10-13 : 10:30:02
[code]SELECT *
FROM Tournament T
WHERE (
@Platform = 0
OR T.Platform = @Platform
)
AND (
@Type = 0
OR T.Type = @Type
)[/code]


KH
[spoiler]Time is always against us[/spoiler]

Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-10-13 : 10:32:40
might be some issues relating to performance to consider, but if you have a small database you won't notice them!

Kristen
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-10-13 : 13:11:20
quote:
Originally posted by khtan

SELECT * 
FROM Tournament T
WHERE (
@Platform = 0
OR T.Platform = @Platform
)
AND (
@Type = 0
OR T.Type = @Type
)
AND T.IsActive = 1



KH
[spoiler]Time is always against us[/spoiler]






E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

evilDBA
Posting Yak Master

155 Posts

Posted - 2007-10-13 : 16:35:04
Wait, wait, wait! Dont change you query
Yes, it looks stupid for the programmers who come from C++ or C#. Like copy/paste programming. Almost the same query is repeated with some variations multiple times

So people tend to 'merge' different queries into the same one using ORs

DONT DO THAT!

There queries, even they look similar on the syntax level, might have absolutely different execution plans (for ex, the one based on a column=@value can use index while another can use a full table scan). So they must be kept differently in you stored proc as different cases!
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-10-14 : 00:54:51
Not sure that will do it as the query plan will be based on the first execution of the SProc, and whatever parameter is provided.

That route may require each SELECT to be replaced by an EXEC to an Sproc specifically for that case (which will, itself, have its own dedicated execution plan)

But this may all have got a whole lot better under SQL 2005, I haven't checked specifically.

It isn't going to make much odds though if there are only a few thousand rows in the table, and a handful of users

Kristen
Go to Top of Page

grantm
Starting Member

2 Posts

Posted - 2007-10-14 : 03:50:38
So now i'm more confused about what i should do.
Currently, there are NOT thousands of rows in my table, but one day there very well may be.
I would prefer to be prerared for that now then worry about it when it becomes a problem.

I was just about to update my query until i read evil's response.
THis is currently what my query looks like:


CREATE PROCEDURE [GetTournamentsCreatedByUser]
@User int,
@UpcomingResults bit,
@Type int,
@Platform int
AS
BEGIN
SET NOCOUNT ON;

IF(@Type = 0)
BEGIN
IF(@Platform = 0)
BEGIN
IF(@UpcomingResults = 0)
BEGIN
SELECT * FROM Tournament T
INNER JOIN TournamentUser TU on T.Id = TU.Tournament
WHERE [User] = @User AND IsOrganiser = 1
END
ELSE
BEGIN
SELECT * FROM Tournament T
INNER JOIN TournamentUser TU on T.Id = TU.Tournament
WHERE [User] = @User AND IsOrganiser = 1 AND EventDate >= GetDate()
END
END
ELSE
BEGIN
IF(@UpcomingResults = 0)
BEGIN
SELECT * FROM Tournament T
INNER JOIN TournamentUser TU on T.Id = TU.Tournament
WHERE [Platform] = @Platform AND [User] = @User AND IsOrganiser = 1
END
ELSE
BEGIN
SELECT * FROM Tournament T
INNER JOIN TournamentUser TU on T.Id = TU.Tournament
WHERE [Platform] = @Platform AND [User] = @User AND IsOrganiser = 1 AND EventDate >= GetDate()
END
END
END
ELSE
BEGIN
IF(@Platform = 0)
BEGIN
IF(@UpcomingResults = 0)
BEGIN
SELECT * FROM Tournament T
INNER JOIN TournamentUser TU on T.Id = TU.Tournament
WHERE [User] = @User AND IsOrganiser = 1 AND T.[Type] = @Type
END
ELSE
BEGIN
SELECT * FROM Tournament T
INNER JOIN TournamentUser TU on T.Id = TU.Tournament
WHERE [User] = @User AND IsOrganiser = 1 AND EventDate >= GetDate() AND T.Type = @Type
END
END
ELSE
BEGIN
IF(@UpcomingResults = 0)
BEGIN
SELECT * FROM Tournament T
INNER JOIN TournamentUser TU on T.Id = TU.Tournament
WHERE [Platform] = @Platform AND [User] = @User AND IsOrganiser = 1 AND T.[Type] = @Type
END
ELSE
BEGIN
SELECT * FROM Tournament T
INNER JOIN TournamentUser TU on T.Id = TU.Tournament
WHERE [Platform] = @Platform AND [User] = @User AND IsOrganiser = 1 AND EventDate >= GetDate() AND T.Type = @Type
END
END
END
END


SO am i correct in saying, this query will perform better than the suggested solution above of (@Type = 0 OR Type = @Type) AND (@Platform = 0 OR Platform = @Platform)?
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-10-14 : 05:23:28
"I would prefer to be prerared for that now then worry about it when it becomes a problem."

Good for you. Nightmare to solve then as the application will be falling over itself with its success, and it will be critical to fix it quickly, etc.

Plus I reckon that solving this particular one using "best practice" will be likely to set you up correctly for similar ones in the future.

There isn't a "right" answer though.

I don't like the code the way you have it because of the risk that during maintenance not all of the conditions will be modified in sync. and something will get overlooked by the programmer.

I prefer khTan's query because I can see at a glance what the various things it does are.

We use khTan's style of "if parameter provided select based on it, otherwise treat it as a wildcard" query all over the place in our application. We have tens-of-thousands of rows and hundreds-of-users.

However, we do have queries which are more carefully coded!! (i.e. giving priority to better performance, rather than ease of maintainability) and these are generally for queries which are used often, are on very large tables (hundreds-of-thousands of rows and up), or where queries perform badly.

In those cases we don't adopt any sort of best-practice; we create relevant test data and try out various scenarios until we find the one that performs the best. It might just be a case of adding an appropriate index, or we might do something more exotic.

For the "more exotic" ones I have seen a number of approaches, but the route we generally use is to put PK IDs into a Temp Table matching one search criteria (starting with the one that yields the fewest results), and then filter based on that (I'll provide an example of what I am talking about lower down).

Another way of tackling this is using Dynamic SQL. Using string concatenation you can concatenate ONLY the relevant WHERE clause phrases. Used correctly the resultant query plan will be cached. So assuming that there are some "popular" combination of criteria, they will perform well, whereas rarely used combinations will take longer [the first time] as the query plan has to be built.

There are a number of downsides to dynamic SQL:

Application must have SELECT permissions to the tables (rather than just EXECUTE permission on the SProc)
The SQL generated is harder to debug
Risk of SQL Injection if you are not careful
And a few others probably.

Rather than use your tables I'll use a Name / ZipCode / phone number example (because I have the code already and I can just Cut&Paste ).

Lets say we want a Name Search function that lets users search on any/all of Name, ZipCode and Telephone Number.

The telephone number is our "best" match - only likely to yield one, or possibly a few, matches; ZipCode is next, and finally Name (because lets assume it will use a wild-card match, which is slow, and searching for "%smith%" will give multiple matches).

So we can do:

IF @Phone IS NOT NULL
BEGIN
... Get PKs matching @Phone into #TEMP
END
ELSE
IF @ZipCode IS NOT NULL
BEGIN
... Get PKs matching @ZipCode into #TEMP
END
ELSE
BEGIN
... Get PKs for records with Name LIKE '%' + @Name + '%' into #TEMP
END

... then ...

SELECT Col1, Col2, ...
FROM #TEMP AS T
JOIN MyAddressTable AS A
ON A.MyPK = T.MyPK
WHERE
(@Phone IS NULL OR MyPhone = @Phone)
AND (@ZipCode IS NULL OR MyZipCode = @ZipCode)
AND (@Name IS NULL OR Name LIKE '%' + @Name + '%')
ORDER BY ...

So we have used the "best" criteria [amongst the ones the user has provided] to make a first-cut of the data, and then the second-cut uses the short-list from the first-cut combined with ALL the criteria. This fulfils my requirement for reducing the risk of someone goofing during maintenance!

We can do the same thing using Dynamic SQL
DECLARE @strSQL varchar(8000)

SELECT @srtSQL = 'SELECT Col1, Col2, ... '
SELECT @srtSQL = @srtSQL + 'FROM MyAddressTable '
SELECT @srtSQL = @srtSQL + 'WHERE 1=1 '
IF @Phone IS NOT NULL THEN SELECT @srtSQL = @srtSQL + 'AND MyPhone = @Phone '
IF @ZipCode IS NOT NULL THEN SELECT @srtSQL = @srtSQL + 'AND MyZipCode = @ZipCode '
IF @ZipCode IS NOT NULL THEN SELECT @srtSQL = @srtSQL + 'AND Name LIKE '%' + @Name + '%'
SELECT @srtSQL = @srtSQL + 'ORDER BY ...'

PRINT @srtSQL -- Debug only!

EXEC sp_ExecuteSQL @strSQL,
N'@Phone varchar(50), @ZipCode varchar(10), @Name varchar(50)',
@ZipCode = @ZipCode,
@Phone = @Phone,
@Name = @Name

and example of the code executed would be something like this:

sp_ExecuteSQL N'SELECT Col1, Col2, ...
FROM MyAddressTable
WHERE 1=1
AND MyZipCode = @ZipCode
ORDER BY ...
',
'@Phone varchar(50), @ZipCode varchar(10), @Name varchar(50)',
@ZipCode = '1234',
@Phone = '',
@Name = ''
[/code]
notice that in this example only the ZipCode is used in the WHERE clause, but all the @parameters are provided (they don't have to be, its easier to code that way, and it makes no difference having redundant parameters)

The Query Optimiser will cache the query for ZipCode (Note that the actual ZipCode does NOT appear in the WHERE clause, it is provided as the parameter @ZipCode instead, and therefore ALL ZipCode queries will send this EXACT same syntax, and thus match the cached query plan, which will be reused.)

If someone else searches for Name, or Phone, that won't (yet) be in the cached query plan, so a new query plan will be built , and cached. Thus popular variations of this query will all be cached, and the unpopular ones - Tough! - the user will have to wait a bit longer to get their results.

BUT ... because the WHERE clause is now so "slim" there is a much better chance that Indexes get used and so on, so even for "unpopular" queries the results are still likely to be way faster than the "wildcard SProc" discussed earlier.

So if you are still awake! what should you do?

I would not use it the way you have it. Too much risk of changes during maintenance not been synchronized amongst all the variations (and there may be more variations added in the future, and even if the programmer makes no mistakes it will be time consuming to add new combinations.)

I would generate test data to roughly the volume you expect the application to have and test the query.

The BEST thing you can do to improve performance is to have appropriate indexes for your queries.

That is possibly:

EventDate, User, Platform, IsOrganiser, Type

but given that some of these may not be provided you may need some other combinations of keys.

Note that the first key in an index needs to be highly selective (i.e. the number of instances for a specific value needs to be low - ONE = a unique index). SO its no use whatsoever putting IsOrganiser at the front of the index!

OTOH if EventDate is the first key, and it isn't used in a particular query, that index won't be used. Hence you may also need an index of

User, Platform, IsOrganiser, Type

If you are not familiar with how to check the "performance" of a query please ask.

Sorry for lengthy response, but hopefully you are at a stage where you will find it useful.

Kristen
Go to Top of Page

evilDBA
Posting Yak Master

155 Posts

Posted - 2007-10-14 : 15:20:05
Wonderful explanation, Kristen!

I will try to give you a shorter answer.
If column X is not indexed and is not planned to be indexed, the it is ok to merge different cases using (@X = 0 OR X = @X)
Otherwise, you should code all cases explicitly
If there are too many cases (which is possible as growth is exponential) then you can use dynamic SQL.
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-10-14 : 15:31:09
"Wonderful explanation"

Thanks

"Otherwise, you should code all cases explicitly"

I would only follow that route for large tables / many users / very high usage of this process.

Does the

(@param IS NULL OR MyColumn = @Param)
AND (... more like that ...)

style solution always avoid indexes? Or is the problem more that it is liable to cache the first example that it encounters [which may be a "minority flavour", and thus create a sub-optimal cached query plan for majority]?

Kristen
Go to Top of Page
   

- Advertisement -