Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Quick Question on Efficiency

Author  Topic 

jess
Starting Member

17 Posts

Posted - 2006-12-14 : 10:53:23
Well the question is very long but the answer will probably be very quick...


I work with SQL Server 2000 and have taught myself what I need to know to get my work done here, so I don't always know the details such as which methods are more efficient than others.

For example, I have a stored procedure that generates a report. The user inputs several parameters and the output is basically a big list of clients that meet the requirements, and lots of other details about those clients.

I want to add another parameter to give the user the option of having the report return the results for:
1. Only kids (under 18 years old)
2. Only Adults (18 or over)
3. All clients (regardless of age) (This is what the report currently does)

The user will therefore enter "child", "adult", or "both" into this parameter.
My question is where is the most efficient place to put this in the code?

Some very simplified examples:
Say the code is 100 lines long total before this update.

Do I copy the code 3 times and put a parameter in the where clause of each to find the right group, and then place IF statements around the 3 sections? The total code would then be more like 300 lines and would be something like this:

IF [parameter] = 'child'
SELECT ..............
FROM ................
WHERE ...............AND <client qualifies as being a child>

ELSE IF [parameter] = 'adult'
SELECT ..............
FROM ................
WHERE ...............AND <client qualifies as being an adult>

ELSE IF [parameter] = 'both'
SELECT ..............
FROM ................
WHERE ...............


To check if the client qualifies as being a child or an adult, I use the datediff function to compare their date of birth (which is in one of our tables) to getdate().
Another question - if I were to use the idea above where I repeat the code section 3 times, is it better (efficiency-wise) to put the code that checks if the client qualifies as a child or an adult in the WHERE statement like the example I gave, or is it better to tack it on to one of the JOIN statements where I'm joining the table that holds their birthday?

Or instead of the example above, can I use a CASE statement in the WHERE clause instead and just keep 1 section of code so it would be closer to the original 100 lines of code and would be something like:

SELECT ..............
FROM ................
WHERE ...............
AND CASE
WHEN <parameter = child> THEN <client qualifies as being a child>
WHEN <parameter = adult> THEN <client qualifies as being an adult>
WHEN <parameter = both> THEN <no extra check needed>
END


Another option would be to keep the code the same as it used to be except have it return 1 extra value - the client's age. Then have the program on the front-end (Crystal Reports) only display results based on the age that was returned.

Are there other options on how to place this extra check into the report?

If you say the front-end solution is the most efficient, then can you please also tell me what your answer would be if that wasn't an option? It would help me tremendously to know which back-end methods are more efficient than other ones for the countless other reports and queries I work on.

Thanks!
Jess


JoeNak
Constraint Violating Yak Guru

292 Posts

Posted - 2006-12-14 : 11:26:17
I'd avoid the 3 select statement method for maintainability reasons. I'd also avoid letting the front end filter the data as that is usually less efficient (report generation time).

I'd go with a where clause. Something like...
WHERE @Param = 'Both'
OR (@Param = 'Adult' And <definition of adult>)
OR (@Param = 'Child' And <definition of child>)

Go to Top of Page

jess
Starting Member

17 Posts

Posted - 2006-12-14 : 12:43:54
AAaaahhhh yes, exactly what I was looking for!
Thanks a million JoeNak.


Does anyone have any other general efficiency pointers for me?

Things like case statements vs if statements...

Or does it matter if you place conditional statement in the JOIN clause vs the WHERE clause?

Say you're joining tables A and B on a field [id] and narrowing your results down by 3 other statements regarding data in table B. Is there a big efficiency difference between doing it like this:

SELECT......
FROM [table a] a
JOIN [table b] b
ON a.id = b.id
WHERE b.<condition1>
AND b.<condition2>
AND b.<condition3>

compared to this:

SELECT......
FROM [table a] a
JOIN [table b] b
ON a.id = b.id
AND b.<condition1>
AND b.<condition2>
AND b.<condition3>

just curious.
Thanks!
Jess
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2006-12-14 : 16:27:05
quote:

SELECT......
FROM [table a] a
JOIN [table b] b
ON a.id = b.id
WHERE b.<condition1>
AND b.<condition2>
AND b.<condition3>

compared to this:

SELECT......
FROM [table a] a
JOIN [table b] b
ON a.id = b.id
AND b.<condition1>
AND b.<condition2>
AND b.<condition3>



No, those two are equivalent. when it doubt, write them both out, one after another in the same SQL batch, in Query Analyzer, and then look at the execution plan and see how they compare.

For your other question, if Age is an indexed column, then it will be most efficient to do it like this:

create procedure getdata @type varchar(100)
as
declare @minAge int;
declare @maxAge int;

set @minAge = 0;
set @maxAge = 999;

if @type = 'child' set @maxAge = 17
if @type = 'adult' set @minAge = 18

select * from yourtable where Age between @minAge and @maxAge


That avoids any need for OR's in your WHERE clause, which do tend to decrease the efficiency of your SQL Statement, and it also keeps it very simple and only requires 1 SELECT. Sometimes you need to step back and approach the problem from a different direction.

Hope this helps ...

- Jeff
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2006-12-14 : 16:34:25
Backend is in general more efficient than frontend, and the database is usually way faster than any application layer (except when you are dealing with presentation issues like date-format and such).

I'd probably do something like this (pseudo-code):
IF Child
Startdate = 1900
Enddate = 1988
ELSE IF Adult
Startdate = 1988
Enddate = 2100
ELSE IF All
Startdate = 1900
Enddate = 2100

SELECT
...
WHERE BirthDate BETWEEN StartDate AND EndDate
Should give excellent query caching and with an index on birthdate it would probably be quite fast as well.

--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2006-12-14 : 16:35:54
Oh...jeff beat me to it! I'm just a slow typer

--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2006-12-14 : 16:39:41
quote:
Originally posted by Lumbago

Oh...jeff beat me to it! I'm just a slow typer

--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"



You've got to learn to be a more efficient typist !!!

- Jeff
Go to Top of Page

jess
Starting Member

17 Posts

Posted - 2006-12-15 : 12:47:33
Thank you for the additional suggestions. I certainly do like to see solutions coming from different angles.

Jeff -- I tried your suggestion with the minAge and maxAge and the results seemed to be the same as when using JoeNak's. (both ran the query in 1 second...)
So I tried to expand the query to be a little more complicated and in this case Joe's suggestion consistently runs at 5 seconds and the minAge & maxAge solution runs at 6 seconds (according to the clock in Query Analyzer)

Here's what I got when running a trace in SQL Profiler while executing each of the 2 methods twice:

Joe-1
cpu 5641
reads 189089
duration 4733

Jeff-1
cpu 5219
reads 190179
duration 6500

Joe-2
cpu 5734
reads 189078
duration 4796

Jeff-2
cpu 5204
reads 190148
duration 6436


Just thought it was interesting.

Thanks again for all of the great responses!
Jess
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2006-12-15 : 13:44:32
Is Age an indexed column?

- Jeff
Go to Top of Page

jess
Starting Member

17 Posts

Posted - 2006-12-15 : 15:03:20
Actually age is calculated in the stored procedure using
datediff(yy, p.dob, getdate())

I believe dob (date of birth) that is used in the calculation is an indexed field in the table.
Go to Top of Page

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2006-12-15 : 15:39:33
Again, same concept: you will be best off calculating the range of date values and then filtering on that. Filtering on a calculated value cannot make use of any indexes, since the calculation must be performed on the entire table to determine which results are accepted by the WHERE clause.

So, I recommend to try calculating the minDOB and maxDOB and filter on that DOB column directly, then see how efficiently that works.



- Jeff
Go to Top of Page

jess
Starting Member

17 Posts

Posted - 2006-12-15 : 17:13:17
OK, so I tried what you said and calculated the minDOB and maxDOB and placed that in the code. Here's the results I get from the trace from running both solutions twice each:

Joe-1
cpu: 22236
Reads: 186081
Duration: 16263

Jeff-1
cpu: 12125
Reads: 94999
Duration: 19716

Joe-2
cpu: 22064
Reads: 186015
Duration: 15593

Jeff-2
cpu: 12141
Reads: 94999
Duration: 19390



What I don't understand is why even tough it has less CPU usage and less READS, the solution from Jeff takes longer? (around 19 seconds instead of 16)

Any ideas?
Go to Top of Page
   

- Advertisement -