columns returned affects query speed? - SQL Server Forums

Home | Weblogs | Forums

Site Sponsored By: SQLDSC - SQL Server Desired State Configuration

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2008 Forums

Transact-SQL (2008)

columns returned affects query speed?

Author

Topic

influent
Constraint Violating Yak Guru

367 Posts

Posted - 2011-05-04 : 18:01:34

I'm using the query below to see orders per hour in a reporting environment and I need to speed it up. I've noticed that if I only select HourNumber the query returns instantly, but selecting OrderCount as well slows it down. I don't get it. Aren't the joins performed whether I return a column from the joined table or not?? It only returns 72 rows of integer data, so it's not like it's taking time for Mgmt Studio to display it.

SELECT ht.HourNumber, ISNULL(t1.OrderCount,0) OrderCount
FROM HoursTable ht
LEFT HASH JOIN (SELECT DATEDIFF(HH, CONVERT(varchar(8), GETDATE(), 112), r.ProjectedDeliveryDate) HourNumber, COUNT(*) as OrderCount
FROM CurrentOrderStatus r
JOIN OrderProduct rp ON rp.OrderID = r.OrderID AND rp.ExpiredDate IS NULL
JOIN Product pr ON pr.ProductID = rp.ProductID AND pr.ProductID = 7
JOIN Profile p ON p.UserID = r.CustomerID
WHERE r.StatusID NOT IN (4,5)
AND r.ProjectedDeliveryDate > CONVERT(varchar(8), GETDATE(), 112)
GROUP BY DATEDIFF(HH, CONVERT(varchar(8), GETDATE(), 112), r.ProjectedDeliveryDate), pr.Name) t1 ON t1.HourNumber = ht.HourNumber

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2011-05-04 : 18:02:18

Show us the execution plan of both.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog

influent
Constraint Violating Yak Guru

367 Posts

Posted - 2011-05-04 : 19:03:13

It's too gigantic to show when the OrderCount is selected, but when it's not it's clearly not looking at the joined piece at all because it's only doing a table scan of HoursTable. I guess that makes sense since it's a left join. The subquery that makes up t1 is where the slowness is, I'll have to break that into pieces so I can track down which table. Thanks. I guess I don't give SQL Server enough credit sometimes.

Bustaz Kool
Master Smack Fu Yak Hacker

1834 Posts

Posted - 2011-05-04 : 19:46:56

With an outer join that does use any of the columns from the joined table, the SQL engine is smart enough to realize that it doesn't need to perform the join at all. The results are mathematically equivalent.

=======================================
Elitism is the slur directed at merit by mediocrity. -Sydney J. Harris, journalist (1917-1986)

influent
Constraint Violating Yak Guru

367 Posts

Posted - 2011-05-05 : 12:53:31

So I figured out that the query takes 3 seconds the first time its run and 0 seconds after that because of the cached query plan, but for whatever reason from Reporting Services it always takes 3 seconds. How can I force the query plan to stay cached?

jimf
Master Smack Fu Yak Hacker

2875 Posts

Posted - 2011-05-05 : 12:55:53

Make it a view or a stored procedure.

Jim

Everyday I learn something that somebody else already knew

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-05-05 : 12:58:25

quote:
but for whatever reason from Reporting Services it always takes 3 seconds

My guess is it's rendering and network time, especially if it's that consistent. Not much you can do there re: query tuning.

influent
Constraint Violating Yak Guru

367 Posts

Posted - 2011-05-05 : 13:59:14

I tried making it a stored procedure and it still takes 3 seconds every time. I'm seeing the 3 seconds from the EXEC in Profiler using the Duration template, so it's definitely not related to rendering and network time, though that was a good guess. I also tried OPTION USE PLAN but am getting this error, possibly because I have local vars inside the procedure:
Cannot execute query because of incorrectly formed XML plan in USE PLAN hint. Verify that XML plan is a legal plan suitable for plan forcing.

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-05-05 : 14:22:18

I made some changes:

DECLARE @today DATETIME
SET @today=DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()), 0)  -- avoid implicit conversions

SELECT ht.HourNumber, ISNULL(t1.OrderCount,0) OrderCount
FROM HoursTable ht
LEFT JOIN (SELECT DATEDIFF(HH, @today, r.ProjectedDeliveryDate) HourNumber, COUNT(*) AS OrderCount
FROM CurrentOrderStatus r
JOIN OrderProduct rp ON rp.OrderID = r.OrderID AND rp.ExpiredDate IS NULL AND rp.ProductID = 7
JOIN Product pr ON pr.ProductID = rp.ProductID 
JOIN PROFILE p ON p.UserID = r.CustomerID  -- not referenced in query, is this table necessary?
WHERE r.StatusID NOT IN (4,5)
AND r.ProjectedDeliveryDate > @today 
AND r.ProjectedDeliveryDate <= DATEADD(HH, (SELECT MAX(HourNumber) FROM HoursTable), @today)
GROUP BY DATEDIFF(HH, @today, r.ProjectedDeliveryDate), pr.Name) t1 ON t1.HourNumber = ht.HourNumber

I took out the HASH JOIN, better to leave out hints until you've exhausted all other options. I changed the CONVERT() date stuff to a variable to avoid implicit conversions (I'm assuming ProjectedDeliveryDate is a datetime and is indexed) I also added a date range since you don't care about delivery dates past what the HoursTable would cover.

My only other questions are the join to the Profile table (see comment) and the group by (do you need pr.Name? why? you don't reference it anywhere else) If you don't need pr.Name in the GROUP BY, you can remove the join to Product entirely.

Lastly, if all the columns reference are needed, then make sure that ProjectedDeliveryDate, the ID columns being joined, and StatusID are indexed. If you're seeing table scans or clustered index scans anywhere you'll want to look at those tables carefully and add some indexes.

influent
Constraint Violating Yak Guru

367 Posts

Posted - 2011-05-05 : 15:13:46

robvolk: Above and beyond the call, thanks so much. The Profile actually is used in the WHERE clause but I removed it before posting because I knew it wasn't affecting the performance significantly and I wanted to make it easier to read for y'all. Without the HASH JOIN the query takes 30 seconds, so I need to leave that in. I removed the unnecessary grouping, it was left over from an earlier version. Your use of the @today variable seemed to cut the run time in half at first, but now I'm getting much more random execution times on successive calls, anywhere from 1 second to 5 seconds (usually 1 though, so maybe due to locking if that's possible even though I'm using READ UNCOMMITTED).

denis_the_thief
Aged Yak Warrior

596 Posts

Posted - 2011-05-05 : 15:24:09

quote:
Originally posted by influent

...
I've noticed that if I only select HourNumber the query returns instantly, but selecting OrderCount as well slows it down. I don't get it. Aren't the joins performed whether I return a column from the joined table or not??
...

This may explain it although I am not saying it is necessarily applicable to your situation.

Hours
-----
ID - PK
ProductID
HourNumber
OrderCount

and let's say you have a non-clustered index (ProductID, HourNumber)

let's say your query1 is:

select
  HourNumber
from
  Hours
where
  ProductID = 'xxx'

let's say your query2 is:

select
  HourNumber, OrderCount
from
  Hours
where
  ProductID = 'xxx'

Even though in Query2, all you added was an extra column, it could be much slower than Query1. With Query1, all SQL Server needs to look at is the index. With Query2, for every match in the the index, it has do to a KeyLookup to get this extra column, which could cause a slow down.

It's quite possible you are experiencing a similar phenomenon, though I can't say for sure...

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-05-05 : 15:27:20

If you're using READ UNCOMMITTED then locking is not a factor (and it has no bearing on execution plan anyway).

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources