| Author |
Topic |
|
influent
Constraint Violating Yak Guru
367 Posts |
Posted - 2011-05-04 : 18:01:34
|
| I'm using the query below to see orders per hour in a reporting environment and I need to speed it up. I've noticed that if I only select HourNumber the query returns instantly, but selecting OrderCount as well slows it down. I don't get it. Aren't the joins performed whether I return a column from the joined table or not?? It only returns 72 rows of integer data, so it's not like it's taking time for Mgmt Studio to display it.SELECT ht.HourNumber, ISNULL(t1.OrderCount,0) OrderCountFROM HoursTable htLEFT HASH JOIN (SELECT DATEDIFF(HH, CONVERT(varchar(8), GETDATE(), 112), r.ProjectedDeliveryDate) HourNumber, COUNT(*) as OrderCount FROM CurrentOrderStatus r JOIN OrderProduct rp ON rp.OrderID = r.OrderID AND rp.ExpiredDate IS NULL JOIN Product pr ON pr.ProductID = rp.ProductID AND pr.ProductID = 7 JOIN Profile p ON p.UserID = r.CustomerID WHERE r.StatusID NOT IN (4,5) AND r.ProjectedDeliveryDate > CONVERT(varchar(8), GETDATE(), 112) GROUP BY DATEDIFF(HH, CONVERT(varchar(8), GETDATE(), 112), r.ProjectedDeliveryDate), pr.Name) t1 ON t1.HourNumber = ht.HourNumber |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
|
influent
Constraint Violating Yak Guru
367 Posts |
Posted - 2011-05-04 : 19:03:13
|
| It's too gigantic to show when the OrderCount is selected, but when it's not it's clearly not looking at the joined piece at all because it's only doing a table scan of HoursTable. I guess that makes sense since it's a left join. The subquery that makes up t1 is where the slowness is, I'll have to break that into pieces so I can track down which table. Thanks. I guess I don't give SQL Server enough credit sometimes. |
 |
|
|
Bustaz Kool
Master Smack Fu Yak Hacker
1834 Posts |
Posted - 2011-05-04 : 19:46:56
|
| With an outer join that does use any of the columns from the joined table, the SQL engine is smart enough to realize that it doesn't need to perform the join at all. The results are mathematically equivalent.=======================================Elitism is the slur directed at merit by mediocrity. -Sydney J. Harris, journalist (1917-1986) |
 |
|
|
influent
Constraint Violating Yak Guru
367 Posts |
Posted - 2011-05-05 : 12:53:31
|
| So I figured out that the query takes 3 seconds the first time its run and 0 seconds after that because of the cached query plan, but for whatever reason from Reporting Services it always takes 3 seconds. How can I force the query plan to stay cached? |
 |
|
|
jimf
Master Smack Fu Yak Hacker
2875 Posts |
Posted - 2011-05-05 : 12:55:53
|
| Make it a view or a stored procedure.JimEveryday I learn something that somebody else already knew |
 |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2011-05-05 : 12:58:25
|
quote: but for whatever reason from Reporting Services it always takes 3 seconds
My guess is it's rendering and network time, especially if it's that consistent. Not much you can do there re: query tuning. |
 |
|
|
influent
Constraint Violating Yak Guru
367 Posts |
Posted - 2011-05-05 : 13:59:14
|
| I tried making it a stored procedure and it still takes 3 seconds every time. I'm seeing the 3 seconds from the EXEC in Profiler using the Duration template, so it's definitely not related to rendering and network time, though that was a good guess. I also tried OPTION USE PLAN but am getting this error, possibly because I have local vars inside the procedure:Cannot execute query because of incorrectly formed XML plan in USE PLAN hint. Verify that XML plan is a legal plan suitable for plan forcing. |
 |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2011-05-05 : 14:22:18
|
I made some changes:DECLARE @today DATETIMESET @today=DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()), 0) -- avoid implicit conversionsSELECT ht.HourNumber, ISNULL(t1.OrderCount,0) OrderCountFROM HoursTable htLEFT JOIN (SELECT DATEDIFF(HH, @today, r.ProjectedDeliveryDate) HourNumber, COUNT(*) AS OrderCountFROM CurrentOrderStatus rJOIN OrderProduct rp ON rp.OrderID = r.OrderID AND rp.ExpiredDate IS NULL AND rp.ProductID = 7JOIN Product pr ON pr.ProductID = rp.ProductID JOIN PROFILE p ON p.UserID = r.CustomerID -- not referenced in query, is this table necessary?WHERE r.StatusID NOT IN (4,5)AND r.ProjectedDeliveryDate > @today AND r.ProjectedDeliveryDate <= DATEADD(HH, (SELECT MAX(HourNumber) FROM HoursTable), @today)GROUP BY DATEDIFF(HH, @today, r.ProjectedDeliveryDate), pr.Name) t1 ON t1.HourNumber = ht.HourNumber I took out the HASH JOIN, better to leave out hints until you've exhausted all other options. I changed the CONVERT() date stuff to a variable to avoid implicit conversions (I'm assuming ProjectedDeliveryDate is a datetime and is indexed) I also added a date range since you don't care about delivery dates past what the HoursTable would cover.My only other questions are the join to the Profile table (see comment) and the group by (do you need pr.Name? why? you don't reference it anywhere else) If you don't need pr.Name in the GROUP BY, you can remove the join to Product entirely.Lastly, if all the columns reference are needed, then make sure that ProjectedDeliveryDate, the ID columns being joined, and StatusID are indexed. If you're seeing table scans or clustered index scans anywhere you'll want to look at those tables carefully and add some indexes. |
 |
|
|
influent
Constraint Violating Yak Guru
367 Posts |
Posted - 2011-05-05 : 15:13:46
|
| robvolk: Above and beyond the call, thanks so much. The Profile actually is used in the WHERE clause but I removed it before posting because I knew it wasn't affecting the performance significantly and I wanted to make it easier to read for y'all. Without the HASH JOIN the query takes 30 seconds, so I need to leave that in. I removed the unnecessary grouping, it was left over from an earlier version. Your use of the @today variable seemed to cut the run time in half at first, but now I'm getting much more random execution times on successive calls, anywhere from 1 second to 5 seconds (usually 1 though, so maybe due to locking if that's possible even though I'm using READ UNCOMMITTED). |
 |
|
|
denis_the_thief
Aged Yak Warrior
596 Posts |
Posted - 2011-05-05 : 15:24:09
|
quote: Originally posted by influent ...I've noticed that if I only select HourNumber the query returns instantly, but selecting OrderCount as well slows it down. I don't get it. Aren't the joins performed whether I return a column from the joined table or not?? ...
This may explain it although I am not saying it is necessarily applicable to your situation.Hours-----ID - PKProductIDHourNumberOrderCountand let's say you have a non-clustered index (ProductID, HourNumber)let's say your query1 is:select HourNumberfrom Hourswhere ProductID = 'xxx' let's say your query2 is:select HourNumber, OrderCountfrom Hourswhere ProductID = 'xxx' Even though in Query2, all you added was an extra column, it could be much slower than Query1. With Query1, all SQL Server needs to look at is the index. With Query2, for every match in the the index, it has do to a KeyLookup to get this extra column, which could cause a slow down.It's quite possible you are experiencing a similar phenomenon, though I can't say for sure... |
 |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2011-05-05 : 15:27:20
|
| If you're using READ UNCOMMITTED then locking is not a factor (and it has no bearing on execution plan anyway). |
 |
|
|
|