Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 optimize query

Author  Topic 

joe8079
Posting Yak Master

127 Posts

Posted - 2013-03-23 : 15:52:24
at work, I have a query that returns results from the max date within each account number. Within each account number there can be 10 to 100 records and the database is over 354 million rows and growing. Part of the query I'm trying to optimize is in the where clause:

select columnA, columnB
from tablea
where columnA in (select max(date) from tablea AA where aa.account = a.account) -- this works fine, but I'd like to use the exists operator somehow instead of this type of subquery. whenever I use exists, it returns all the rows.


So its easier to understand, I used this same logic against two queries in adventure works: query 1 works fine and returns only the max value per salesID, but query 2 returns everything:





query 1 -- query returns the max linetotal per salesID, which works fine. How can I use the exists operator instead of the subquery in the where clause using IN

; with cte as (
SELECT [SalesOrderID]
,[SalesOrderDetailID]
,[CarrierTrackingNumber]
,[OrderQty]
,[ProductID]
,[SpecialOfferID]
,[UnitPrice]
,[UnitPriceDiscount]
,[LineTotal]
,[rowguid]
,[ModifiedDate]
FROM [AdventureWorks].[Sales].[SalesOrderDetail]

)

select * from cte where linetotal in (select MAX(linetotal) from [AdventureWorks].[Sales].[SalesOrderDetail] r where r.SalesOrderID = cte.SalesOrderID )




query 2 -- returns all rows and doesnt filter the max linetotal in the where clause


; with cte as (
SELECT [SalesOrderID]
,[SalesOrderDetailID]
,[CarrierTrackingNumber]
,[OrderQty]
,[ProductID]
,[SpecialOfferID]
,[UnitPrice]
,[UnitPriceDiscount]
,[LineTotal]
,[rowguid]
,[ModifiedDate]
FROM [AdventureWorks].[Sales].[SalesOrderDetail]

)

select * from cte where exists (select top 1 linetotal from [AdventureWorks].[Sales].[SalesOrderDetail] r where r.SalesOrderID = cte.SalesOrderID order by linetotal desc)

joe8079
Posting Yak Master

127 Posts

Posted - 2013-03-23 : 15:58:00
actually, I think I found something that might work:
when I do this with cross apply, it is doing an index seek in the execution plan whereas with the IN operator in the where clause, its doing an index scan:
Is cross apply more effiecent for this type of query?


; with cte as (
SELECT [SalesOrderID]
,[SalesOrderDetailID]
,[CarrierTrackingNumber]
,[OrderQty]
,[ProductID]
,[SpecialOfferID]
,[UnitPrice]
,[UnitPriceDiscount]
,[LineTotal]
,[rowguid]
,[ModifiedDate]
FROM [AdventureWorks].[Sales].[SalesOrderDetail]


)

select x.SalesOrderID, x.LineTotal, x.ModifiedDate from cte cross apply (select top 1 * from [AdventureWorks].[Sales].[SalesOrderDetail] r where r.SalesOrderID = cte.SalesOrderID order by linetotal desc) x
Go to Top of Page

James K
Master Smack Fu Yak Hacker

3873 Posts

Posted - 2013-03-23 : 17:11:22
You could also use the row_number() function like shown below. If there is a possibility that there can be two rows for the same SalesOrderID that have the largest LineTotal and if you want to get all of them, use RANK instead of ROW_NUMBER
; with cte as (
SELECT [SalesOrderID]
,[SalesOrderDetailID]
,[CarrierTrackingNumber]
,[OrderQty]
,[ProductID]
,[SpecialOfferID]
,[UnitPrice]
,[UnitPriceDiscount]
,[LineTotal]
,[rowguid]
,[ModifiedDate]
,ROW_NUMBER() OVER (PARTITION BY SalesOrderID ORDER BY linetotal DESC) AS RN
-- or this
-- ,RANK() OVER (PARTITION BY SalesOrderID ORDER BY linetotal DESC) AS RN
FROM [AdventureWorks].[Sales].[SalesOrderDetail]
)
SELECT x.SalesOrderID, x.LineTotal, x.ModifiedDate from cte x
WHERE RN = 1;
Go to Top of Page
   

- Advertisement -