Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Tips to optimize my query?

Author  Topic 

hk13
Starting Member

12 Posts

Posted - 2015-03-31 : 16:20:59
Hello,

I've mainly been self-learning SQL whenever my job calls for queries to be created. My knowledge is still fairly basic, but I feel I'm learning a lot.

I was able to do what was needed in my most recent request, but it's running very slowly. Depending on the branch, it can take over 10 seconds to generate. The slowness only started happening after I added the more complicated columns LastSold, LastReceipt, and Expired. Before that, it would load in under a second. More specifically, it only seems slow if it's trying to do calculations from both the SA3 and PA tables. If I comment out one of those, it goes back to loading fairly quickly.

Having said all that, what recommendations do you guys have for optimizing the query below? Let me know if you require more information, and thanks in advance.



SELECT DISTINCT
B.BranchCode,
P.ProductCode,
PG1.Name [ProductGroup1],
P.Description,
PSO.ShelfLife,
S.StockAvailable,
S.StockActual,
S.StockAllocated,
CONVERT(INT,ROUND(CONVERT(NUMERIC(18,4),DATEDIFF(dd,MAX(SA3.DocumentDate),GETDATE()))/7,0)) [LastSold],
CONVERT(INT,ROUND(CONVERT(NUMERIC(18,4),DATEDIFF(dd,MAX(PA.DocumentDate),GETDATE()))/7,0)) [LastReceipt],
CASE WHEN CONVERT(INT,ROUND(CONVERT(NUMERIC(18,4),DATEDIFF(dd,MAX(PA.DocumentDate),GETDATE()))/7,0)) >= PSO.ShelfLife
THEN CAST(1 AS BIT)
ELSE CAST(0 AS BIT)
END [Expired],
P.ProductID

FROM Product P WITH(NOLOCK)

LEFT JOIN Stock S WITH(NOLOCK) ON S.ProductID = P.ProductID AND S.StockLocationID = 0
LEFT JOIN Branch B WITH(NOLOCK) ON B.BranchID = S.BranchID
LEFT JOIN ProductStockOption PSO WITH(NOLOCK) ON PSO.ProductID = P.ProductID AND PSO.BranchID = S.BranchID
LEFT JOIN ProductGroup PG WITH(NOLOCK) ON PG.ProductGroupID = P.ProductGroupID
LEFT JOIN ProductGroup PG1 WITH(NOLOCK) ON PG1.ProductGroupID = PG.Level1ID
LEFT JOIN PurchaseAnalysisDetail PA WITH(NOLOCK) ON PA.ProductID = P.ProductID AND PA.DocumentType = 'Receipt'
LEFT JOIN SalesAnalysis3 SA3 WITH(NOLOCK) ON SA3.ProductID = P.ProductID AND SA3.DocumentType = 'Invoice'

WHERE P.Deleted = 0 AND PSO.ShelfLife > 0 AND S.StockAvailable > 0 AND B.BranchID = 8-- %BranchID%

GROUP BY P.ProductID, B.BranchCode, P.ProductCode, PG1.Name, P.Description, PSO.ShelfLife, S.StockAvailable, S.StockActual, S.StockAllocated

gbritton
Master Smack Fu Yak Hacker

2780 Posts

Posted - 2015-03-31 : 20:07:25
1. Why are you using NOLOCK? You'll get dirty reads. (and it won't make your query any faster)

2. why DATEDIFF(dd,MAX(SA3.DocumentDate),GETDATE()))/7 ? Why hot DATEDIFF(wk, ...) instead? More readable

most important:

3. Are the ON and WHERE predicates indexed?
4. Are statistics up to date on all the tables
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2015-04-01 : 04:38:27
quote:
Originally posted by gbritton

1. Why are you using NOLOCK? You'll get dirty reads. (and it won't make your query any faster)



I'd go further!

@hk13 : Don't EVER use NOLOCK. If you think you need to use NOLOCK see me first

If you have locking / deadlock problems consider setting the database to READ_COMMITTED_SNAPSHOT. That's been the solution for us since it was introduced (SQL 2005 I think)

When I am trying to improve performance I do two things. First I put this around the query:

SET STATISTICS IO ON; SET STATISTICS TIME ON

... put query here ...

SET STATISTICS IO OFF; SET STATISTICS TIME OFF
GO

and I look at the SCAN COUNT (low numbers, particularly 1 ) is good. Lower LOGICAL READS are good too.

I trying adding / dropping indexes (e..g on columns used in the JOIN and WHERE) to see what improves things

Slightly more technical is to look at the query plan. For that I use:

SET SHOWPLAN_TEXT ON
GO

... put query here ...

SET SHOWPLAN_TEXT OFF
GO

the main thing I am looking for is a SCAN which uses the Primary Key (index) and then does a test on a column that is NOT in the primary key. This stuff can look pretty daunting to a Newbie though ...
Go to Top of Page

hk13
Starting Member

12 Posts

Posted - 2015-04-01 : 09:26:42
I'm only ever doing queries for the purposes of generating reports, so I don't really have much of a foundation for everything else involved in databases.

The majority of my self-learning is from looking at what the developers did with the included SQL queries and then working backwards to create something new that my supervisor is requesting. All the ones they've created always use NOLOCK unless it's joining a temporary table, so I've always included it as well just to be safe.

I think I need to look into indexes, as that is what my quick research seems to indicate as the best method for optimization.

I'll need to look into statistics, too, as that's a completely new concept for me.

Thanks for the all the pointers; I'll give them a try.
Go to Top of Page

gbritton
Master Smack Fu Yak Hacker

2780 Posts

Posted - 2015-04-01 : 09:32:45
"All the ones they've created always use NOLOCK unless it's joining a temporary table, so I've always included it as well just to be safe."

There's nothing safe about it. In fact, the opposite is true. NOLOCK exposes your query to dirty reads. As Kristen said, DON'T USE IT! (and educate the other devs.)
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2015-04-01 : 12:53:58
quote:
Originally posted by hk13

All the ones they've created always use NOLOCK


There were reasons, once upon a time, for doing that. At that time SQL did not have easy ways to preventing blocking - e.g. when someone updated a record other people trying to read that record could get blocked (I'm over simplifying, but you'll get the general idea)

Since SQL 2005 (can't remember when that was released, maybe 2004?? around then anyway) there have been better ways to do that (in particular setting the database to READ_COMMITED_SNAPSHOT). Sounds like your developers have been around since before SQL 2005, or their "knowledge" of how to write queries, "house style", or perhaps the templated-code that they use, has not changed since then.

If your company has business-critical code that includes NOLOCK there are all sorts of disasters that could happen. Financial / Stock-level / Other information could be inaccurate. If the boss makes business decisions based on reports that include NOLOCK he or she may commit the company based on data that is just plain wrong.

Here's why it is bad:

If you include NOLOCK in your query then your query will read the table regardless of what locks SQL puts on it.

Lets say that someone else is in the process of Inserting/Updating, a bit later on the logic in the application may say "Can't do that" and the transaction is rolled back. If your report runs between the user saying "Save" and the application logic saying "Abort" your report will include that data. Lets say the user enters an invoice for $1bn, instead of $1m ... and then some logic says "Stop! That's way too big" that figure would be in your report. Same thing where someone puts in the Code for a Customer that does not exist (anything like that). SQL tries to write the data, it gets included in your report, some logic down the line then say "Customer does not exist" and the transaction is rolled back. But you got that data in your report, you tried to include the non-existent customer details in your report and that caused your report to blow up.

Even if all the validation logic is discrete so that doesn't happen you are still at risk:

Someone inserts a new record. The page in the index is full ... so SQL splits the page into two. The new half-page has to be added a bit further down the list of pages.

However, your query reads the index page just before it gets split, you include all those records in your report. A bit later on your report then gets to the newly inserted, half page, of values that were just split. Your report includes all those too. They are duplicate values (except for the one new record), so the figures in your report now contain duplicate rows and are wrong. Possibly worse: you immediately press refresh and now the index page is not being split, so you get the correct figures, but what you actually see is one figure, press refresh and get a different one!

If your report ran slightly later it would get the split half-page, and include those records, but if SQL was slow inserting the newly split half-page that would not be there when your report got to that part of the index, so your report would never include the second half-page of the index. Now your report is missing those extra records.

These things only happen on occasions. They are virtually impossible to reproduce because they are so time-critical. They may cause anything from an application error (and then, mysteriously, when you press REFRESH the problem has gone away) to a report that includes wrong, duplicate or missing records.

Absolutely and without doubt do not use NOLOCK. Further more, do your company and your boss a favour and make them aware of how dangerous it is.
Go to Top of Page

hk13
Starting Member

12 Posts

Posted - 2015-04-01 : 15:45:11
I will definitely forward on what you guys mentioned about NOLOCK to my supervisor. Thanks again for the information.

As for my query, I wasn't sure how to use the statistics. Was a new window supposed to come up where I could look at ScanCount? I'm guessing this method is a bit beyond my level of skill.

Other than creating a stored procedure, I've avoided anything that involved adding/removing from the database, so I couldn't pursue the indexing route either.

I did manage to find a solution using subqueries, though. It brought down all my branch queries to around 2 seconds. The most impressive was my all branches query went from around 3-4 mins to just 3 seconds. It's my first time using subqueries, so if you have any recommendations on doing it better/neater, let me know!

Thanks again for all the help. It was a great learning experience.

SELECT DISTINCT
B.BranchCode,
P.ProductCode,
PG1.Name [ProductGroup1],
P.Description,
PSO.ShelfLife,
S.StockAvailable,
S.StockActual,
S.StockAllocated,
(SELECT DATEDIFF(wk,MAX(ST1.StockTransactionDate),GETDATE())
FROM StockTransaction ST1
WHERE ST1.ProductID = P.ProductID AND ST1.StockTransactionType = 12)
[LastSold(Weeks)],
(SELECT MAX(ST1.StockTransactionDate)
FROM StockTransaction ST1
WHERE ST1.ProductID = P.ProductID AND ST1.StockTransactionType = 12)
[LastSoldDate],
(SELECT DATEDIFF(wk,MAX(ST2.StockTransactionDate),GETDATE())
FROM StockTransaction ST2
WHERE ST2.ProductID = P.ProductID AND ST2.StockTransactionType IN (6,13))
[LastReceipt(Weeks)],
(SELECT MAX(ST2.StockTransactionDate)
FROM StockTransaction ST2
WHERE ST2.ProductID = P.ProductID AND ST2.StockTransactionType IN (6,13))
[LastReceiptDate],
(SELECT CASE WHEN DATEDIFF(wk,MAX(ST2.StockTransactionDate),GETDATE()) >= (PSO.ShelfLife-4)
THEN CAST(1 AS BIT)
WHEN MAX(ST2.StockTransactionDate) IS NULL
THEN CAST(1 AS BIT)
ELSE CAST(0 AS BIT)
END
FROM StockTransaction ST2
WHERE ST2.ProductID = P.ProductID AND ST2.StockTransactionType IN (6,13))
[Expired],
P.ProductID
Go to Top of Page

gbritton
Master Smack Fu Yak Hacker

2780 Posts

Posted - 2015-04-01 : 16:10:44
You can make it shorter. Here's what I did with the part you posted:


SELECT DISTINCT B.BranchCode
, P.ProductCode
, PG1.NAME [ProductGroup1]
, P.Description
, PSO.ShelfLife
, S.StockAvailable
, S.StockActual
, S.StockAllocated
, [LastSold(Weeks)]
, [LastSoldDate]
, [LastReceipt(Weeks)]
, [LastReceiptDate]
, [Expired]
, P.ProductId

FROM Products p
Join Stock S
CROSS APPLY
(
SELECT DATEDIFF(wk, MAX(ST1.StockTransactionDate), GETDATE())
, MAX(ST1.StockTransactionDate)
FROM StockTransaction ST1
WHERE ST1.ProductID = P.ProductID
AND ST1.StockTransactionType = 12
) _1([LastSold(Weeks)], [LastSoldDate])

CROSS APPLY
(
SELECT DATEDIFF(wk, MAX(ST2.StockTransactionDate), GETDATE())
, MAX(ST2.StockTransactionDate)
, CASE
WHEN DATEDIFF(wk, MAX(ST2.StockTransactionDate), GETDATE()) >= (PSO.ShelfLife - 4)
THEN CAST(1 AS BIT)
WHEN MAX(ST2.StockTransactionDate) IS NULL
THEN CAST(1 AS BIT)
ELSE CAST(0 AS BIT)
END
FROM StockTransaction ST2
WHERE ST2.ProductID = P.ProductID
AND ST2.StockTransactionType IN (
6
, 13
)
) _2([LastReceipt(Weeks)],[LastReceiptDate],[Expired])


Go to Top of Page
   

- Advertisement -