sort is slow - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

SQL Server Development (2000)

sort is slow

Author

Topic

jannable
Starting Member

25 Posts

Posted - 2006-12-27 : 16:49:48

I'm analyzing the execution plan of a query that is taking 4.5s to execute. It's doing a couple basic MAX and SUM operations alongside a few INNER JOINs. There are only 1239 records being returned. Anyway, the execution plan reveals two sorts that happen at the very end of the statement (a stream aggregate is in between each of them). The two sorts are each 32% of the entire cost of the query. So the sorts alone account for 64% of the total cost. I put an index on ProjectExpenseHistory.StepDateTime but it didn't make a difference.

Any ideas as to what I should be checking out?


SELECT Account.ID AS AccountID,
	Event.Number AS EventNumber,
	Client.Name AS ClientName,
	MAX(ProjectExpenseHistory.StepDateTime) AS StepDateTime,
	SUM(CASE WHEN LastVersion.Size = 'S' THEN 1 ELSE 0 END) AS SmallProjects,
	SUM(CASE WHEN LastVersion.Size = 'L' THEN 1 ELSE 0 END) AS LargeProjects,
	SUM(CASE WHEN ProjectExpense.Type = 1 THEN 1 ELSE 0 END) AS ApplicantRequestCount
FROM ProjectExpense
INNER JOIN ProjectExpenseHistory
	ON ProjectExpense.ID = ProjectExpenseHistory.ProjectExpenseID
INNER JOIN Project
	ON Project.ID = ProjectExpense.ProjectID
INNER JOIN ProjectVersion AS LastVersion
	ON LastVersion.ProjectID = Project.ID
	AND LastVersion.ID = (SELECT MAX(ID) FROM ProjectVersion WHERE ProjectID = Project.ID)
INNER JOIN Account
	ON Account.ID = Project.AccountID
INNER JOIN Client
	ON Client.ID = Account.ClientID
INNER JOIN Event
	ON Event.ID = Account.EventID	
	AND ProjectExpense.StepSequence = 5
	AND (1 = 0
		OR LastVersion.Size = 'L'
		OR LastVersion.Size = 'S')
GROUP BY Account.ID,
	Event.Number,
	Client.Name
ORDER BY StepDateTime DESC

snSQL
Master Smack Fu Yak Hacker

1837 Posts

Posted - 2006-12-27 : 17:16:11

An index by StepDateTime isn't going to help because your ORDER BY comes after a GROUP BY based on completely different columns. Do you have an indexes on your join columns?

What happens if you use a WHERE clause for the filter instead of putting the filter in one of the joins like you're doing now?

SELECT Account.ID AS AccountID,
	Event.Number AS EventNumber,
	Client.Name AS ClientName,
	MAX(ProjectExpenseHistory.StepDateTime) AS StepDateTime,
	SUM(CASE WHEN LastVersion.Size = 'S' THEN 1 ELSE 0 END) AS SmallProjects,
	SUM(CASE WHEN LastVersion.Size = 'L' THEN 1 ELSE 0 END) AS LargeProjects,
	SUM(CASE WHEN ProjectExpense.Type = 1 THEN 1 ELSE 0 END) AS ApplicantRequestCount
FROM ProjectExpense
INNER JOIN ProjectExpenseHistory
	ON ProjectExpense.ID = ProjectExpenseHistory.ProjectExpenseID
INNER JOIN Project
	ON Project.ID = ProjectExpense.ProjectID
INNER JOIN ProjectVersion AS LastVersion
	ON LastVersion.ProjectID = Project.ID
INNER JOIN Account
	ON Account.ID = Project.AccountID
INNER JOIN Client
	ON Client.ID = Account.ClientID
INNER JOIN Event
	ON Event.ID = Account.EventID	
WHERE ProjectExpense.StepSequence = 5
	AND (LastVersion.Size = 'L' OR LastVersion.Size = 'S')
	AND LastVersion.ID = (SELECT MAX(ID) 
		FROM ProjectVersion 
		WHERE ProjectID = Project.ID)
GROUP BY Account.ID,
	Event.Number,
	Client.Name
ORDER BY StepDateTime DESC

jannable
Starting Member

25 Posts

Posted - 2006-12-27 : 17:25:23

I have indexes on all of the join columns. I'm not sure how it happened by my WHERE statement didn't get copied into this forum properly. (Good eye!) The WHERE clause from the previous post is what I meant to paste in here.

snSQL
Master Smack Fu Yak Hacker

1837 Posts

Posted - 2006-12-27 : 17:34:39

You said there only 1239 records returned, but how much data is there? There may be millions of rows being queried?

jannable
Starting Member

25 Posts

Posted - 2006-12-27 : 17:42:01

The record counts still aren't very high but it's definately worth mentioning.

ProjectExpense: 16185
ProjectExpenseHistory: 67693
ProjectVersion: 21016
Project: 18947
Account: 3263
Client: 2629
Event: 5

Kristen
Test

22859 Posts

Posted - 2006-12-28 : 07:25:35

Wild guess : Does this perform any better?


SELECT *
FROM
(
SELECT Account.ID AS AccountID,
	Event.Number AS EventNumber,
	Client.Name AS ClientName,
	MAX(ProjectExpenseHistory.StepDateTime) AS StepDateTime,
	SUM(CASE WHEN LastVersion.Size = 'S' THEN 1 ELSE 0 END) AS SmallProjects,
	SUM(CASE WHEN LastVersion.Size = 'L' THEN 1 ELSE 0 END) AS LargeProjects,
	SUM(CASE WHEN ProjectExpense.Type = 1 THEN 1 ELSE 0 END) AS ApplicantRequestCount
FROM ProjectExpense
INNER JOIN ProjectExpenseHistory
	ON ProjectExpense.ID = ProjectExpenseHistory.ProjectExpenseID
INNER JOIN Project
	ON Project.ID = ProjectExpense.ProjectID
INNER JOIN ProjectVersion AS LastVersion
	ON LastVersion.ProjectID = Project.ID
	AND LastVersion.ID = (SELECT MAX(ID) FROM ProjectVersion WHERE ProjectID = Project.ID)
INNER JOIN Account
	ON Account.ID = Project.AccountID
INNER JOIN Client
	ON Client.ID = Account.ClientID
INNER JOIN Event
	ON Event.ID = Account.EventID	
	AND ProjectExpense.StepSequence = 5
	AND (1 = 0
		OR LastVersion.Size = 'L'
		OR LastVersion.Size = 'S')
GROUP BY Account.ID,
	Event.Number,
	Client.Name
) AS X
ORDER BY StepDateTime DESC

If not what is the execution time with / without the ORDER BY (using your original query)? Seems very odd to me that sorting 1,200 rows should take over a second.

Kristen

jannable
Starting Member

25 Posts

Posted - 2006-12-28 : 21:27:18

Negative. It showed two sorts each at 32% of the overall cost of the query. Here's a link to the execution plan if it helps. [url]http://jannable.com/execplan.zip[/url]

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2006-12-29 : 06:13:09

The query plan shows index scans instead of index seeks for every join and this is what's causing the performance hit. It seems that it's using an index where the join column is not the first column. You should try to reorder the columns in your indexes (or create new ones) and also post the query plan for the correct query...the plan you posted is for Kristens query while you should post the one from snSQL.

--
Lumbago
"Real programmers don't document, if it was hard to write it should be hard to understand"

Kristen
Test

22859 Posts

Posted - 2006-12-29 : 06:22:16

FWIW I think that the graphical display and "percentage" based query plan data is not much help.

I look at the text based plans to see what types of scan/seek are being used, and whether indexes are being used, and then I look at the number of logical I/Os from the Query Statistics.

This is what I use:


-- Clear cache (for level playing field 
--    - only if wanting to check PHYSICAL performance rather than LOGICAL performance)
-- Do **NOT** run these on a Live server
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE

-- Comment in the SHOWPLAN to see the Query Plan, OR the STATISTICS to see the logical stats
-- SET SHOWPLAN_TEXT ON
GO
-- SET STATISTICS IO ON; SET STATISTICS TIME ON

-- ... put query here - e.g.:

SELECT * FROM Northwind.dbo.Products

SET STATISTICS IO OFF; SET STATISTICS TIME OFF
GO
SET SET SHOWPLAN_TEXT OFF
GO

Kristen

jannable
Starting Member

25 Posts

Posted - 2006-12-29 : 10:21:21

Ok. So here's what I tried. All files can be found here again: [url]http://jannable.com/execplan.zip[/url]

I ran the execution plan for the query from snSQL. I noticed that it is an index scan for the ProjectExpenseHistory table. I saved that plan under the name "snSQL-BEFORE". I then added an index for ProjectExpenseID in the ProjectExpenseHistory table and re-ran the query. The execution plan is saved in a file called "snSQL-AFTER". Looking at the plan, I didn't see a difference. So I included the files "table-ProjectExpense", "table-ProjectExpenseHistory", "indexes-ProjectExpense", and "indexes-ProjectExpenseHistory" to provide more info. I must not be creating the indexes properly or something.

As for Kristen's suggestion, I was quite intimidated trying to understand the results. I cleared the buffer then ran with SHOWPLAN on. I saved it in a file called "Kristen-showplan". I cleared the buffer again and then ran the query with STATISTICS IO and STATISTICS TIME on. I saved those results in a file called "Kristen-stats". I hope they make sense to somebody.

Kristen
Test

22859 Posts

Posted - 2006-12-30 : 03:03:40

"I was quite intimidated trying to understand the results"

Yeah, I know what you mean!

The STATISTICS one is easy - you want to get the logical scans as low as possible. If the figure is high then there is big room for improvement, usually by adding an index.

For the SHOWPLAN you are basically interested in whether you are getting SCAN or SEEK. SCAN implies that indexes are not being used (even if it says INDEX SCAN - in which case you will probably see that the Clustered/PK index is being used, regardless of the fact that there may be a seemingly more appropriate index available)

Because the output is Text it has the benefit that you can cut & paste it here - speaking for myself I'm sorry but I don't have time to download a ZIP file and examine it etc.

Kristen

jannable
Starting Member

25 Posts

Posted - 2006-12-30 : 23:36:17

Thanks for the info! I've included the text info below for simplicity:


  |--Sort(ORDER BY:([Expr1025] DESC))
       |--Stream Aggregate(GROUP BY:([LA].[dbo].[Account].[ID], [LA].[dbo].[Event].[Number], [LA].[dbo].[Client].[Name]) DEFINE:([Expr1025]=MAX([LA].[dbo].[ProjectExpenseHistory].[StepDateTime]), [Expr1026]=SUM([Expr1029]), [Expr1027]=SUM([Expr1030]), [Expr1028]=SUM([Expr1031])))
            |--Sort(ORDER BY:([LA].[dbo].[Project].[AccountID] ASC, [LA].[dbo].[Event].[Number] ASC, [LA].[dbo].[Client].[Name] ASC))
                 |--Hash Match(Inner Join, HASH:([LA].[dbo].[Event].[ID])=([LA].[dbo].[Account].[EventID]), RESIDUAL:([LA].[dbo].[Event].[ID]=[LA].[dbo].[Account].[EventID]))
                      |--Clustered Index Scan(OBJECT:([LA].[dbo].[Event].[PK_Disaster]))
                      |--Hash Match(Inner Join, HASH:([LA].[dbo].[Client].[ID])=([LA].[dbo].[Account].[ClientID]), RESIDUAL:([LA].[dbo].[Account].[ClientID]=[LA].[dbo].[Client].[ID]))
                           |--Clustered Index Scan(OBJECT:([LA].[dbo].[Client].[PK_Applicant]))
                           |--Hash Match(Inner Join, HASH:([LA].[dbo].[Account].[ID])=([LA].[dbo].[Project].[AccountID]), RESIDUAL:([LA].[dbo].[Project].[AccountID]=[LA].[dbo].[Account].[ID]))
                                |--Clustered Index Scan(OBJECT:([LA].[dbo].[Account].[PK_Account]))
                                |--Hash Match(Inner Join, HASH:([LA].[dbo].[ProjectExpense].[ID])=([LA].[dbo].[ProjectExpenseHistory].[ProjectExpenseID]), RESIDUAL:([LA].[dbo].[ProjectExpense].[ID]=[LA].[dbo].[ProjectExpenseHistory].[ProjectExpenseID]))
                                     |--Hash Match(Inner Join, HASH:([Expr1023], [LA].[dbo].[ProjectVersion].[ProjectID])=([LastVersion].[ID], [LastVersion].[ProjectID]), RESIDUAL:([LA].[dbo].[ProjectVersion].[ID] as [LastVersion].[ID]=[Expr1023] AND [LA].[dbo].[ProjectVersion].[ProjectID] as [LastVersion].[ProjectID]=[LA].[dbo].[ProjectVersion].[ProjectID]))
                                     |    |--Hash Match(Inner Join, HASH:([LA].[dbo].[ProjectExpense].[ProjectID])=([LA].[dbo].[ProjectVersion].[ProjectID]), RESIDUAL:([LA].[dbo].[ProjectExpense].[ProjectID]=[LA].[dbo].[ProjectVersion].[ProjectID]))
                                     |    |    |--Compute Scalar(DEFINE:([Expr1031]=CASE WHEN [LA].[dbo].[ProjectExpense].[Type]=(1) THEN (1) ELSE (0) END))
                                     |    |    |    |--Clustered Index Scan(OBJECT:([LA].[dbo].[ProjectExpense].[PK_ProjectExpense]), WHERE:([LA].[dbo].[ProjectExpense].[StepSequence]=(5)))
                                     |    |    |--Merge Join(Inner Join, MERGE:([LA].[dbo].[ProjectVersion].[ProjectID])=([LA].[dbo].[Project].[ID]), RESIDUAL:([LA].[dbo].[ProjectVersion].[ProjectID]=[LA].[dbo].[Project].[ID]))
                                     |    |         |--Stream Aggregate(GROUP BY:([LA].[dbo].[ProjectVersion].[ProjectID]) DEFINE:([Expr1023]=MAX([LA].[dbo].[ProjectVersion].[ID])))
                                     |    |         |    |--Index Scan(OBJECT:([LA].[dbo].[ProjectVersion].[_dta_index_ProjectVersion_14_1228713067__K2_13_22_23_26_27_28]), ORDERED FORWARD)
                                     |    |         |--Index Scan(OBJECT:([LA].[dbo].[Project].[_dta_index_Project_14_466282358__K1_2_3_5_6_7_8]), ORDERED FORWARD)
                                     |    |--Compute Scalar(DEFINE:([Expr1029]=CASE WHEN [LA].[dbo].[ProjectVersion].[Size] as [LastVersion].[Size]='S' THEN (1) ELSE (0) END, [Expr1030]=CASE WHEN [LA].[dbo].[ProjectVersion].[Size] as [LastVersion].[Size]='L' THEN (1) ELSE (0) END))
                                     |         |--Clustered Index Scan(OBJECT:([LA].[dbo].[ProjectVersion].[PK_ProjectVersion] AS [LastVersion]), WHERE:([LA].[dbo].[ProjectVersion].[Size] as [LastVersion].[Size]='L' OR [LA].[dbo].[ProjectVersion].[Size] as [LastVersion].[Size]='S'))
                                     |--Clustered Index Scan(OBJECT:([LA].[dbo].[ProjectExpenseHistory].[PK_ProjectExpenseHistory]))

----------------

SQL Server parse and compile time:
CPU time = 3000 ms, elapsed time = 3194 ms.

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.

(1239 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'ProjectExpenseHistory'. Scan count 1, logical reads 325, physical reads 1, read-ahead reads 323, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'ProjectVersion'. Scan count 2, logical reads 795, physical reads 4, read-ahead reads 789, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Project'. Scan count 1, logical reads 85, physical reads 1, read-ahead reads 83, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'ProjectExpense'. Scan count 1, logical reads 189, physical reads 1, read-ahead reads 187, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Account'. Scan count 1, logical reads 37, physical reads 1, read-ahead reads 35, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Client'. Scan count 1, logical reads 39, physical reads 1, read-ahead reads 37, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Event'. Scan count 1, logical reads 2, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 1484 ms, elapsed time = 10168 ms.

Kristen
Test

22859 Posts

Posted - 2006-12-31 : 04:27:35

Well you have negligible scan counts and logical reads - which is a bit surprising because there isn't a single Index Seek in the query plan ... but there again there isn't any WHERE clause either (Later: See below)

It takes 1.4 seconds of CPU time and 10 seconds total. Looks to be starved of memory to me, maybe paging out to disk? or maybe the disk(s) are slow in the first place?

Parsing this puppy is 3 seconds - which will most probably be cached if its in a Stored Procedure; if its dynamic SQL then you are probably going to get that hit every time.

Looks to me like there is some stuff buried in the last JOIN that ought to be elsewhere:


INNER JOIN Event
	ON Event.ID = Account.EventID	
	AND ProjectExpense.StepSequence = 5
	AND (1 = 0
		OR LastVersion.Size = 'L'
		OR LastVersion.Size = 'S')

I would try moving that:


INNER JOIN ProjectExpenseHistory
	ON ProjectExpense.ID = ProjectExpenseHistory.ProjectExpenseID
	AND ProjectExpense.StepSequence = 5
INNER JOIN ProjectVersion AS LastVersion
	ON LastVersion.ProjectID = Project.ID
	AND LastVersion.ID = (SELECT MAX(ID) FROM ProjectVersion WHERE ProjectID = Project.ID)
	AND (1 = 0
		OR LastVersion.Size = 'L'
		OR LastVersion.Size = 'S')

I would also try changing the MAX(ID) thing to:


INNER JOIN 
(
	SELECT	[MAX_ProjectID] = ProjectID,
		[MAX_ID] = MAX(ID) 
	FROM	ProjectVersion 
	GROUP BY ProjectID

) AS T_MAX
INNER JOIN ProjectVersion AS LastVersion
	ON LastVersion.ProjectID = Project.ID
	AND LastVersion.ID = T_MAX.MAX_ID
	AND LastVersion.ProjectID = T_MAX.MAX_ProjectID
	AND (1 = 0
		OR LastVersion.Size = 'L'
		OR LastVersion.Size = 'S')

Kristen

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources