Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
schmidtg
Starting Member
36 Posts |
Posted - 2007-05-10 : 09:55:03
|
| I'm selecting across 6 tables joined in various ways, most of which have hundreds of thousands of records. I know I should be able to do better than 2 seconds for the top 10...google takes a fraction of a second per page.We already have lots of indexes in place. Where should I be looking to try to improve the performance of this sproc?At the moment, the core of it looks something like this:SELECT DISTINCT TOP 10 m.Name, m.Date, m.ActiveFROM Membership mINNER JOIN Entities e ON m.EntityID=e.EntityIDINNER JOIN Individuals i ON e.EntityID=i.IndividualIDLEFT JOIN MemberTypes ON m.MemberTypeID=MemberTypes.MemberTypeIDLEFT JOIN Addresses a ON m.EntityID=a.EntityIDLEFT JOIN AddressTypes at ON a.AddressTypeID=at.AddressTypeIDWHERE m.ID=2 I've left off the other WHERE clauses that necessitate the joins. |
|
|
Kristen
Test
22859 Posts |
Posted - 2007-05-10 : 10:08:57
|
"I've left off the other WHERE clauses that necessitate the joins"In which case I can't comment on why it may be inefficient ... |
 |
|
|
harsh_athalye
Master Smack Fu Yak Hacker
5581 Posts |
Posted - 2007-05-10 : 10:10:09
|
"google takes a fraction of a second per page"Don't compare your setup with that of Google's. It's entirely different architecture, hardware and software wise."We already have lots of indexes in place"Lots of indexes was never a solution for performance tuning. Having appropriate indexes is important.I see there are many LEFT JOIN which are going to cause performance hit. I suggest you profile the logical reads for your query and try to minimize them.For viewing Logical reads, add below statement before your query:SET STATISTICS IO ONGO... -- Your query hereSET STATISTICS IO OFFGO Harsh AthalyeIndia."The IMPOSSIBLE is often UNTRIED" |
 |
|
|
schmidtg
Starting Member
36 Posts |
Posted - 2007-05-10 : 10:12:11
|
quote: Originally posted by Kristen "I've left off the other WHERE clauses that necessitate the joins"In which case I can't comment on why it may be inefficient ... 
The quoted 2 seconds is on the query as posted, not with the additional clauses. I just didn't want anyone saying "why are you joining all those tables just to select from one of them?" |
 |
|
|
schmidtg
Starting Member
36 Posts |
Posted - 2007-05-10 : 10:16:01
|
quote: Originally posted by harsh_athalyeI see there are many LEFT JOIN which are going to cause performance hit. I suggest you profile the logical reads for your query and try to minimize them.For viewing Logical reads, add below statement before your query:SET STATISTICS IO ONGO... -- Your query hereSET STATISTICS IO OFFGO
Ok, I get this:Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Addresses'. Scan count 11, logical reads 199, physical reads 2, read-ahead reads 2, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Entities'. Scan count 0, logical reads 376, physical reads 1, read-ahead reads 2, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Individuals'. Scan count 0, logical reads 500, physical reads 1, read-ahead reads 6, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Membership'. Scan count 1, logical reads 153, physical reads 3, read-ahead reads 514, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.How does one go about reducing these numbers?I appreciate your help. |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2007-05-10 : 10:19:49
|
Well in that scenario these tables are redundant:INNER JOIN Entities e ON m.EntityID=e.EntityIDINNER JOIN Individuals i ON e.EntityID=i.IndividualIDLEFT JOIN MemberTypes ON m.MemberTypeID=MemberTypes.MemberTypeIDLEFT JOIN Addresses a ON m.EntityID=a.EntityIDLEFT JOIN AddressTypes at ON a.AddressTypeID=at.AddressTypeIDexcept that they may be causing duplicate values form.Name, m.Date, m.Activeif any of them are one-to-many relationships.Plus the two INNER JOINs do enforce that related records must exist, but that's irrelevant if there is referential integrity (where FKs exist, or not) would enforce that anyway.Do as Harsh says and see how what the "scan count" and "logical reads" are. If they are not VERY low numbers then the likelihood is that you don't have the right indexes, and you need to look at the Query Plan to see what is actually being used.SET SHOWPLAN_TEXT ONGO-- ... put query here - e.g.:SET SET SHOWPLAN_TEXT OFFGO but I reckon its a waste of time if this is not your actual real-world WHERE clause, because the query plan will be different when you add in the rest of your WHERE clauseKristen |
 |
|
|
schmidtg
Starting Member
36 Posts |
Posted - 2007-05-10 : 10:28:55
|
quote: Originally posted by Kristen Well in that scenario these tables are redundant:INNER JOIN Entities e ON m.EntityID=e.EntityIDINNER JOIN Individuals i ON e.EntityID=i.IndividualIDLEFT JOIN MemberTypes ON m.MemberTypeID=MemberTypes.MemberTypeIDLEFT JOIN Addresses a ON m.EntityID=a.EntityIDLEFT JOIN AddressTypes at ON a.AddressTypeID=at.AddressTypeIDexcept that they may be causing duplicate values form.Name, m.Date, m.Activeif any of them are one-to-many relationships.Plus the two INNER JOINs do enforce that related records must exist, but that's irrelevant if there is referential integrity (where FKs exist, or not) would enforce that anyway.Do as Harsh says and see how what the "scan count" and "logical reads" are. If they are not VERY low numbers then the likelihood is that you don't have the right indexes, and you need to look at the Query Plan to see what is actually being used.SET SHOWPLAN_TEXT ONGO-- ... put query here - e.g.:SET SET SHOWPLAN_TEXT OFFGO but I reckon its a waste of time if this is not your actual real-world WHERE clause, because the query plan will be different when you add in the rest of your WHERE clauseKristen
I will try the showplan suggestion.The query is actually built dynamically with varying WHERE clauses, which is why I'm trying to optimize the "core" of it in the hopes that it will speed up the end product. |
 |
|
|
Lamprey
Master Smack Fu Yak Hacker
4614 Posts |
Posted - 2007-05-10 : 12:18:34
|
| It may seem logical to show us the simplified WHERE clause, but in reality we may be able to help you get better performance if you show us the complete clause. Just going by the Statistics IO profile it appears the tables are relatively small. Some things you might try are:1. Moving WHERE clause conditions into the JOIN (if possible).2. If you have BIT columns and you are looking for a 1 or 0 make sure to cast. For example: CAST(0 AS BIT)3. Try using derived tables to limit the result set, again if possible.If you provide a little more information, like table size and the actual query you want to run we might be able to help more.-Ryan |
 |
|
|
schmidtg
Starting Member
36 Posts |
Posted - 2007-05-10 : 12:33:00
|
quote: Originally posted by LampreyIf you provide a little more information, like table size and the actual query you want to run we might be able to help more.-Ryan
The tables generally have 100-500K rows. As I said, the query is built dynamically, so the where clauses will vary. Here's one full example:SELECT DISTINCT m.EntityID, m.Member, a.AddressID, MemberTypes.Description AS MemberType, m.StartDate, m.EndDate, m.ActiveStatus, i.PrimaryOrganizationName AS PrimaryRelation FROM Membership mINNER JOIN Entities e ON m.EntityID=e.EntityIDINNER JOIN Individuals i ON e.EntityID=i.IndividualIDLEFT JOIN MemberTypes ON m.MemberTypeID=MemberTypes.MemberTypeIDLEFT JOIN Addresses a ON m.EntityID=a.EntityIDLEFT JOIN AddressTypes at ON a.AddressTypeID=at.AddressTypeIDWHERE m.AssociationID=@AssociationID AND e.DeletedEntity<>1 AND (a.AddressTypeID=COALESCE((SELECT PreferredAddressTypeID WHERE PreferredAddressTypeID <> 0), (SELECT TOP 1 a2.AddressTypeID FROM Addresses a2 INNER JOIN AddressTypes at2 ON a2.AddressTypeID=at2.AddressTypeID WHERE EntityID=e.EntityID ORDER BY DisplayOrder ASC)) OR a.AddressTypeID IS NULL) Retrieving the full set takes about 24 seconds and retrieving just the top 10 takes 3 seconds.Any suggestions? |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2007-05-10 : 13:21:03
|
"The query is actually built dynamically with varying WHERE clauses"I'll take a small bet that you will be better off using a parameterized query (if you aren;t already). That will improve the chance that your the Plan for your query is cached.Hopefully the language you are using supports calling parameterized queries easily, but if not you can do so using sp_ExecuteSQL (but you will need to mess about doubling quotes etc.)Ah! I've just spotted WHERE m.AssociationID=@AssociationID so hopefully you are using a parameterized query.What does the SET STATISTICS IO ON setting show for the above example-query??Kristen |
 |
|
|
dinakar
Master Smack Fu Yak Hacker
2507 Posts |
Posted - 2007-05-10 : 13:25:33
|
| If the order of records doesnt mattet try using SET ROWCOUNT 10 instead of TOP 10. It will be much faster. The way TOP works is, it first gets the results into tempdb then sorts your data according to your ORDER BY, then picks the TOP X. If you use SET ROWCOUNT, its more like EXISTS, where it just picks the first 10 records it finds matching your conditions. So TOP X processing slow should be expected.Dinakar NethiSQL Server MVP************************Life is short. Enjoy it.************************http://weblogs.sqlteam.com/dinakar/ |
 |
|
|
schmidtg
Starting Member
36 Posts |
Posted - 2007-05-10 : 13:40:58
|
quote: Originally posted by KristenI'll take a small bet that you will be better off using a parameterized query (if you aren;t already). That will improve the chance that your the Plan for your query is cached.Hopefully the language you are using supports calling parameterized queries easily, but if not you can do so using sp_ExecuteSQL (but you will need to mess about doubling quotes etc.)Ah! I've just spotted WHERE m.AssociationID=@AssociationID so hopefully you are using a parameterized query.What does the SET STATISTICS IO ON setting show for the above example-query??Kristen
Yes, I'm using sp_executesql.For the full query that returns 2854 rows, I get this:Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Addresses'. Scan count 1, logical reads 995, physical reads 3, read-ahead reads 991, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Entities'. Scan count 1, logical reads 743, physical reads 3, read-ahead reads 739, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Individuals'. Scan count 1, logical reads 2204, physical reads 3, read-ahead reads 2201, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Membership'. Scan count 1, logical reads 1092, physical reads 3, read-ahead reads 1088, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Using TOP 10:Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Addresses'. Scan count 11, logical reads 188, physical reads 1, read-ahead reads 2, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Entities'. Scan count 0, logical reads 362, physical reads 2, read-ahead reads 2, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Individuals'. Scan count 0, logical reads 407, physical reads 2, read-ahead reads 3, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Membership'. Scan count 1, logical reads 94, physical reads 2, read-ahead reads 465, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. |
 |
|
|
schmidtg
Starting Member
36 Posts |
Posted - 2007-05-10 : 13:42:44
|
quote: Originally posted by dinakar If the order of records doesnt mattet try using SET ROWCOUNT 10 instead of TOP 10.
Both techniques seem to take about 2 seconds. |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2007-05-10 : 14:34:25
|
"Yes, I'm using sp_executesql"Damn! That's 99% of the speed improvement I was hoping to help you with GONE!Its a common "issue" with this sort of dynamic SQL to wind up with:LEFT JOIN Addresses a ON m.EntityID=a.EntityID...AND (a.AddressTypeID= ... (could be deliberate of course!) which is forcing an INNER join.The most strain on these guys seems to be the 'Addresses' table:eitherScan count 1, logical reads 995orScan count 11, logical reads 188so worth checking that you have and index on Addresses.EntityID, and ideally also "covering" AddressTypeID.It would be worth posting the query plan, from SET SHOWPLAN_TEXT, for this same query please.Kristen |
 |
|
|
schmidtg
Starting Member
36 Posts |
Posted - 2007-05-10 : 14:50:33
|
Actually, I was wrong. This query is executed with EXEC and then there's a second query (not relevant here) that further processes those results via sp_executesql. I don't remember offhand why EXEC is used rather than sp_executesql in this case. Is there a significant performance difference between the two?The Addresses table does have non-unique, non-clustered indexes on EntityID and AddressTypeID.Here's the query plan: |--Stream Aggregate(GROUP BY:([m].[EntityId]) DEFINE:([m].[Member]=ANY([TestTEMP].[dbo].[Membership].[Member] as [m].[Member]), [m].[StartDate]=ANY([TestTEMP].[dbo].[Membership].[StartDate] as [m].[StartDate]), [m].[EndDate]=ANY([TestTEMP].[dbo].[Membership].[EndDate] as [m].[EndDate]), [m].[ActiveStatus]=ANY([TestTEMP].[dbo].[Membership].[ActiveStatus] as [m].[ActiveStatus]))) |--Nested Loops(Left Semi Join, OUTER REFERENCES:([e].[EntityId], [i].[PreferredAddressTypeId], [a].[AddressTypeId])) |--Nested Loops(Inner Join, PASSTHRU:([IsBaseRow1010] IS NULL), OUTER REFERENCES:([a].[AddressId])) | |--Nested Loops(Left Outer Join, OUTER REFERENCES:([m].[EntityId], [Expr1022]) WITH ORDERED PREFETCH) | | |--Nested Loops(Inner Join, OUTER REFERENCES:([i].[IndividualId], [Expr1021]) WITH ORDERED PREFETCH) | | | |--Nested Loops(Inner Join, OUTER REFERENCES:([m].[EntityId], [Expr1020]) WITH ORDERED PREFETCH) | | | | |--Clustered Index Scan(OBJECT:([TestTEMP].[dbo].[Membership].[PK_Membership] AS [m]), WHERE:([TestTEMP].[dbo].[Membership].[AssociationId] as [m].[AssociationId]=(1)) ORDERED FORWARD) | | | | |--Clustered Index Seek(OBJECT:([TestTEMP].[dbo].[Individuals].[PK_Individuals] AS [i]), SEEK:([i].[IndividualId]=[TestTEMP].[dbo].[Membership].[EntityId] as [m].[EntityId]) ORDERED FORWARD) | | | |--Clustered Index Seek(OBJECT:([TestTEMP].[dbo].[Entities].[PK_Entities] AS [e]), SEEK:([e].[EntityId]=[TestTEMP].[dbo].[Individuals].[IndividualId] as [i].[IndividualId]), WHERE:([TestTEMP].[dbo].[Entities].[DeletedEntity] as [e].[DeletedEntity]<>(1)) ORDERED FORWARD) | | |--Index Seek(OBJECT:([TestTEMP].[dbo].[Addresses].[IX_Addresses_1] AS [a]), SEEK:([a].[EntityId]=[TestTEMP].[dbo].[Membership].[EntityId] as [m].[EntityId]) ORDERED FORWARD) | |--Clustered Index Seek(OBJECT:([TestTEMP].[dbo].[Addresses].[PK_Address] AS [a]), SEEK:([a].[AddressId]=[TestTEMP].[dbo].[Addresses].[AddressId] as [a].[AddressId]) LOOKUP ORDERED FORWARD) |--Concatenation |--Filter(WHERE:(STARTUP EXPR([TestTEMP].[dbo].[Addresses].[AddressTypeId] as [a].[AddressTypeId] IS NULL))) | |--Constant Scan |--Filter(WHERE:([TestTEMP].[dbo].[Addresses].[AddressTypeId] as [a].[AddressTypeId]=CASE WHEN [TestTEMP].[dbo].[Individuals].[PreferredAddressTypeId] as [i].[PreferredAddressTypeId] IS NOT NULL THEN [TestTEMP].[dbo].[Individuals].[PreferredAddressTypeId] as [i].[PreferredAddressTypeId] ELSE [TestTEMP].[dbo].[Addresses].[AddressTypeId] as [a2].[AddressTypeId] END)) |--Nested Loops(Left Outer Join, PASSTHRU:([TestTEMP].[dbo].[Individuals].[PreferredAddressTypeId] as [i].[PreferredAddressTypeId] IS NOT NULL)) |--Nested Loops(Left Outer Join, PASSTHRU:(IsFalseOrNull [TestTEMP].[dbo].[Individuals].[PreferredAddressTypeId] as [i].[PreferredAddressTypeId] IS NOT NULL)) | |--Nested Loops(Left Outer Join) | | |--Constant Scan | | |--Compute Scalar(DEFINE:([i].[PreferredAddressTypeId]=[TestTEMP].[dbo].[Individuals].[PreferredAddressTypeId] as [i].[PreferredAddressTypeId])) | | |--Filter(WHERE:(STARTUP EXPR([TestTEMP].[dbo].[Individuals].[PreferredAddressTypeId] as [i].[PreferredAddressTypeId]<>(0)))) | | |--Constant Scan | |--Compute Scalar(DEFINE:([i].[PreferredAddressTypeId]=[TestTEMP].[dbo].[Individuals].[PreferredAddressTypeId] as [i].[PreferredAddressTypeId])) | |--Filter(WHERE:(STARTUP EXPR([TestTEMP].[dbo].[Individuals].[PreferredAddressTypeId] as [i].[PreferredAddressTypeId]<>(0)))) | |--Constant Scan |--Sort(TOP 1, ORDER BY:([at2].[DisplayOrder] ASC)) |--Nested Loops(Inner Join, OUTER REFERENCES:([a2].[AddressTypeId])) |--Nested Loops(Inner Join, OUTER REFERENCES:([a2].[AddressId])) | |--Index Seek(OBJECT:([TestTEMP].[dbo].[Addresses].[IX_Addresses_1] AS [a2]), SEEK:([a2].[EntityId]=[TestTEMP].[dbo].[Entities].[EntityId] as [e].[EntityId]) ORDERED FORWARD) | |--Clustered Index Seek(OBJECT:([TestTEMP].[dbo].[Addresses].[PK_Address] AS [a2]), SEEK:([a2].[AddressId]=[TestTEMP].[dbo].[Addresses].[AddressId] as [a2].[AddressId]) LOOKUP ORDERED FORWARD) |--Clustered Index Seek(OBJECT:([TestTEMP].[dbo].[AddressTypes].[PK_AddressTypes] AS [at2]), SEEK:([at2].[AddressTypeId]=[TestTEMP].[dbo].[Addresses].[AddressTypeId] as [a2].[AddressTypeId]) ORDERED FORWARD) |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2007-05-10 : 15:00:44
|
| "|--Filter(WHERE:([TestTEMP].[dbo].[Addresses].[AddressTypeId] as [a].[AddressTypeId]=CASE WHEN [TestTEMP].[dbo].[Individuals].[PreferredAddressTypeId] as [i].[PreferredAddressTypeId] IS NOT NULL THEN [TestTEMP].[dbo].[Individuals].[PreferredAddressTypeId] as [i].[PreferredAddressTypeId] ELSE [TestTEMP].[dbo].[Addresses].[AddressTypeId] as [a2].[AddressTypeId] END))"I reckon you could leverage that one a bit by adding Addresses.AddressTypeId to the appropriate index which could Cover that situation (which looks to be Addresses.IX_Addresses_1)The Membership table also seems to be struggling because it doesn't have an index on Membership.AssociationId (or it isn't sufficiently selective)But I reckon those are the obvious "low fruit", anything beyond that is probably going to be Wacky - HINTs in the query or manufactured column data indexed to favour this specific query.Kristen |
 |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2007-05-10 : 15:01:43
|
| What are you trying to do with your joins to addresses and addresstypes? That is definitely causing some of your problems. What logic are you trying to apply there? Explain it in plain english. There is definitely a better what to handle it than what you are trying to do there ... not only that, but since you are doing a left outer join there but then referencing the outer tables in your WHERE clause, you are either turning it into an INNER JOIN or not getting the results are you hoping for.Create table statements and some sample data would be helpful as well.- Jeffhttp://weblogs.sqlteam.com/JeffS |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2007-05-10 : 15:07:07
|
"since you are doing a left outer join there but then referencing the outer tables in your WHERE clause,"Jeff: That's a side effect of the OP using dynamic SQL to generate the query. It may even be a desirable side-effect! in my experience the Optimiser spots such things and sorts them out intelligently. (Having said that I do NOT code like that, so my only experience is of 3rd part code Ive had to review )Kristen |
 |
|
|
jsmith8858
Dr. Cross Join
7423 Posts |
Posted - 2007-05-10 : 15:08:55
|
| I doubt it ... even if it is generated, there is no reason to try to "join" like that, with that big mess in the WHERE clause. if he provides the logic that the join is doing, then we can simply re-write it properly...- Jeffhttp://weblogs.sqlteam.com/JeffS |
 |
|
|
schmidtg
Starting Member
36 Posts |
Posted - 2007-05-10 : 15:17:16
|
| In this example, it's trying to get the "preferred" address type, which is either explicitly defined in the Individuals table or is determined by the first populated address type in a defined sort order (DisplayOrder). Or the address type can be null if there are no addresses for the entity or the type no longer exists. |
 |
|
|
schmidtg
Starting Member
36 Posts |
Posted - 2007-05-10 : 15:22:42
|
| Looking at it again, I think I can do without the left join on AddressTypes. Removing it shaved the execution time down a little bit. I will try the index modifications next. |
 |
|
|
Next Page
|
|
|
|
|