Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
jonathans
Starting Member
40 Posts |
Posted - 2008-04-23 : 04:05:06
|
| Hey GuysI have a huge speed issue on one or two of my SQL Tables. I have included the basic design below.Structure Id ParentId NameGroup Id ParentId Name WeightProducts Id NameStructureProducts StructureId ProductId ImportedStructureGroups StructureId GroupIdGroupProducts GroupId ProductIdAnswerDates Id AssessmentDateScores <-- This table is the slow table AnswerDateId StructureId GroupId (nullable) ProductId (nullable) Score >= 0 && <= 100Ok, Structures are the start of everything. Structures, have children. If a group/product is Linked to a parent or child structure then that group/product is visible along the structure tree flow path. Groups, like structure have children. And also like structures, if a group is given a product, then that product is visible through the structure tree flow path.Example: Earth [Structure] - Asia [Structure] --- China [Structure] --- Japan [Structure] ----- Computer Stuff [Group] ------- Desktops [Group] ------- Servers [Group] ------- Laptops [Group] --------- HP [Product] --------- Dell [Product] --------- Fujitsu [Product] - Europe [Structure] --- Germany [Structure] ----- Berlin [Structure] --- Italy [Structure] ----- Rome [Structure] ----- Venice [Structure] - America [Structure] --- United States of America [Structure] ----- New York [Structure] ------- Computer Stuff [Group] --------- Desktops [Group] --------- Servers [Group] --------- Laptops [Group] ----------- HP [Product] ----------- Dell [Product] ------- Home Stuff [Group] --------- Kitchen Stuff [Group] --------- bedroom Stuff [Group] ----- Washington [Structure] ------- Computer Stuff [Group] --------- Desktops [Group] --------- Servers [Group] --------- Laptops [Group] ----------- HP [Product] ----------- Dell [Product] ----------- Acer [Product] ------- Home Stuff [Group] --------- Kitchen Stuff [Group] --------- bedroom Stuff [Group] ----- Chicago [Structure] ------- Computer Stuff [Group] --------- Desktops [Group] --------- Servers [Group] --------- Laptops [Group] ------- Home Stuff [Group] --------- Kitchen Stuff [Group] --------- bedroom Stuff [Group] - Africa [Structure] --- South Africa [Structure] ----- Johannesburg [Structure] ------- Computer Stuff [Group] --------- Desktops [Group] --------- Servers [Group] --------- Laptops [Group] ----------- Acer [Product] ------- Home Stuff [Group] --------- Kitchen Stuff [Group] --------- bedroom Stuff [Group] ----- Durban [Structure] ----- Capte Town [Structure] - Australasia [Structure]So the initial steps that happen (with regards to Scoring) are as follows.1. Insert root score (which would be for a structure, a group, an answer date and either a product or no product2. Score the next group up along the treeview, using the scores for the groups at the same level as the original group (0 score if no score exists).3. Continue this till GroupTree is at root (parentid == null)4. Using the next structure up along the treeview, repeat steps 2 & 3.5. Continue steps 4 until Structuree is at root (parentid == null)ExampleScoring a product for Johannesburg Acer Laptop would go as follows1. Initial score for [Acer] product against Group [Laptop] for Johannesburg.2. Calculate Score for all products (productid = null) against Laptop for Johannesburg3. Calculate Score for [Acer] product against Group [Computer Stuff] for Johannesburg4. Calculate Score for all products against Group [computer Stuff] for Johannesburg5. Calculate score for [Acer] product against all root groups for Johannesburg5.1. Group [Comptuer Stuff] and [Home Stuff]6. Calculate score for all products against all root groups for Johannesburg6.1. Group [Comptuer Stuff] and [Home Stuff]7. Calculate score for [Acer] Product against Group Laptop for South Africa8. Calculate Score for all products (productid = null) against Laptop for South Africa9. Calculate Score for [Acer] product against Group [Computer Stuff] for South Africa10. Calculate Score for all products against Group [computer Stuff] for South Africa11. Calculate score for [Acer] product against all root groups for South Africa11.1. Group [Comptuer Stuff] and [Home Stuff]12. Calculate score for all products against all root groups for South Africa12.1. Group [Comptuer Stuff] and [Home Stuff]13. Calculate score for [Acer] Product against Group Laptop for Africa14. Calculate Score for all products (productid = null) against Laptop for Africa15. Calculate Score for [Acer] product against Group [Computer Stuff] for Africa16. Calculate Score for all products against Group [computer Stuff] for Africa17. Calculate score for [Acer] product against all root groups for Africa17.1. Group [Comptuer Stuff] and [Home Stuff]18. Calculate score for all products against all root groups for Africa18.1. Group [Comptuer Stuff] and [Home Stuff]etc. etc. etc...This basicly coveres the concept behind the basic scoring methodology. Now the methodology splits into 2. The first Methodology 1, say it should do these calculations using the Exact same date as the original scored date. (Ie. if i do a score today, only scores on today will be used in the calculations). The other, Methodology 2, says that it should do the calculations on the latest available date. (Ie. If i do a score today, only scores from today and the latest before today will be used in the calculations).Now to add another problem to this already complex process, is that each Group and each product within a structure can have either of the 2 scoring methodologies assigned to it. Also, products can only be scored against the structures and groups that they are assigned to. Ie, Acer exists in Laptop Group, in Johannesburg or South Africa or Africa, but doesnt exist in New York.Ok, so now that i've explained briefly how this scoring works, let me get to the heart of the problem. Basicly its speed (can clearly see why), though the speed issue only comes up in 1 Place. And that is where it has to look backwards for the latest available score for the required group, structure and product.For this to happen i wrote a function ALTER FUNCTION [dbo].[GetLatestAnswerDateId] ( @StructureId INT, @GroupId INT, @ProductId INT, @AnswerDateId INT )RETURNS INTASBEGIN DECLARE @Id INT DECLARE @Date DATETIME SELECT TOP 1 @Date = [Date] FROM [dbo].[AnswerDate] WHERE [Id] = ISNULL(@AnswerDateId, [Id]) ORDER BY [Date] DESC SELECT TOP 1 @Id = ad.id--gs.[AnswerDateId] FROM [dbo].[Scoring] gs INNER JOIN [dbo].[AnswerDate] ad ON ad.Id = gs.AnswerDateId WHERE [StructureId] = @StructureId AND ISNULL([GroupId], -1) = ISNULL(@GroupId, -1) AND ISNULL([ProductId], -1) = ISNULL(@ProductId, -1) AND [Date] <= @Date ORDER BY [Date] DESC RETURN @IdENDNow on small amounts of data (1000 rows or so) its quick, though that is due to the fact that the data is minimal, but on large amounts of data this function runs for along time. Specificly in the context of the following when there is 6 months of scoring data (100 000+ rows) to peruse.SELECT [StructureId], [GroupId], [AnswerDateId], [ProductId], [Score]FROM [Scoring]WHERE AnswerDateId = GetLatestAnswerDateId([Structure], [GroupId], [ProductId], null)AND [StructureId] = South AfricaAND [GroupId] = Computer StuffAND [ProductId] = AcerAny idea's on how to make this quick? or quicker?My Current runtime for calculating the 2500 base scores (totals 100 000+- rows) takes 15 hours. Though this is an initial calculation and is only supposed to be done once.Also, this calculations are all correct, so my only issue itself is the speed of the entire process.Thanks In AdanceJonathanWARNING: Running on cold coffee! |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-04-23 : 04:06:33
|
Use a recursive CTE. E 12°55'05.25"N 56°04'39.16" |
 |
|
|
jonathans
Starting Member
40 Posts |
Posted - 2008-04-23 : 04:09:58
|
| As said, the speed isnt around the looping so much as a single set of query (at the bottom of this post). As for the execution, i am using a SQL CLR to do the the scoring, though, the CLR kicks off a stored procedure for each methodology.Ie. CLR Procedure loops up through the trees, but at each level it kicks off another stored procedure based on the required scoring methodology (the basic query is the one at the bottom of post).WARNING: Running on cold coffee! |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-04-23 : 04:20:14
|
Also, instead of SELECT TOP 1 ... ORDER BY (twice) in the function, I think you can replace with the new ROW_NUMBER() built-in function. E 12°55'05.25"N 56°04'39.16" |
 |
|
|
jonathans
Starting Member
40 Posts |
Posted - 2008-04-23 : 05:19:40
|
| Could you explain to me how to use the ROW_NUMBER function to select the top 1 available date? Sorry trying to read up on it, but not getting my head around it.ThanksWARNING: Running on cold coffee! |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-04-23 : 05:34:25
|
ROW_NUMBER() OVER (PARTITION BY {some col(s)} ORDER BY {some datecol} DESC) AS RecIDAnd then filter for RecID = 1 to get "first" (maximum date) for each partition.It is all there in Books Online. E 12°55'05.25"N 56°04'39.16" |
 |
|
|
bfoster
Starting Member
30 Posts |
Posted - 2008-04-23 : 14:30:36
|
| I'd suggest making sure you have appropriate indexes and rewriting the query to make better use of indexes.IF (@AnswerDateId IS NULL)SELECT MAX([Date]) FROM dbo.AnswerDateELSESELECT MAX([Date]) FROM dbo.AnswerDate WHERE Id = @AnswerDateIdIF (@GroupId IS NOT NULL AND @ProductId IS NOT NULL)SELECT TOP 1 @Id = ad.idFROM [dbo].[Scoring] gsINNER JOIN [dbo].[AnswerDate] ad ON ad.Id = gs.AnswerDateIdWHERE [StructureId] = @StructureIdAND [GroupId] = @GroupIdAND [ProductId] = @ProductIdAND [Date] <= @DateORDER BY [Date] DESCELSE IF (@GroupId IS NOT NULL AND @ProductId IS NULL)-- fill in the queryELSE IF (@GroupId IS NULL AND @ProductId IS NOT NULL)-- fill in the queryELSESELECT TOP 1 @Id = ad.idFROM [dbo].[Scoring] gsINNER JOIN [dbo].[AnswerDate] ad ON ad.Id = gs.AnswerDateIdWHERE [StructureId] = @StructureIdAND [GroupId] IS NULLAND [ProductId] IS NULLAND [Date] <= @DateORDER BY [Date] DESC |
 |
|
|
|
|
|
|
|