Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
Kristen
Test
22859 Posts |
Posted - 2006-11-13 : 02:22:32
|
I have a table with about 5,000,000 rows.PK is an identity (clustered). There is also a CreateDate column & index. Records are inserted in very-nearly date order, so the ID and CreateDate are more-or-less in the same order.CreateDate covers about 3 yearsThe ID is used often in FKs, so needs to retain the Clustered Index.There are reporting queries based on a range of CreateDate. They always results in a Clustered Index Scan of the PK.I wondered if there is a way to get the MIN / MAX ID for the date range, and thus constrain the query by that, as well as a FineTuning on CreateDate.However, I can't find a way to get the MIN / MAX ID without incurring an IndexScan I even created a CreateDate, ID index so that it was coveredEverything results in an Index Scan on the PK.The sort of figures I'm getting are:SELECT @MIN_ID = MIN(ID), @MAX_ID = MAX(ID)FROM dbo.MyTableWHERE CreateDate >= @StartDate AND CreateDate < @EndDate SQL Server Execution Times:Table 'MyTable'. Scan count 2, logical reads 128049, physical reads 0, read-ahead reads 95553.CPU time = 5906 ms, elapsed time = 23028 ms.and then the big query:[code]SELECT ... stuff ...FROM MyTableWHERE ID >= @MIN_ID AND ID <= @MAX_ID AND CreateDate >= @StartDate AND CreateDate < @EndDate[code]Table 'MyTable'. Scan count 1, logical reads 471, physical reads 0, read-ahead reads 0.CPU time = 563 ms, elapsed time = 624 ms.If I do NOT use the pre-calculated MIN/MAX ID the second big query gives:Table 'MyTable'. Scan count 1, logical reads 128515, physical reads 0, read-ahead reads 126902.CPU time = 2172 ms, elapsed time = 35841 ms.so is not really any slower.However, it does seem clear that IF I could "cheaply" get the MIN / MAX ID then the query would be really fast ...Perhaps I should resort to a crib-list table where once a day I add the cut-off ID number for that date Thanks chaps,Kristen |
|
harsh_athalye
Master Smack Fu Yak Hacker
5581 Posts |
Posted - 2006-11-13 : 03:01:50
|
How about horizontal partitioning data based on CreateDate so that you will have identical tables for each year data and your search will be narrowed down to specific year?Harsh AthalyeIndia."Nothing is Impossible" |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-11-13 : 03:14:14
|
I think I am more in the mood for a "lookup" table CREATE TABLE DateIDs(CreateDate DATETIME, MinID INT, MaxID INT)with covering index.Peter LarssonHelsingborg, Sweden |
 |
|
Kristen
Test
22859 Posts |
Posted - 2006-11-13 : 04:17:33
|
"I think I am more in the mood for a "lookup" table"Sadly me too. I was hoping, given that I have a CreateDate index (which presumably also includes the Clustered PK's [ID]) that ought to Cover it.Hmmm ... Huzzah! I tried a LOCK HINT:SELECT @MIN_ID = MIN(ID), @MAX_ID = MAX(ID)FROM dbo.MyTable WITH (INDEX(MyCreateDateIndex))WHERE CreateDate >= @StartDate AND CreateDate < @EndDate which givesTable 'MyTable'. Scan count 1, logical reads 38, physical reads 0, read-ahead reads 0.CPU time = 9 ms, elapsed time = 9 ms.I'm always weary of HINT though, what do you think?Perhaps I ought to stick a NOLOCK in there whilst I'm bending the rules ...Kristen |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2006-11-13 : 04:40:36
|
Maybe, if the query is not mission-critical...Peter LarssonHelsingborg, Sweden |
 |
|
Kristen
Test
22859 Posts |
Posted - 2006-11-13 : 05:11:41
|
Yeah, that's the point isn't it (This table records Session data on a web site. As such CreateDate does not change, once record inserted. There are a few updates that may occur later - to set the "End Date" for the session, number of pages visited, that sort of thing)So on that basis I think it may be reasonable. But I hate overriding the Query Optimiser ... that HINT will get left in when we migrate to SQL2005, never get reviewed, and probably get to be a nuisance at some point ...... which I suppose comes back to making my own LookUp table like you suggested in the first placeKristen |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-11-13 : 05:16:58
|
what kind of exec plan does this give you?SELECT @MIN_ID = MIN(ID), @MAX_ID = MAX(ID)FROM dbo.MyTableWHERE ID >= 0 AND CreateDate >= @StartDate AND CreateDate < @EndDate Go with the flow & have fun! Else fight the flow blog thingie: http://weblogs.sqlteam.com/mladenp |
 |
|
Kristen
Test
22859 Posts |
Posted - 2006-11-13 : 05:55:11
|
[code]StmtText------------------------------- |--Nested Loops(Inner Join) |--Stream Aggregate(DEFINE:([Expr1002]=MIN([MyTable].[ID]))) | |--Top(1) | |--Clustered Index Scan(OBJECT:([MyDB].[dbo].[MyTable].[PK_MyTable]), WHERE:([MyTable].[CreateDate]>=[@StartDate] AND [MyTable].[CreateDate]<[@EndDate]]) ORDERED FORWARD) |--Stream Aggregate(DEFINE:([Expr1003]=MAX([MyTable].[ID]))) |--Top(1) |--Clustered Index Scan(OBJECT:([MyDB].[dbo].[MyTable].[PK_MyTable]), WHERE:([MyTable].[CreateDate]>=[@StartDate] AND [MyTable].[CreateDate]<[@EndDate]]) ORDERED BACKWARD)[/code]whereas with the Hint: WITH (INDEX(MyCreateDateIndex)) I get:[code]StmtText---------------------- |--Stream Aggregate(DEFINE:([Expr1002]=MIN([MyTable].[ID]), [Expr1003]=MAX([MyTable].[ID]))) |--Index Seek(OBJECT:([MyDB].[dbo].[MyTable].[MyCreateDateIndex]), SEEK:([MyTable].[CreateDate] >= [@StartDate] AND [MyTable].[CreateDate] < @EndDate) ORDERED FORWARD)[/code]Actually I'm probably better off with two queries and force the index-seek to only span a single day for each of the MIN and MAX:[code]SELECT @MIN_ID = MIN(ID)FROM dbo.MyTable WITH (INDEX(MyCreateDateIndex))WHERE CreateDate >= @StartDate AND CreateDate < DATEADD(Day, 1, @StartDate)SELECT @MAX_ID = MAX(ID)FROM dbo.MyTable WITH (INDEX(MyCreateDateIndex))WHERE CreateDate >= DATEADD(Day, -1, @EndDate) AND CreateDate < @EndDate[/code]The first, over a date range of 2 weeks, gives:Table 'MyTable'. Scan count 1, logical reads 291, physical reads 0, read-ahead reads 289.CPU time = 172 ms, elapsed time = 207 msthe second pair of queries gives:Table 'MyTable'. Scan count 1, logical reads 20, physical reads 0, read-ahead reads 0.CPU time = 0 ms, elapsed time = 7 ms.Table 'MyTable'. Scan count 1, logical reads 72, physical reads 0, read-ahead reads 0.CPU time = 22 ms, elapsed time = 22 ms.Might be a little less efficient for queries that only span a single day, but they are rare - a week or two is the normal query periodKristen |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-11-13 : 05:59:22
|
how are your statistics on that table?do you use FULLSCAN when updating them?Go with the flow & have fun! Else fight the flow blog thingie: http://weblogs.sqlteam.com/mladenp |
 |
|
Kristen
Test
22859 Posts |
Posted - 2006-11-13 : 06:54:10
|
Stats updated each night, and using FULLSCAN but thanks for thinking of that.Perhaps I ought to just double check that they are up to date! <fx:toddles-off>Kristen |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-11-13 : 07:00:21
|
you didn't answer my first question so i assume i didn't help adding ID > 0...Go with the flow & have fun! Else fight the flow blog thingie: http://weblogs.sqlteam.com/mladenp |
 |
|
Kristen
Test
22859 Posts |
Posted - 2006-11-13 : 07:31:16
|
"ID >= 0"Missed it, sorry.It changes it, but still operates on the Clustered PK rather than the CreateDate index:StmtText---------------- |--Nested Loops(Inner Join) |--Stream Aggregate(DEFINE:([Expr1002]=MIN([MyTable].[ID]))) | |--Top(1) | |--Clustered Index Scan(OBJECT:([MyDB].[dbo].[MyTable].[PK_MyTable]), WHERE:([MyTable].[CreateDate]>=[@dtStart] AND [MyTable].[CreateDate]<[@dtEnd]) ORDERED FORWARD) |--Stream Aggregate(DEFINE:([Expr1003]=MAX([MyTable].[ID]))) |--Top(1) |--Clustered Index Scan(OBJECT:([MyDB].[dbo].[MyTable].[PK_MyTable]), WHERE:([MyTable].[CreateDate]>=[@dtStart] AND [MyTable].[CreateDate]<[@dtEnd]) ORDERED BACKWARD)StmtText------------------------- |--Nested Loops(Inner Join) |--Stream Aggregate(DEFINE:([Expr1002]=MIN([MyTable].[ID]))) | |--Top(1) | |--Clustered Index Seek(OBJECT:([MyDB].[dbo].[MyTable].[PK_MyTable]), SEEK:([MyTable].[ID] > 0), WHERE:([MyTable].[CreateDate]>=[@dtStart] AND [MyTable].[CreateDate]<[@dtEnd]) ORDERED FORWARD) |--Stream Aggregate(DEFINE:([Expr1003]=MAX([MyTable].[ID]))) |--Top(1) |--Clustered Index Seek(OBJECT:([MyDB].[dbo].[MyTable].[PK_MyTable]), SEEK:([MyTable].[ID] > 0), WHERE:([MyTable].[CreateDate]>=[@dtStart] AND [MyTable].[CreateDate]<[@dtEnd]) ORDERED BACKWARD) Table 'MyTable'. Scan count 2, logical reads 124648, physical reads 1, read-ahead reads 125277.CPU time = 7031 ms, elapsed time = 47352 ms.Kristen |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2006-11-13 : 07:37:47
|
so i guess you'll be sticking with 2 set value queries then?Go with the flow & have fun! Else fight the flow blog thingie: http://weblogs.sqlteam.com/mladenp |
 |
|
Kristen
Test
22859 Posts |
Posted - 2006-11-13 : 07:41:28
|
"so i guess you'll be sticking with 2 set value queries then?"... which themselves have to be forced with an index hintYeah On a positive note it is lightening quick compared to before!Kristen |
 |
|
|
|
|
|
|