Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
anilofar
Starting Member
9 Posts |
Posted - 2011-01-18 : 18:40:05
|
I have following two tablesDECLARE @Item TABLE ( Id int)INSERT INTO @Item VALUES (101)INSERT INTO @Item VALUES (102)INSERT INTO @Item VALUES (103)DECLARE @ItemInfo TABLE (Id int, State char, DateCreated DateTime)INSERT INTO @TaskInfo VALUES (101, 'A', '2010-12-20')INSERT INTO @TaskInfo VALUES (101, 'A', '2010-12-21')INSERT INTO @TaskInfo VALUES (101, 'B', '2010-12-23')INSERT INTO @TaskInfo VALUES (102, 'A', '2010-12-20')INSERT INTO @TaskInfo VALUES (102, 'B', '2010-12-21')INSERT INTO @TaskInfo VALUES (102, 'B', '2010-12-22')INSERT INTO @TaskInfo VALUES (103, 'B', '2010-12-20')INSERT INTO @TaskInfo VALUES (103, 'A', '2010-12-23')INSERT INTO @TaskInfo VALUES (103, 'C', '2010-12-24') What I need to find is the count of items for each state from 2010-12-20 to 2010-12-24So the result table should look something like DECLARE @Result TABLE (Date DateTime, StateA char, Count int)INSERT INTO @Result VALUES ('2010-12-20', 'A', 2)INSERT INTO @Result VALUES ('2010-12-20', 'B', 1)INSERT INTO @Result VALUES ('2010-12-20', 'C', 0)INSERT INTO @Result VALUES ('2010-12-21', 'A', 1)INSERT INTO @Result VALUES ('2010-12-21', 'B', 2)INSERT INTO @Result VALUES ('2010-12-21', 'C', 0)INSERT INTO @Result VALUES ('2010-12-22', 'A', 1)INSERT INTO @Result VALUES ('2010-12-22', 'B', 2)INSERT INTO @Result VALUES ('2010-12-22', 'C', 0)INSERT INTO @Result VALUES ('2010-12-23', 'A', 1)INSERT INTO @Result VALUES ('2010-12-23', 'B', 2)INSERT INTO @Result VALUES ('2010-12-23', 'C', 0)INSERT INTO @Result VALUES ('2010-12-24', 'A', 0)INSERT INTO @Result VALUES ('2010-12-24', 'B', 2)INSERT INTO @Result VALUES ('2010-12-24', 'C', 1)Please note that for Item 101 there is no info for the 22nd and 24th in @ItemInfo table. But the state for 101 on 22nd should be same as 21st and state on 24th should be same as 23rd. So, for any missing days, the state info for an item should be the same as day before.Right, now I am doing a while loop for each day (@current_Date) from 2010-12-20 to 2010-12-24 as outlined below. But, this is REALLY slow when I need to calculate for 3 months on about 4 million rows. Is there another way to do this?SELECT @current_Date,State,COUNT(Id) FROM( SELECT B.Id ,ISNULL( ( SELECT TOP 1 A.State FROM @ItemInfo [A] WHERE A.Id = B.Id AND A.DateCreated <= @current_Date),'') [State] FROM @Item [B])[TX]GROUP BY State Any help would be greatly appreciated. Thanks! |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2011-01-18 : 18:50:11
|
Something like this?DECLARE @TaskInfo TABLE (Id int, State char, DateCreated DateTime)INSERT INTO @TaskInfo VALUES (101, 'A', '2010-12-20')INSERT INTO @TaskInfo VALUES (101, 'A', '2010-12-21')INSERT INTO @TaskInfo VALUES (101, 'B', '2010-12-23')INSERT INTO @TaskInfo VALUES (102, 'A', '2010-12-20')INSERT INTO @TaskInfo VALUES (102, 'B', '2010-12-21')INSERT INTO @TaskInfo VALUES (102, 'B', '2010-12-22')INSERT INTO @TaskInfo VALUES (103, 'B', '2010-12-20')INSERT INTO @TaskInfo VALUES (103, 'A', '2010-12-23')INSERT INTO @TaskInfo VALUES (103, 'C', '2010-12-24')SELECT dates.DateCreated, states.State, COUNT(t.Id) AS CountFROM ( SELECT DISTINCT DateCreated FROM @TaskInfo ) AS datesCROSS JOIN ( SELECT DISTINCT State FROM @TaskInfo ) AS statesLEFT JOIN @TaskInfo AS t ON t.DateCreated = dates.DateCreated AND t.State = states.StateGROUP BY dates.DateCreated, states.StateORDER BY dates.DateCreated, states.State N 56°04'39.26"E 12°55'05.63" |
 |
|
|
anilofar
Starting Member
9 Posts |
Posted - 2011-01-18 : 19:04:12
|
| Thanks Peso.However, that doesn't give me the correct count. Please see above (@Result) TableYour query gives:2010-12-20 00:00:00.000 A 22010-12-20 00:00:00.000 B 12010-12-20 00:00:00.000 C 02010-12-21 00:00:00.000 A 12010-12-21 00:00:00.000 B 12010-12-21 00:00:00.000 C 02010-12-22 00:00:00.000 A 02010-12-22 00:00:00.000 B 12010-12-22 00:00:00.000 C 02010-12-23 00:00:00.000 A 12010-12-23 00:00:00.000 B 12010-12-23 00:00:00.000 C 02010-12-24 00:00:00.000 A 02010-12-24 00:00:00.000 B 02010-12-24 00:00:00.000 C 1For instance, on the 21st I should have 2010-12-21 00:00:00.000 A 1 2010-12-21 00:00:00.000 B 22010-12-21 00:00:00.000 C 0B has count of 2 here because I also need to count an item's state if it is missing for a certain day. This item's state will be same as previous day. ie: On the 21st, Items 102 and 103 were in state B.I hope I have explained it clearly. |
 |
|
|
jcelko
Esteemed SQL Purist
547 Posts |
Posted - 2011-01-18 : 19:05:13
|
| Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, data types, etc. in your schema are. If you know how, follow ISO-11179 data element naming conventions and formatting rules. Temporal data should use ISO-8601 formats. Code should be in Standard SQL as much as possible and not local dialect. CREATE TABLE Vague_Items(vague_id INTEGER NOT NULL PRIMARY KEY);INSERT INTO Vague_Items VALUES (101), (102), (103);CREATE TABLE Item_History (vague_id INTEGER NOT NULL, REFERENCES Vague_Items (vague_id), creation_date DATE NOT NULL, foobar_state CHAR(1) NOT NULL CHECK(State IN ('A', 'B', 'C')), PRIMARY KEY (vague_id, creation_date)); -- guessing hereINSERT INTO Item_History (vague_id, foobar_state, creation_date)VALUES (101, 'A', '2010-12-20'), (101, 'A', '2010-12-21'), (101, 'B', '2010-12-23'), (102, 'A', '2010-12-20'), (102, 'B', '2010-12-21'), (102, 'B', '2010-12-22'), (103, 'B', '2010-12-20'), (103, 'A', '2010-12-23'), (103, 'C', '2010-12-24');>> What I need to find is the count of items for each state from 2010-12-20 to 2010-12-24 <<SELECT creation_date, SUM (CASE WHEN foobar_state = 'A' THEN 1 ELSE 0 END AS a_cnt, SUM (CASE WHEN foobar_state = 'A' THEN 1 ELSE 0 END AS b_cnt, SUM (CASE WHEN foobar_state = 'A' THEN 1 ELSE 0 END AS c_cnt FROM Item_History WHERE creation_date BETWEEN '2010-12-20' AND '2010-12-24' GROUP BY creation_date;--CELKO--Books in Celko Series for Morgan-Kaufmann PublishingAnalytics and OLAP in SQLData and Databases: Concepts in Practice Data, Measurements and Standards in SQLSQL for SmartiesSQL Programming Style SQL Puzzles and Answers Thinking in SetsTrees and Hierarchies in SQL |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2011-01-18 : 19:31:04
|
I have a couple of suggestions.But I need to know the ratio of missing records vs total number of records. N 56°04'39.26"E 12°55'05.63" |
 |
|
|
anilofar
Starting Member
9 Posts |
Posted - 2011-01-18 : 20:10:19
|
| I can't really give an exact number here since some items are updated almost daily whereas other items may not be updated for days. But, on average for a given week, an item maybe updated two or three times a week. Additional info, that may be helpful. In one instance, I am working with about 200000 distinct items and this will have close to approx 3 million rows of data for say 3 months.I hope this helps. |
 |
|
|
dineshrajan_it
Posting Yak Master
217 Posts |
Posted - 2011-01-20 : 05:27:19
|
| Hi,DECLARE @Item TABLE ( Id int)INSERT INTO @Item VALUES (101)INSERT INTO @Item VALUES (102)INSERT INTO @Item VALUES (103)DECLARE @ItemInfo TABLE (Id int, State char, DateCreated DateTime)INSERT INTO @ItemInfo VALUES (101, 'A', '2010-12-20')INSERT INTO @ItemInfo VALUES (101, 'A', '2010-12-21')INSERT INTO @ItemInfo VALUES (101, 'B', '2010-12-23')INSERT INTO @ItemInfo VALUES (102, 'A', '2010-12-20')INSERT INTO @ItemInfo VALUES (102, 'B', '2010-12-21')INSERT INTO @ItemInfo VALUES (102, 'B', '2010-12-22')INSERT INTO @ItemInfo VALUES (103, 'B', '2010-12-20')INSERT INTO @ItemInfo VALUES (103, 'A', '2010-12-23')INSERT INTO @ItemInfo VALUES (103, 'C', '2010-12-24')select t.*from( select [state] = isnull(info1.State,info2.state) ,[DateCreated] = isnull(info1.DateCreated,info2.DateCreated) ,sum(case when info1.State IS null then 0 else 1 end) as 'count' --,info2.* from @ItemInfo info1 right outer join ( select distinct datecreated,t.state from @ItemInfo cross apply ( select 'state'='A' union select 'B' union select 'C' )t )info2 on info1.datecreated=info2.datecreated and info1.state=info2.state group by info1.datecreated,info1.state,info2.datecreated,info2.state) torder by t.DateCreated,t.stateIam a slow walker but i never walk back |
 |
|
|
anilofar
Starting Member
9 Posts |
Posted - 2011-01-20 : 14:27:20
|
Thanks for the reply.However, that gives me the same results as what Peso's query does. Looking back at the code that I posted. Looks to me like the correlated subquery inside the while loop is the one that is slowing things down. Especially when running this on 300000+ items for 3 months+.Is there another way to write the following query, possibly using joins that will improve performance?DECLARE @count INT = 1;WHILE @count > 0BEGIN -- @DateTable is used as an anchor to find all the -- dates that is to be used in the trendline chart SELECT @current_Date = [Date] FROM @DateTable ORDER BY [Date] SELECT @current_Date,State,COUNT(Id) FROM ( SELECT B.Id ,ISNULL( ( SELECT TOP 1 A.State FROM @ItemInfo [A] WHERE A.Id = B.Id AND A.DateCreated <= @current_Date ),'') [State] FROM @Item [B] )[TX] GROUP BY State DELETE FROM @DateTable WHERE [Date] = @current_Date; SELECT @count = COUNT(*) FROM @DateTable;END |
 |
|
|
|
|
|
|
|