Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2012 Forums
 Transact-SQL (2012)
 How many users were active during time interval

Author  Topic 

QuentinL
Starting Member

11 Posts

Posted - 2013-04-03 : 12:27:16
Hi everyone ,

I think this one is a tough one.

I need to know how many user where logged during a time period based on a activity table

In the activity table the user’s status is stored with a timestamp.

Here is the Table
UserID | State | TimeStamp
1 | 4 | 2013-03-29 09:00:39
1 | 3 | 2013-03-29 12:00:00
2 | 4 | 2013-03-29 10:30:00
2 | 3 | 2013-03-29 12:00:00

State = 4 means User has logged in and 3 means agent has logged out

For example between 10h and 11h the result would be 1.5 because “User 2” logged in the middle of this interval and the user 1 was logged in during the whole period.

So for each user I have to sum the time between the logged in state and the logged out state then to divide it by the given period in order to get the user activity for this period, and then make the sum for all users…

Considering that the user status may have or not change during the given interval, I guess I should work with rows for the given interval plus the first row just before the interval because I need to know the state of the user before the beginning of the interval.

Am I clear?!?

Does this sound possible to you?

NB : I've uploaded a csv file containing sample data to give you a "quick" start if needed http://www.sendspace.com/file/tcvy6v

Thank you for your help!

Regards,

Quentin

UnemployedInOz
Yak Posting Veteran

54 Posts

Posted - 2013-04-03 : 20:09:07
Create Table #TEST
(UserID INT,State INT,TimeStamp DATETIME)
-- truncate table #Test
iNSERT INTO #TEST VALUES (1, 4, '2013-03-29 09:00:39'),
(1, 3, '2013-03-29 12:00:00'),
(2, 4, '2013-03-29 10:30:00'),
(2, 3, '2013-03-29 12:00:00'),
(1, 4, '2013-04-29 09:00:39'),
(1, 3, '2013-04-29 12:00:00'),
(3, 4, '2013-03-29 11:00:39')

Declare @RangeFrom datetime,
@RangeTo datetime

SELECT @RangeFrom = '2013-03-29 10:00:00.000',
@RangeTo = '2013-03-29 11:00:00.000';

With src as (
SELECT a.UserID,a.TimeStamp as TFrom,
ISNULL((Select min(b.TimeStamp)
from #test b
where a.UserID = b.UserID and
a.TimeStamp <= b.TimeStamp and
b.State = 3 and
b.TimeStamp >= @RangeFrom),@RangeTo) as TTo
FROM #TEST a
WHERE State = 4 and a.TimeStamp < @RangeTo)
SELECT SUM(Datediff(s,TFrom2,TTo2))/3600.0
FROM (
Select UserID,
case when TFrom < @RangeFrom then @RangeFrom else TFrom end TFrom2,
case when TTo > @RangeTo then @RangeTo else TTo end TTo2
from src) as z;
Go to Top of Page

QuentinL
Starting Member

11 Posts

Posted - 2013-04-03 : 21:13:10
Wahoow !!!
Your query looks very good!
Tough for me to understand it...

It works great with the test values, but looks strange on my real data...

I've just finished my dirty query using many temporary tables, my result
seems consistents but quite different from your query.

It's more than 3am now for me, so I won't go further for today.

I'll give another try tomorrow to understand your query and the results it gives, and probably post here my dirty query when I'll have added some comments.

Anyways, big thanks to you!

Hope you'll be around there tomorrow

Quentin
Go to Top of Page

UnemployedInOz
Yak Posting Veteran

54 Posts

Posted - 2013-04-03 : 21:43:20
I start off by creating rows of information consisting of UserID, StartTime and EndTime together.
To find the Endtime that goes with the Starttime, you look for the first Endtime for that user after the starttime. Hence the min(b.TimeStamp). If there is no endtime in your table, we assume that the user is still connected i.e. ISNULL(min(b.TimeStamp),@RangeTo) and we use the @RangeTo instead.
Once we have the rows of information which is created within the WITH statement and allocated to table name src we can use this information. If the starttime is before the @RangeFrom time we then use the @RangeFrom time and not the startTime becuase we are only interest in this @RangeFrom period. By the same token, if the @RangeTo is before the EndTime, we then want to use the @RangeTo time as we are not interested in anything after that. Therefore with the new times I find the number of seconds between the two times and sum them up. I then convert back to hours to get the answer.
Go to Top of Page

QuentinL
Starting Member

11 Posts

Posted - 2013-04-04 : 08:43:35
Thanks for your explanations, it helped me get a better understanding of your query.

If think the problem occurs when I have previous rows from previous days

Let’s use to following values about only 1 user over a few days

Using the following range:
SELECT @RangeFrom = '2013-03-29 10:00:00.000',
@RangeTo = '2013-03-29 11:30:00.000';
Which is more than 1h; I replace the division by 3600.0 by
CONVERT(float,DATEDIFF(s,@RangeFrom,@RangeTo)), casting it as a float in order to get the result as a float

Create Table #TEST
(UserID INT,State INT,TimeStamp DATETIME)
-- truncate table #Test
iNSERT INTO #TEST VALUES
(1, 3, '2013-03-27 00:00:01'),
(1, 4, '2013-03-27 08:33:22'),
(1, 3, '2013-03-27 12:11:26'),
(1, 4, '2013-03-27 14:07:16'),
(1, 3, '2013-03-27 18:00:45'),
(1, 3, '2013-03-28 00:00:02'),
(1, 4, '2013-03-28 08:04:55'),
(1, 3, '2013-03-28 12:03:41'),
(1, 4, '2013-03-28 13:23:05'),
(1, 3, '2013-03-28 18:04:13'),
(1, 3, '2013-03-29 00:00:02'),
(1, 4, '2013-03-29 08:10:32'),
(1, 3, '2013-03-29 10:10:32'),
(1, 4, '2013-03-29 10:40:32'),
(1, 3, '2013-03-29 11:51:03'),
(1, 4, '2013-03-29 12:54:10'),
(1, 3, '2013-03-29 17:47:47')

The nested Select:

With src as (
SELECT
a.UserID,
a.TimeStamp as TFrom,
ISNULL((Select min(b.TimeStamp) from #test b where a.UserID = b.UserID and a.TimeStamp <= b.TimeStamp and b.State = 3 and b.TimeStamp >= @RangeFrom),@RangeTo) as TTo FROM #TEST a WHERE State = 4 and a.TimeStamp < @RangeTo)
SELECT * FROM SRC

Gives the following:

1 2013-03-27 08:33:22.000 2013-03-29 10:10:32.000
1 2013-03-27 14:07:16.000 2013-03-29 10:10:32.000
1 2013-03-28 08:04:55.000 2013-03-29 10:10:32.000
1 2013-03-28 13:23:05.000 2013-03-29 10:10:32.000
1 2013-03-29 08:10:32.000 2013-03-29 10:10:32.000
1 2013-03-29 10:40:32.000 2013-03-29 11:51:03.000

Then we correct the TFrom and the TTo

With src as (
SELECT a.UserID,a.TimeStamp as TFrom, ISNULL((Select min(b.TimeStamp) from #test b where a.UserID = b.UserID and a.TimeStamp <= b.TimeStamp and b.State = 3 and b.TimeStamp >= @RangeFrom),@RangeTo) as TTo
FROM #TEST a
WHERE State = 4 and a.TimeStamp < @RangeTo)
Select UserID, case when TFrom < @RangeFrom then @RangeFrom else TFrom end TFrom2, case when TTo > @RangeTo then @RangeTo else TTo end TTo2 from src

This gives the following:

1 2013-03-29 10:00:00.000 2013-03-29 10:10:32.000
1 2013-03-29 10:00:00.000 2013-03-29 10:10:32.000
1 2013-03-29 10:00:00.000 2013-03-29 10:10:32.000
1 2013-03-29 10:00:00.000 2013-03-29 10:10:32.000
1 2013-03-29 10:00:00.000 2013-03-29 10:10:32.000
1 2013-03-29 10:40:32.000 2013-03-29 11:30:00.000

The final result is 1.13…. When the value for 1 user cannot be greater than 1

The correct Result would be obtained using the two last Rows

1 2013-03-29 10:00:00.000 2013-03-29 10:10:32.000
1 2013-03-29 10:40:32.000 2013-03-29 11:30:00.000

Which would return 0.66…. which looks Ok as the user was connected around 60min over a 90min range
To work properly we need the first row before the @RangeFrom to determine to initial state of the user.
So I guess min (b.TimeStamp) would need an extra condition on the where clause to limit the rows returned by the nested select statement…

I really appreciate your help !!

Quentin
Go to Top of Page

UnemployedInOz
Yak Posting Veteran

54 Posts

Posted - 2013-04-04 : 17:52:16

With src as (
SELECT a.UserID,a.TimeStamp as TFrom,
ISNULL((Select min(b.TimeStamp)
from #test b
where a.UserID = b.UserID and
a.TimeStamp <= b.TimeStamp and
b.State = 3),@RangeTo) as TTo
FROM #TEST a
WHERE State = 4 and a.TimeStamp < @RangeTo)
SELECT SUM(Datediff(s,TFrom2,TTo2))/3600.0
FROM (
Select UserID,
case when TFrom < @RangeFrom then @RangeFrom else TFrom end TFrom2,
case when TTo > @RangeTo then @RangeTo else TTo end TTo2
from src
) as z WHERE TFrom2 < TTo2;
Go to Top of Page

QuentinL
Starting Member

11 Posts

Posted - 2013-04-05 : 07:41:39
Fantastic !!

Your query gives the same result as my dirty nested cursor/fetch 200 lines query

If you agree let me take my need one step further.

I would like to have a GROUP BY clause grouping result by hour or half hour...

I already used something like

Select Count(*), DateAdd(Minute, 30 * (DateDiff(Minute, 0, SomeDate) / 30), 0)
From #Temp
Group By DateAdd(Minute, 30 * (DateDiff(Minute, 0, SomeDate) / 30), 0)

in a previous query to count rows by half hour...

So focusing on the 03/29 the result I would like would be
...|8h-8h30 | 8h30-9h | 9h-9h30 | 9h30-10h | 10h-10h30
| 0.648...| 1 | 1 | 1 | 0.648...

I could use a while loop and process your request 48 times to get the job done...

Have you got a better idea?

Thank you for everything !!

Quentin
Go to Top of Page

UnemployedInOz
Yak Posting Veteran

54 Posts

Posted - 2013-04-05 : 18:11:35
quote:
Originally posted by QuentinL

Fantastic !!

So focusing on the 03/29 the result I would like would be
...|8h-8h30 | 8h30-9h | 9h-9h30 | 9h30-10h | 10h-10h30
| 0.648...| 1 | 1 | 1 | 0.648...

Quentin


These results are double what they should be...


Create Table #TEST
(UserID INT,State INT,TimeStamp DATETIME)

Create Table #Range
(RangeFrom datetime,
RangeTo datetime)

-- truncate table #Test
-- truncate table #Range
iNSERT INTO #TEST VALUES
(1, 3, '2013-03-27 00:00:01'),
(1, 4, '2013-03-27 08:33:22'),
(1, 3, '2013-03-27 12:11:26'),
(1, 4, '2013-03-27 14:07:16'),
(1, 3, '2013-03-27 18:00:45'),
(1, 3, '2013-03-28 00:00:02'),
(1, 4, '2013-03-28 08:04:55'),
(1, 3, '2013-03-28 12:03:41'),
(1, 4, '2013-03-28 13:23:05'),
(1, 3, '2013-03-28 18:04:13'),
(1, 3, '2013-03-29 00:00:02'),
(1, 4, '2013-03-29 08:10:32'),
(1, 3, '2013-03-29 10:10:32'),
(1, 4, '2013-03-29 10:40:32'),
(1, 3, '2013-03-29 11:51:03'),
(1, 4, '2013-03-29 12:54:10'),
(1, 3, '2013-03-29 17:47:47')

Declare @RangeFrom datetime,
@RangeTo datetime


SELECT @RangeFrom = '2013-03-29 08:00:00.000',
@RangeTo = '2013-03-29 23:30:00.000';

While @RangeFrom < @Rangeto
BEGIN
INSERT INto #Range Values (@RangeFrom,DateAdd(mi,30,@RangeFrom))
SELECT @RangeFrom = DateAdd(mi,30,@RangeFrom)
END;
-- select * from #test
-- select * from #Range

With src as (
SELECT RangeFrom,RangeTo,a.TimeStamp as TFrom,
ISNULL((Select min(b.TimeStamp)
from #test b
where a.UserID = b.UserID and
a.TimeStamp <= b.TimeStamp and
b.State = 3),RangeTo) as TTo
FROM #TEST a CROSS JOIN #Range
WHERE State = 4 and a.TimeStamp < RangeTo
)

SELECT RangeFrom,RangeTo,SUM(Datediff(s,TFrom2,TTo2))/3600.0
FROM (
Select RangeFrom,RangeTo,
case when TFrom < RangeFrom then RangeFrom else TFrom end TFrom2,
case when TTo > RangeTo then RangeTo else TTo end TTo2
from src
) as z
WHERE TFrom2 < TTo2
Group By RangeFrom,RangeTo;
Go to Top of Page

QuentinL
Starting Member

11 Posts

Posted - 2013-04-06 : 20:21:20
Great!!

I made a little mistake in the last message about the expected results.

The correct ones are:
2013-03-29 08:00:00.000 2013-03-29 08:30:00.000 0.648888
2013-03-29 08:30:00.000 2013-03-29 09:00:00.000 1.000000
2013-03-29 09:00:00.000 2013-03-29 09:30:00.000 1.000000
2013-03-29 09:30:00.000 2013-03-29 10:00:00.000 1.000000
2013-03-29 10:00:00.000 2013-03-29 10:30:00.000 0.351111
2013-03-29 10:30:00.000 2013-03-29 11:00:00.000 0.648888

The result is not doubled as they need to be divided by 1800.0 instead of 3600.0

Anyway your query is just great, I wouldn't have thought of making a JOIN on a table containing the half hour ranges.

I we can go further, I'm trying to optimize the query.

I think that replacing the temp table by the following should help:

Declare @RangeFrom datetime,
@RangeTo datetime
SELECT @RangeFrom = '2013-03-25 08:00:00.000',
@RangeTo = '2013-03-29 23:30:00.000';
;With RangeByHalfHours As
(
Select @RangeFrom As RangeFrom,Dateadd(mi,30,@RangeFrom) As RangeTo
Union All
Select DateAdd(mi,30,RangeFrom),DateAdd(mi,30,RangeTo)
From RangeByHalfHours
Where RangeTo < @RangeTo
)
select * from RangeByHalfHours
Option ( Maxrecursion 0);

Is it a good idea? If yes let me explain some limitations I experience

The following works because of Maxrecursion 0, if not present there shouldn't be more than 100 half hour ranges...

In the full query below I couldn't find where to put the Maxrecursion option.

;With RangeByHalfHours As
(
Select @FromDate As RangeFrom,Dateadd(mi,30,@FromDate) As RangeTo
Union All
Select DateAdd(mi,30,RangeFrom),DateAdd(mi,30,RangeTo)
From RangeByHalfHours
Where RangeFrom < @ToDate
)
,src as (
SELECT RangeFrom,RangeTo,a.TimeStamp as TFrom,
ISNULL((Select min(b.TimeStamp)
from Test b
where a.UserID = b.UserID and
a.TimeStamp <= b.TimeStamp and
b.State = 3),RangeTo) as TTo
FROM Test a CROSS JOIN RangeByHalfHours
WHERE State = 4 and a.TimeStamp < RangeTo
)

SELECT RangeFrom,RangeTo,SUM(Datediff(s,TFrom2,TTo2))/1800.0 as Coeff
FROM (
Select RangeFrom,RangeTo,
case when TFrom < RangeFrom then RangeFrom else TFrom end TFrom2,
case when TTo > RangeTo then RangeTo else TTo end TTo2
from src
) as z
WHERE TFrom2 < TTo2
Group By RangeFrom,RangeTo;

Have you got an idea?

Thank you very much!

Quentin
Go to Top of Page

UnemployedInOz
Yak Posting Veteran

54 Posts

Posted - 2013-04-07 : 02:28:20
You could just replace
While @RangeFrom < @Rangeto
BEGIN
INSERT INto #Range Values (@RangeFrom,DateAdd(mi,30,@RangeFrom))
SELECT @RangeFrom = DateAdd(mi,30,@RangeFrom)
END;

with

With RangeByHalfHours As
(
Select @RangeFrom As RangeFrom,Dateadd(mi,30,@RangeFrom) As RangeTo
Union All
Select DateAdd(mi,30,RangeFrom),DateAdd(mi,30,RangeTo)
From RangeByHalfHours
Where RangeTo < @RangeTo
)
select * from RangeByHalfHours
Option ( Maxrecursion 0);


I wouldnt worry about performance as I dont think trying to do it your way will add any significant improvement
Go to Top of Page

QuentinL
Starting Member

11 Posts

Posted - 2013-04-07 : 04:03:20
The problem in that in the full query we don't
select * from RangeByHalfHours
as we CROSS join the table RangeByHalfHours
so I can't find where to put the Maxrecursion option in the full query

Nice to see you're here during week end
Thank you
Go to Top of Page

UnemployedInOz
Yak Posting Veteran

54 Posts

Posted - 2013-04-07 : 05:56:27
Create Table #TEST
(UserID INT,State INT,TimeStamp DATETIME)

Create Table #Range
(RangeFrom datetime,
RangeTo datetime)

-- truncate table #Test
-- truncate table #Range
iNSERT INTO #TEST VALUES
(1, 3, '2013-03-27 00:00:01'),
(1, 4, '2013-03-27 08:33:22'),
(1, 3, '2013-03-27 12:11:26'),
(1, 4, '2013-03-27 14:07:16'),
(1, 3, '2013-03-27 18:00:45'),
(1, 3, '2013-03-28 00:00:02'),
(1, 4, '2013-03-28 08:04:55'),
(1, 3, '2013-03-28 12:03:41'),
(1, 4, '2013-03-28 13:23:05'),
(1, 3, '2013-03-28 18:04:13'),
(1, 3, '2013-03-29 00:00:02'),
(1, 4, '2013-03-29 08:10:32'),
(1, 3, '2013-03-29 10:10:32'),
(1, 4, '2013-03-29 10:40:32'),
(1, 3, '2013-03-29 11:51:03'),
(1, 4, '2013-03-29 12:54:10'),
(1, 3, '2013-03-29 17:47:47')

Declare @RangeFrom datetime,
@RangeTo datetime


SELECT @RangeFrom = '2013-03-29 08:00:00.000',
@RangeTo = '2013-03-29 23:30:00.000';

With RangeByHalfHours As
(
Select @RangeFrom As RangeFrom,Dateadd(mi,30,@RangeFrom) As RangeTo
Union All
Select DateAdd(mi,30,RangeFrom),DateAdd(mi,30,RangeTo)
From RangeByHalfHours
Where RangeTo < @RangeTo
)
Insert Into #Range
select * from RangeByHalfHours
Option ( Maxrecursion 0);


-- select * from #test
-- select * from #Range

With src as (
SELECT RangeFrom,RangeTo,a.TimeStamp as TFrom,
ISNULL((Select min(b.TimeStamp)
from #test b
where a.UserID = b.UserID and
a.TimeStamp <= b.TimeStamp and
b.State = 3),RangeTo) as TTo
FROM #TEST a CROSS JOIN #Range
WHERE State = 4 and a.TimeStamp < RangeTo
)

SELECT RangeFrom,RangeTo,SUM(Datediff(s,TFrom2,TTo2))/3600.0
FROM (
Select RangeFrom,RangeTo,
case when TFrom < RangeFrom then RangeFrom else TFrom end TFrom2,
case when TTo > RangeTo then RangeTo else TTo end TTo2
from src
) as z
WHERE TFrom2 < TTo2
Group By RangeFrom,RangeTo;
Go to Top of Page

QuentinL
Starting Member

11 Posts

Posted - 2013-04-07 : 08:29:22
I finally succeed placing the maxrecursion option.

From what I've read :
-the] OPTION clause can be used only at the statement level
-You can not change the default value of that option inside a udf. You will have to do it in the statement referencing the udf.

So I added the option at the end of the query after the GROUP BY statement,which I'm sure I've tried yesterday but I guess I had forgot to remove the semicolon of the GROUP BY, and it works like a charm.

Declare @RangeFrom datetime,
@RangeTo datetime
SELECT @RangeFrom = '20130327 08:00:00',
@RangeTo = '20130330 12:00:00';

;WITH
RangeByHalfHours AS
(
Select @RangeFrom As RangeFrom,Dateadd(mi,30,@RangeFrom) As RangeTo
Union All
Select DateAdd(mi,30,RangeFrom),DateAdd(mi,30,RangeTo)
From RangeByHalfHours
Where RangeFrom < @RangeTo
)
,src AS
(
SELECT RangeFrom,RangeTo,a.TimeStamp as TFrom,
ISNULL((Select min(b.TimeStamp)
from Test b
where a.UserID = b.UserID and
a.TimeStamp <= b.TimeStamp and
b.State = 3),RangeTo) as TTo
FROM Test a CROSS JOIN RangeByHalfHours
WHERE State = 4 and a.TimeStamp < RangeTo
)

SELECT RangeFrom,RangeTo,SUM(Datediff(s,TFrom2,TTo2))/1800.0 as Coeff
FROM (
Select RangeFrom,RangeTo,
case when TFrom < RangeFrom then RangeFrom else TFrom end TFrom2,
case when TTo > RangeTo then RangeTo else TTo end TTo2
from src
) as z
WHERE TFrom2 < TTo2
Group By RangeFrom,RangeTo
Option ( Maxrecursion 0 )

As expected, if I remove the Option the query fails with the error seen previously:
The statement terminated. The maximum recursion 100 has been exhausted before statement completion.

I think it's all for the moment thanks a lot for your help !!!

Quentin
Go to Top of Page

QuentinL
Starting Member

11 Posts

Posted - 2013-04-08 : 20:18:50
One more question please, I have not been able to include rows with value 0, how would you do this?

Thank you
Go to Top of Page

QuentinL
Starting Member

11 Posts

Posted - 2013-04-09 : 20:24:34
There it'is

;With RangeByHalfHours As
(
Select @FromDate As RangeFrom,Dateadd(mi,30,@FromDate) As RangeTo
Union All
Select DateAdd(mi,30,RangeFrom),DateAdd(mi,30,RangeTo)
From RangeByHalfHours
Where cast(RangeTo as Date) <= @ToDate
)
,src as (
SELECT RangeFrom,RangeTo,a.TimeStamp as TFrom,
ISNULL((Select min(b.TimeStamp)
from Test b
where a.UserID = b.UserID and
a.TimeStamp <= b.TimeStamp and
b.State = 3),RangeTo) as TTo
FROM Test a CROSS JOIN RangeByHalfHours
WHERE State = 4 and a.TimeStamp < RangeTo
)
,smooth as (
Select RangeFrom,RangeTo,
case when TFrom < RangeFrom then RangeFrom else TFrom end TFrom2,
case when TTo > RangeTo then RangeTo else TTo end TTo2
from src
)
,rslt as (
SELECT smooth.RangeFrom,smooth.RangeTo,SUM(Datediff(s,TFrom2,TTo2))/1800.0 as Coeff
FROM smooth
WHERE TFrom2 < TTo2
Group By smooth.RangeFrom,smooth.RangeTo
)

SELECT * from rslt
UNION
SELECT RangeFrom, RangeTo, 0 as Coeff
FROM
RangeByHalfHours
WHERE
RangeFrom NOT IN (select RangeFrom from rslt)
Option ( Maxrecursion 0 )
Go to Top of Page

UnemployedInOz
Yak Posting Veteran

54 Posts

Posted - 2013-04-09 : 21:11:29
You may want to throw in an Order By RangeFrom
at the end...
Go to Top of Page

QuentinL
Starting Member

11 Posts

Posted - 2013-04-10 : 04:09:46
Thank you, I missed the order by clause!

Surprisingly the results are sorted by default.I don't know why... That probably lead me to miss the order by.

Furthermore, even if not sorted I wouldn't really need the Order by as the results are used in a chart, the data will be sorted by the axis.
But I would have add the order by for a better readability when using the query alone.

Thank you for spotting this
Go to Top of Page

lilyyang
Starting Member

3 Posts

Posted - 2013-07-08 : 03:43:08
unspammed
Go to Top of Page
   

- Advertisement -