Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Help! Improve extremely slow DateDiff query

Author  Topic 

Indigo121
Starting Member

4 Posts

Posted - 2015-03-26 : 06:47:39
Hi
I have a nested query which runs very very slowly over large tables. Please help me refine it (if that is possible).
The query:


SELECT count(*) FROM dbo.users
WHERE User_dateCreated >= @start AND User_dateCreated <= @end
AND user_id NOT IN
(SELECT user_id from .[dbo].[UserReports]
WHERE user_id = [dbo].[UserReports].user_id
AND DATEDIFF(m, User_dateCreated, UserReport_Date) >= 1 )


General explanation: the query counts users (whose table contains a few thousand rows) created in a certain time span, and searches for those users who don't have rows under their IDs in the UserReports table (which is very large- 2 million rows) in a certain time span.

I found that if I only check reports compared to a FIXED start date (@start) and not compared to their specific creation date (User_dateCreated) in the "upper" table, the query runs MUCH faster. In other words with this line at the end-

AND DATEDIFF(m, @start, UserReport_Date) >= 1 )


Unfortunately I do need to compare to each's specific creation date. Please let me know how I can solve this- rewrite query, add indexes or whatever to tables, etc.

THANKS!!!

stepson
Aged Yak Warrior

545 Posts

Posted - 2015-03-26 : 09:12:48
1- What kind on indexes do you have ? do you have index on User_dateCreated , or on UserReport_Date?

2- change
 AND DATEDIFF(m, @start, UserReport_Date) >= 1 ) 
to a more sargable search.

AND UserReport_Date>=DATEADD(m,1,User_dateCreated)

3- try to change NOT IN in something like NOT EXISTS

NOT EXISTS
(SELECT * from .[dbo].[UserReports]
WHERE user_id = [dbo].[UserReports].user_id
AND UserReport_Date>=DATEADD(m,1,User_dateCreated))



sabinWeb MCP
Go to Top of Page

stepson
Aged Yak Warrior

545 Posts

Posted - 2015-03-26 : 09:14:53
Also ,can you post some sample data to play with it ? We can generate some of it, but is better to have data clause to reals one.


sabinWeb MCP
Go to Top of Page

Indigo121
Starting Member

4 Posts

Posted - 2015-03-26 : 09:47:18
sabinWeb

1) I do not have indexes on User_dateCreated , or on UserReport_Date. only a clustered index on the primary keys of both tables (on user_id and userReport_id) and and some other foreign keys in Users.

2) This was my original syntax which was not more efficient unfortunately.

3) I'm trying this but it doesn't seem to show a major difference.

so you think an index could help?
Can I add an index User_dateCreated , or on UserReport_Date without them being unique?
Thanks a lot!
Go to Top of Page

Ifor
Aged Yak Warrior

700 Posts

Posted - 2015-03-26 : 09:56:58
If I guess User_dateCreated is in dbo.users, but not dbo.UserReports, try:

SELECT COUNT(*)
FROM dbo.users U
WHERE U.User_dateCreated >= @start
AND U.User_dateCreated <= @end
AND NOT EXISTS
(
SELECT 1
FROM dbo.UserReports R
WHERE R.[user_id] = S.[user_id]
AND R.UserReport_Date >= DATEADD(m, 1, U.User_dateCreated)
);

You should always alias columns so you know what table they come from.
When posting on forums, always provide consumable test data.
Go to Top of Page

stepson
Aged Yak Warrior

545 Posts

Posted - 2015-03-26 : 11:46:43
Yes, even indexes are not unique, you can create on User_dateCreated and test it ; then create (if still it is slow ) on user_id | UserReport_Date.

CREATE NONCLUSTERED INDEX IX_USERS_NC_User_dateCreated ON dbo.USERS(User_dateCreated)


then run the script and see if it bring some improvement.
also you can add :

CREATE NONCLUSTERED INDEX IX_UserReports_NC_User_id_UserReport_Date ON dbo.UserReports(User_id,UserReport_Date)






sabinWeb MCP
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2015-03-26 : 13:19:37
quote:
Originally posted by stepson

2- change
 AND DATEDIFF(m, @start, UserReport_Date) >= 1 ) 
to a more sargable search.

AND UserReport_Date>=DATEADD(m,1,User_dateCreated)


They are not the same queries.

SELECT DATEDIFF(MONTH, 'January 31 2015', 'February 1 2015') is equal to 1


Microsoft SQL Server MVP, MCT, MCSE, MCSA, MCP, MCITP, MCTS, MCDBA
Go to Top of Page

ScottPletcher
Aged Yak Warrior

550 Posts

Posted - 2015-03-26 : 13:37:04
If you most often process the UserReports table based on user_id, then change the UserReports table to be uniquely clustered on (user_id, userReport_id ). You'll have to drop the existing index, first, since a table can have only one clustered index.

After the new clustered index is created, add a separate unique, nonclustered index/PK on userReport_id alone.

That will also automatically fix other performance issues whenever these tables are joined.
Go to Top of Page

stepson
Aged Yak Warrior

545 Posts

Posted - 2015-03-27 : 02:16:39
Thanks, you are right.

quote:
Originally posted by SwePeso

quote:
Originally posted by stepson

2- change
 AND DATEDIFF(m, @start, UserReport_Date) >= 1 ) 
to a more sargable search.

AND UserReport_Date>=DATEADD(m,1,User_dateCreated)


They are not the same queries.

SELECT DATEDIFF(MONTH, 'January 31 2015', 'February 1 2015') is equal to 1


Microsoft SQL Server MVP, MCT, MCSE, MCSA, MCP, MCITP, MCTS, MCDBA




sabinWeb MCP
Go to Top of Page

stepson
Aged Yak Warrior

545 Posts

Posted - 2015-03-27 : 02:57:52
this
AND UserReport_Date>=DATEADD(m,1,User_dateCreated)
wasn't much accurate

so ...
AND UserReport_Date >= DATEADD(Month,DATEDIFF(Month,0,User_dateCreated) + 1 ,0)



sabinWeb MCP
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2015-03-27 : 06:24:24
I'm curious whether

AND UserReport_Date >= DATEADD(Month,DATEDIFF(Month,0,User_dateCreated) + 1 ,0)

is more sargable than:

AND R.UserReport_Date >= DATEADD(m, 1, U.User_dateCreated)

(I have no idea how they compare )

Might another option be?? to have a computed field for the difference, or a persistent view ?
Go to Top of Page

ScottPletcher
Aged Yak Warrior

550 Posts

Posted - 2015-03-27 : 15:51:08
You can never make that type of date comparison sargable.

But if the tables are properly clustered, you won't really need to, you'll still get good overall performance.
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2015-03-27 : 16:16:05
Good point. I don't tend to program these sort of solutions to that sort of problem! so I don't have experience to naturally see the right answer to it either!
Go to Top of Page
   

- Advertisement -