| 
                
Please start any new threads on our new 
    site at https://forums.sqlteam.com.  We've got lots of great SQL Server
    experts to answer whatever question you can come up with. 
    
        | 
                
                    | 
                            
                                | Author | Topic |  
                                    | Indigo121Starting Member
 
 
                                        4 Posts | 
                                            
                                            |  Posted - 2015-03-26 : 06:47:39 
 |  
                                            | HiI have a nested query which runs very very slowly over large tables. Please help me refine it (if that is possible). The query:SELECT count(*) FROM dbo.users WHERE User_dateCreated  >= @start AND User_dateCreated <= @end  AND user_id NOT IN	(SELECT user_id from .[dbo].[UserReports]	 WHERE user_id = [dbo].[UserReports].user_id	 AND DATEDIFF(m, User_dateCreated, UserReport_Date) >= 1 )General explanation: the query counts users (whose table contains a few thousand rows) created in a certain time span, and searches for those users who don't have rows under their IDs in the UserReports table (which is very large- 2 million rows) in a certain time span.I found that if I only check reports compared to a FIXED start date (@start) and not compared to their specific creation date (User_dateCreated) in the "upper" table, the query runs MUCH faster. In other words with this line at the end-AND DATEDIFF(m, @start, UserReport_Date) >= 1 )Unfortunately I do need to compare to each's specific creation date. Please let me know how I can solve this- rewrite query, add indexes or whatever to tables, etc.THANKS!!! |  |  
                                    | stepsonAged Yak Warrior
 
 
                                    545 Posts | 
                                        
                                          |  Posted - 2015-03-26 : 09:12:48 
 |  
                                          | 1- What kind on indexes do you have ? do you have index on  User_dateCreated , or on  UserReport_Date?2- change  AND DATEDIFF(m, @start, UserReport_Date) >= 1 ) to a more sargable search. AND UserReport_Date>=DATEADD(m,1,User_dateCreated)3- try to change NOT IN in something like NOT EXISTS NOT EXISTS(SELECT * from .[dbo].[UserReports]WHERE user_id = [dbo].[UserReports].user_idAND UserReport_Date>=DATEADD(m,1,User_dateCreated))sabinWeb MCP |  
                                          |  |  |  
                                    | stepsonAged Yak Warrior
 
 
                                    545 Posts | 
                                        
                                          |  Posted - 2015-03-26 : 09:14:53 
 |  
                                          | Also ,can you post some sample data to play with it ? We can generate some of it, but is better to have data clause to reals one.sabinWeb MCP |  
                                          |  |  |  
                                    | Indigo121Starting Member
 
 
                                    4 Posts | 
                                        
                                          |  Posted - 2015-03-26 : 09:47:18 
 |  
                                          | sabinWeb 1) I do not have indexes on User_dateCreated , or on UserReport_Date. only a clustered index on the primary keys of both tables (on user_id and userReport_id) and and some other foreign keys in Users. 2) This was my original syntax which was not more efficient unfortunately.3) I'm trying this but it doesn't seem to show a major difference.so you think an index could help? Can I add an index User_dateCreated , or on UserReport_Date without them being unique? Thanks a lot! |  
                                          |  |  |  
                                    | IforAged Yak Warrior
 
 
                                    700 Posts | 
                                        
                                          |  Posted - 2015-03-26 : 09:56:58 
 |  
                                          | If I guess User_dateCreated is in dbo.users, but not dbo.UserReports, try: SELECT COUNT(*)FROM dbo.users UWHERE U.User_dateCreated >= @start	AND U.User_dateCreated <= @end	AND NOT EXISTS	(		SELECT 1		FROM dbo.UserReports R		WHERE R.[user_id] = S.[user_id]			AND R.UserReport_Date >= DATEADD(m, 1, U.User_dateCreated)	);You should always alias columns so you know what table they come from.When posting on forums, always provide consumable test data. |  
                                          |  |  |  
                                    | stepsonAged Yak Warrior
 
 
                                    545 Posts | 
                                        
                                          |  Posted - 2015-03-26 : 11:46:43 
 |  
                                          | Yes, even indexes are not unique, you can create on User_dateCreated and test it ; then create (if still it is slow ) on user_id | UserReport_Date. CREATE NONCLUSTERED INDEX IX_USERS_NC_User_dateCreated ON dbo.USERS(User_dateCreated)then run the script and see if it bring some improvement.also you can add : CREATE NONCLUSTERED INDEX IX_UserReports_NC_User_id_UserReport_Date ON dbo.UserReports(User_id,UserReport_Date)sabinWeb MCP |  
                                          |  |  |  
                                    | SwePesoPatron Saint of Lost Yaks
 
 
                                    30421 Posts | 
                                        
                                          |  Posted - 2015-03-26 : 13:19:37 
 |  
                                          | quote:They are not the same queries.SELECT DATEDIFF(MONTH, 'January 31 2015', 'February 1 2015') is equal to 1Originally posted by stepson
 2- change
  AND DATEDIFF(m, @start, UserReport_Date) >= 1 ) to a more sargable search. AND UserReport_Date>=DATEADD(m,1,User_dateCreated) 
 Microsoft SQL Server MVP, MCT, MCSE, MCSA, MCP, MCITP, MCTS, MCDBA
 |  
                                          |  |  |  
                                    | ScottPletcherAged Yak Warrior
 
 
                                    550 Posts | 
                                        
                                          |  Posted - 2015-03-26 : 13:37:04 
 |  
                                          | If you most often process the UserReports table based on user_id, then change the UserReports table to be uniquely clustered on (user_id, userReport_id ).  You'll have to drop the existing index, first, since a table can have only one clustered index.After the new clustered index is created, add a separate unique, nonclustered index/PK on userReport_id alone.That will also automatically fix other performance issues whenever these tables are joined. |  
                                          |  |  |  
                                    | stepsonAged Yak Warrior
 
 
                                    545 Posts | 
                                        
                                          |  Posted - 2015-03-27 : 02:16:39 
 |  
                                          | Thanks, you are right. quote:sabinWeb MCPOriginally posted by SwePeso
 
 quote:They are not the same queries.SELECT DATEDIFF(MONTH, 'January 31 2015', 'February 1 2015') is equal to 1Originally posted by stepson
 2- change
  AND DATEDIFF(m, @start, UserReport_Date) >= 1 ) to a more sargable search. AND UserReport_Date>=DATEADD(m,1,User_dateCreated) 
 Microsoft SQL Server MVP, MCT, MCSE, MCSA, MCP, MCITP, MCTS, MCDBA
 
 |  
                                          |  |  |  
                                    | stepsonAged Yak Warrior
 
 
                                    545 Posts | 
                                        
                                          |  Posted - 2015-03-27 : 02:57:52 
 |  
                                          | this wasn't much accurateso ...AND UserReport_Date>=DATEADD(m,1,User_dateCreated)
 AND UserReport_Date >= DATEADD(Month,DATEDIFF(Month,0,User_dateCreated) + 1 ,0)sabinWeb MCP |  
                                          |  |  |  
                                    | KristenTest
 
 
                                    22859 Posts | 
                                        
                                          |  Posted - 2015-03-27 : 06:24:24 
 |  
                                          | I'm curious whether AND UserReport_Date >= DATEADD(Month,DATEDIFF(Month,0,User_dateCreated) + 1 ,0)is more sargable than: AND R.UserReport_Date >= DATEADD(m, 1, U.User_dateCreated)(I have no idea how they compare  )Might another option be?? to have a computed field for the difference, or a persistent view ? |  
                                          |  |  |  
                                    | ScottPletcherAged Yak Warrior
 
 
                                    550 Posts | 
                                        
                                          |  Posted - 2015-03-27 : 15:51:08 
 |  
                                          | You can never make that type of date comparison sargable.But if the tables are properly clustered, you won't really need to, you'll still get good overall performance. |  
                                          |  |  |  
                                    | KristenTest
 
 
                                    22859 Posts | 
                                        
                                          |  Posted - 2015-03-27 : 16:16:05 
 |  
                                          | Good point.  I don't tend to program these sort of solutions to that sort of problem! so I don't have experience to naturally see the right answer to it either! |  
                                          |  |  |  
                                |  |  |  |  |  |