I have a simple query for counting the distinct amount of users playing a game, however since the table has grown to 45 million rows its taking about 30 seconds. Is there any way to make this more efficient? The hashed match is taking about 37% of the execution plan and the rest is hitting the index. Due to the way this table works there can be many rows with the same username in the playername column.
SELECT COUNT(DISTINCT PlayerName) AS UniqueCount FROM [Players].[dbo].[PlayerStateChangeEvents]
Do you have an index on PlayerName? You might look at maintaining the count when the table is updated - especially if the count is small compared to the number of rows in the table. You could also make the index smaller by holding a player id rather than a name on this table - should also make the table smaller too.
========================================== Cursors are useful if you don't know sql. SSIS can be used in a similar way. Beer is not cold and it isn't fizzy.
No, only on ID is there a clustered index. Do you think adding a nonclustered index on PlayerName will help?
I'd love to be able to ID my players with a numerical value but thats not the way the data is inserted and cant really be changed. The Playername is data sent to me via way of XBox live/PSN etc.
>> Do you think adding a nonclustered index on PlayerName will help? Yes it would help this query. Might slow down other things though. You will probably find you are very limited if you only have a clustered index on an identity? Everything that doesn't use that column to filter will probably table scan.
========================================== Cursors are useful if you don't know sql. SSIS can be used in a similar way. Beer is not cold and it isn't fizzy.