Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2012 Forums
 SQL Server Administration (2012)
 What's the size for average statistics object

Author  Topic 

sql-lover
Yak Posting Veteran

99 Posts

Posted - 2014-06-19 : 15:27:51
I am trying to delete very old statistics objects from our environment. I am using following query:


SELECT OBJECT_NAME(id) ObjectName,
name,
STATS_DATE(id, indid) AS DateTaken,
rowmodctr
FROM sys.sysindexes
WHERE STATS_DATE(id, indid)<=DATEADD(DAY,-1,GETDATE())
ORDER BY DateTaken;


I noticed very old statistics objects on most of my databases. Some of them automatically created by MS-SQL (those started by _WA_Sys).

Should I really be worried or delete those with rowmodctr equals to zero and very old dates? In terms of space, what I could gain?

I do not know exactly what the average size would be for statistics object. Maybe a few bytes?

I guess that in terms of performance, if the rowmodctr is zero and last time it was refreshed was three years ago, I'm getting nothing from it?

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-06-19 : 15:45:17
Yeah this isn't worth your time. Statistics are small. What you should be looking at instead is unused indexes, but you have to proceed with caution.

Here is a script that will show you index usage. What you are looking for here is reads=0 and high number of writes. I've dropped indexes that had low reads but not reads=0 because I knew there was another index that would takes its place for those few reads. The caution part of this is you need to run this script once an entire workload has been completed. This DMV data gets wiped out when SQL is restarted. So you definitely would not base your decisions when SQL was recently restarted. If you restarted SQL a month ago, but there's some quarterly report that hasn't yet run and might need the index you are considering to drop, you may end up in a world of hurt. This happened to me. There was a process that only ran monthly, and I ran the script 3 weeks after the restart as I thought we had completed an entire workload. Then that process ran. CPU was 90-100% and causing widespread issues. I was paged on a weekend to figure out what was going on. I saw the long-running query was doing a table scan and then remembered I had dropped an index on that table. I knew what the issue was. I killed the process, added the index back and then restarted that process. Everything was okay. This was an extremely critical database with extreme performance requirements (most queries were required to finish within 300 milliseconds). I learned my lesson.

Proceed with caution!


SELECT
o.name
, indexname=i.name
, i.index_id
, reads=user_seeks + user_scans + user_lookups
, writes = user_updates
, rows = (SELECT SUM(p.rows) FROM sys.partitions p WHERE p.index_id = s.index_id AND s.object_id = p.object_id)
, CASE
WHEN s.user_updates < 1 THEN 100
ELSE 1.00 * (s.user_seeks + s.user_scans + s.user_lookups) / s.user_updates

END AS reads_per_write

, 'DROP INDEX ' + QUOTENAME(i.name)
+ ' ON ' + QUOTENAME(c.name) + '.' + QUOTENAME(OBJECT_NAME(s.object_id)) as 'drop statement'
FROM sys.dm_db_index_usage_stats s
INNER JOIN sys.indexes i ON i.index_id = s.index_id AND s.object_id = i.object_id
INNER JOIN sys.objects o on s.object_id = o.object_id
INNER JOIN sys.schemas c on o.schema_id = c.schema_id
WHERE OBJECTPROPERTY(s.object_id,'IsUserTable') = 1
AND s.database_id = DB_ID()
AND i.type_desc = 'nonclustered'
AND i.is_primary_key = 0
AND i.is_unique_constraint = 0
AND (SELECT SUM(p.rows) FROM sys.partitions p WHERE p.index_id = s.index_id AND s.object_id = p.object_id) > 10000
ORDER BY reads



Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

sql-lover
Yak Posting Veteran

99 Posts

Posted - 2014-06-19 : 16:09:49
quote:
Originally posted by tkizer

Yeah this isn't worth your time. Statistics are small. What you should be looking at instead is unused indexes, but you have to proceed with caution.

Here is a script that will show you index usage. What you are looking for here is reads=0 and high number of writes. I've dropped indexes that had low reads but not reads=0 because I knew there was another index that would takes its place for those few reads. The caution part of this is you need to run this script once an entire workload has been completed. This DMV data gets wiped out when SQL is restarted. So you definitely would not base your decisions when SQL was recently restarted. If you restarted SQL a month ago, but there's some quarterly report that hasn't yet run and might need the index you are considering to drop, you may end up in a world of hurt. This happened to me. There was a process that only ran monthly, and I ran the script 3 weeks after the restart as I thought we had completed an entire workload. Then that process ran. CPU was 90-100% and causing widespread issues. I was paged on a weekend to figure out what was going on. I saw the long-running query was doing a table scan and then remembered I had dropped an index on that table. I knew what the issue was. I killed the process, added the index back and then restarted that process. Everything was okay. This was an extremely critical database with extreme performance requirements (most queries were required to finish within 300 milliseconds). I learned my lesson.

Proceed with caution!


SELECT
o.name
, indexname=i.name
, i.index_id
, reads=user_seeks + user_scans + user_lookups
, writes = user_updates
, rows = (SELECT SUM(p.rows) FROM sys.partitions p WHERE p.index_id = s.index_id AND s.object_id = p.object_id)
, CASE
WHEN s.user_updates < 1 THEN 100
ELSE 1.00 * (s.user_seeks + s.user_scans + s.user_lookups) / s.user_updates

END AS reads_per_write

, 'DROP INDEX ' + QUOTENAME(i.name)
+ ' ON ' + QUOTENAME(c.name) + '.' + QUOTENAME(OBJECT_NAME(s.object_id)) as 'drop statement'
FROM sys.dm_db_index_usage_stats s
INNER JOIN sys.indexes i ON i.index_id = s.index_id AND s.object_id = i.object_id
INNER JOIN sys.objects o on s.object_id = o.object_id
INNER JOIN sys.schemas c on o.schema_id = c.schema_id
WHERE OBJECTPROPERTY(s.object_id,'IsUserTable') = 1
AND s.database_id = DB_ID()
AND i.type_desc = 'nonclustered'
AND i.is_primary_key = 0
AND i.is_unique_constraint = 0
AND (SELECT SUM(p.rows) FROM sys.partitions p WHERE p.index_id = s.index_id AND s.object_id = p.object_id) > 10000
ORDER BY reads



Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/



Thank you!

Exactly what I thought. This was asked by management team and I tried to explain it.

What I am suggesting, is dropping unused or duplicate Indexes (after carefully review) and improve algorithm or job that refreshes statistics. Currently, statistics are being refreshed weekly only, causing lot of performance issues.

Out of curiosity. What's the average size of each statics object, any idea? I think it should be ridiculously small, like most MS-SQL metadata, but I don't have the exact number.
Go to Top of Page

sql-lover
Yak Posting Veteran

99 Posts

Posted - 2014-06-19 : 16:29:49
Also, a follow up to this statement:


What you are looking for here is reads=0 and high number of writes


Don't I have to be careful with dropping Indexes with a high number of writes? I mean ... with very intensive UPDATE/DELETE batches, won't the DMV show a high number of writes for that type of queries?

Dropping an Index that shows high number of writes could kill the performance of such queries. Let's say ...


UPDATE MyTable
SET column1=value1
WHERE column2=value2;


And let's say I have an Index on column1.
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-06-19 : 16:36:51
The writes column is regarding keeping the index up to date due to DML (INSERT/UPDATE/DELETE). It is not in regards to reading the data pages due to a WHERE clause.

An index on column1 for your provided query would not be used. An index on column2 would. column2's index would be read, and column1's index would have to be maintained (writes) for your example. So column1's index is useless here and causing overhead.

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

sql-lover
Yak Posting Veteran

99 Posts

Posted - 2014-06-19 : 16:45:57
quote:
Originally posted by tkizer

The writes column is regarding keeping the index up to date due to DML (INSERT/UPDATE/DELETE). It is not in regards to reading the data pages due to a WHERE clause.

An index on column1 for your provided query would not be used. An index on column2 would. column2's index would be read, and column1's index would have to be maintained (writes) for your example. So column1's index is useless here and causing overhead.

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/



So an Index with zero reads (after SQL being running for enough time, of course) and high writes, can be safely dropped?
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-06-19 : 17:00:23
There is still risk as how do you know if an entire business cycle/workload has been run since SQL has been restarted. You could have a yearly process. At my last company, we rebooted the database servers almost every month as they had a policy of being up-to-date for the monthly Microsoft security patches. That doesn't give you enough info if you have processes or any workload that could run outside of those 30 days. What I did was waited 3-4 months, saving the output of this query each month and comparing. If the same index kept showing reads=0, then I made plans to have it dropped. I made sure that a load test was conducted in our performance lab.

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

sql-lover
Yak Posting Veteran

99 Posts

Posted - 2014-06-20 : 09:27:12
quote:
Originally posted by tkizer

There is still risk as how do you know if an entire business cycle/workload has been run since SQL has been restarted. You could have a yearly process. At my last company, we rebooted the database servers almost every month as they had a policy of being up-to-date for the monthly Microsoft security patches. That doesn't give you enough info if you have processes or any workload that could run outside of those 30 days. What I did was waited 3-4 months, saving the output of this query each month and comparing. If the same index kept showing reads=0, then I made plans to have it dropped. I made sure that a load test was conducted in our performance lab.

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/



Great advice.

Thank you!
Go to Top of Page

sql-lover
Yak Posting Veteran

99 Posts

Posted - 2014-06-20 : 11:20:56
Tara,

I was comparing at work the DMV/query you gave me against this one:


SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT
o.name as [Object Name],
s.index_id as [Index ID],
i.name as [Index Name],
i.type_desc as [Index Type],
s.user_seeks + s.user_scans + s.user_lookups as [Total Queries Which Read],
s.user_updates [Total Queries Which Wrote] ,
ps.row_count as [Row Count],
CASE WHEN s.user_updates < 1 THEN 100
ELSE ( s.user_seeks + s.user_scans + s.user_lookups ) / s.user_updates * 1.0
END AS [Reads Per Write]
FROM sys.dm_db_index_usage_stats s
JOIN sys.dm_db_partition_stats ps on s.object_id=ps.object_id
and s.index_id=ps.index_id
JOIN sys.indexes i ON i.index_id = s.index_id
AND s.object_id = i.object_id
JOIN sys.objects o ON s.object_id = o.object_id

JOIN sys.schemas c ON o.schema_id = c.schema_id
WHERE o.name NOT LIKE 'sys%'
--AND i.name NOT LIKE 'PK_%'
AND i.type_desc ='NONCLUSTERED'
ORDER BY [Total Queries Which Read] ASC, [Row Count] DESC;


And I can't understand why I am getting several reads values for an specific Index. Yours provides a single read value for a single Index.
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-06-20 : 12:43:43
When I run that query, I don't see any "dupes". Not sure why you are seeing dupes, but I would think it's due to object names being the same but with different schema names. Or maybe you've got partitioned tables, which the query I provided handles that condition.

But why use a different query? Just use the one I provided. I didn't write it. I stole it from here: http://www.toadworld.com/platforms/sql-server/w/wiki/10062.find-indexes-not-in-use.aspx. The author is Brent Ozar.

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

sql-lover
Yak Posting Veteran

99 Posts

Posted - 2014-06-20 : 13:08:46
quote:
Originally posted by tkizer

When I run that query, I don't see any "dupes". Not sure why you are seeing dupes, but I would think it's due to object names being the same but with different schema names. Or maybe you've got partitioned tables, which the query I provided handles that condition.

But why use a different query? Just use the one I provided. I didn't write it. I stole it from here: http://www.toadworld.com/platforms/sql-server/w/wiki/10062.find-indexes-not-in-use.aspx. The author is Brent Ozar.

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/



I'll use yours for sure, I just want to learn/understand why I am getting a different set.

This is what I am talking about (simplified)

IndexName Reads
NCI_ABC 5513
NCI_ABC 23039

With yours or Brent's, which I believe is correct, I get this instead:

IndexName Reads
NCI_ABC 23039


What's the 1st row on the other DMV? The reads from upper leafs before hitting the Cluster Index?

Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-06-20 : 13:17:57
I am not sure.

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page
   

- Advertisement -