
 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Standard Deviation

Author  Topic 

Kristen
Test

22859 Posts

Posted - 2004-11-08 : 09:38:17
I've forgotten all I ever learnt about statistics at skool, and thus need some help.

We have a log table that records how long SProcs take to run. I'd like to optimise the ones that will give the most "payback" first.

Given a table with aggregates over, say, one month, along the lines of

SELECT SProc_Name, COUNT(*), AVG(ElapsedTime), STDEV(ElapsedTime), VAR(ElapsedTime)
INTO MyStats
FROM MySProcLog
GROUP BY SProc_Name

I presume I should do something to analyse the 95% confidence limit on the STDEV to work out which SProcs to tackle first?

So how would I calculate that? (Or perhaps a better question is "What should I use to determine the most important ones to optimise?")

I reckon I need to go after the longest-running ones that actually occur pretty frequently - I thought about taking those whose COUNT(*) is in the top half of the list and sorting by AVG(ElapsedTime) DESC - but there's probably something more scientific

Kristen

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2004-11-08 : 09:42:52
For a normal distribution, 95% confidence intervals are at approx. 2 SDs from the mean, i.e. mean - 2*sd to mean + 2*sd. However, your run times are unlikely to be normally distributed.
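If the distribution were roughly normal, that band could be computed straight from Kristen's MyStats table. A rough sketch only - it assumes the earlier SELECT ... INTO was given column aliases (AvgMs, StDevMs), which the original query did not have:

```sql
-- Sketch: assumes MyStats was built with aliases SProc_Name, AvgMs, StDevMs.
-- For a normal distribution, ~95% of runs fall between mean - 2*SD and mean + 2*SD.
SELECT SProc_Name,
       AvgMs - 2 * StDevMs AS Approx95Low,
       AvgMs + 2 * StDevMs AS Approx95High
FROM MyStats
ORDER BY Approx95High DESC
```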

Kristen
Test

22859 Posts

Posted - 2004-11-08 : 09:56:21
Thanks for the quick response, Arnold. Hadn't thought that I might not have a Bell Curve; darnit!

Any ideas for best plan to get to the most productive SProcs to optimise?

For example, I've a few biggies:

SProc           COUNT(*)  AVG(ms)  STDEV
-------------   --------  -------  -----
List Products    153,418      339    407
Basket Add         7,825      492    286
Login              6,488      335    279

List Products is called a lot, and takes a while. It does paging, so maybe that part is rubbish (we did write it a LONG time ago), but apart from that function it might be hard to improve.

Basket Add is quite a hard job, worth looking into why it takes so long - no paging or anything, but plenty of "What price" and "Have we got stock" type validation lookups.

Login - Gawd knows why that should take more than a couple of milliseconds to decide if the User/PWD is OK and grab the list of "roles" the user has been granted.

And so the list goes on - I could do with "statistically, this is the order to tackle them in" - if such a formula can be contemplated!

Kristen

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2004-11-08 : 10:08:09
I would work on the ones that are not necessarily the longest, but run the most often. It seems that if you run an sp quite often, then it is extremely important for it to be efficient. If it is run fewer times a month, then any efficiency gain would be nice, but you wouldn't get as much of the multiplier effect.

Corey

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2004-11-08 : 11:15:04
I would look at
day/month, SProc, COUNT(*), AVG(ms), STDEV, SUM(ms) grouped by day/month, SProc
as well, as those with a high SUM are the procs that use up the most CPU power.
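The SUM(ms) ranking could be sketched against Kristen's earlier query like this (table and column names borrowed from that post; the aliases are mine):

```sql
-- Sketch: rank procs by total elapsed time over the logged period.
SELECT SProc_Name,
       COUNT(*)         AS Runs,
       AVG(ElapsedTime) AS AvgMs,
       SUM(ElapsedTime) AS TotalMs
FROM MySProcLog
GROUP BY SProc_Name
ORDER BY SUM(ElapsedTime) DESC
```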

I am sure that you also have the time that the procs are run,
then it would be interesting to make a histogram to see the peak hours,
and locate the procs running during the peaks, and optimize those.
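Assuming the log table has a datetime column for when each proc ran (LogDate here is purely hypothetical), a crude per-hour histogram might look like:

```sql
-- Sketch: busiest hours of the day, by run count and total elapsed time.
-- LogDate is an assumed column name; substitute whatever the log records.
SELECT DATEPART(hour, LogDate) AS HourOfDay,
       COUNT(*)                AS Runs,
       SUM(ElapsedTime)        AS TotalMs
FROM MySProcLog
GROUP BY DATEPART(hour, LogDate)
ORDER BY HourOfDay
```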

We made a lot of performance improvements by creating precalculated tables for calculations that were made often by several procs. The procs would then reference the precalculated tables.

rockmoose

Kristen
Test

22859 Posts

Posted - 2004-11-08 : 12:05:31
7thNight: My most commonly run SProcs take only a few ms. Unlikely I can shave anything off them ...

rockmoose: SUM(ms) - I like that <ThumpsForehead>

Kristen

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2004-11-08 : 12:25:57
I am doing similar stats for our proc table:
LOG_PROCEDURE
(
procedure_id,
start,
finish,
errorcode
)
I take the last 16 runs of a proc.
Calculate the average time for those 16 runs per proc.
Remove the runs below average time.
Calculate the average of the remaining runs.
This gets me an "average" run time for each proc.

Not very scientific, but hey!
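Those steps could be sketched in SQL 2000 syntax against the LOG_PROCEDURE shape above (procedure_id = 1 is just a placeholder; TOP requires a literal in SQL 2000, hence the hardcoded 16):

```sql
-- Step 1-2: average duration of the last 16 runs of one proc.
DECLARE @avg_ms int
SELECT @avg_ms = AVG(DATEDIFF(ms, start, finish))
FROM (SELECT TOP 16 start, finish
      FROM log_procedure
      WHERE procedure_id = 1   -- placeholder id
      ORDER BY start DESC) AS last16

-- Step 3-4: drop the below-average runs, average what remains.
SELECT AVG(DATEDIFF(ms, start, finish)) AS trimmed_avg_ms
FROM (SELECT TOP 16 start, finish
      FROM log_procedure
      WHERE procedure_id = 1
      ORDER BY start DESC) AS last16
WHERE DATEDIFF(ms, start, finish) >= @avg_ms
```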

rockmoose

Kristen
Test

22859 Posts

Posted - 2004-11-08 : 15:45:43
Sounds much like what we are logging.

How do you do your logging? I call a logging SProc at the top (with the parameters, concatenated as a string, as a parameter), and another at the bottom to "update" the elapsed time + return value of the SProc. Which seems like a lot of overhead ...

Kristen

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2004-11-08 : 17:48:48
A stored procedure generates the skeletons for all the procs with the logging code in place.

We log like this:
insert loggingtable(...) values(...) + an update at the end,
instead of
exec spLogThis @prm1 = ... , , ,

During the execution of the proc, it also logs any interesting messages to a table,
that way we monitor each execution of the procs.


I am sure there are pros and cons to the direct insert approach,
but for our purposes having all the procs calling other logging procs
several times during each execution seemed like overhead.

rockmoose

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2004-11-08 : 17:51:39
-- example logging of procedures
-- dts, jobs also log into the same tables...

set nocount on
create table procedures
(
id int identity not null primary key,
catalog_id int not null default(1), -- references catalogs(id)
proc_type_id int not null default(1), -- references proc_types(id) sp/job/fn/dts/exe
proc_name sysname not null,
proc_desc varchar(256),
unique(catalog_id,proc_type_id,proc_name)
)

create table log_procedure
(
id int identity not null primary key,
procedure_id int not null references procedures(id),
start datetime not null default(getdate()),
finish datetime null,
errorcode int null
)

create table log_proc_message
(
id int not null identity primary key,
log_procedure_id int not null references log_procedure(id),
msg_date datetime not null default(getdate()),
message varchar(256) not null
)

insert procedures(proc_name,proc_desc)
values('testlogging','just for testing some concepts...')
go

-- A script creates the skeleton procs/jobs
-- exec spCreateProc @id = 1

create procedure testlogging as
set nocount on

declare @procedure_id int, @log_procedure_id int
set @procedure_id = 1 -- this is my hardcoded id in procedures table

-- now I start executing
insert log_procedure(procedure_id) values(@procedure_id)
set @log_procedure_id = scope_identity()
insert log_proc_message(log_procedure_id,message) values(@log_procedure_id,'START EXECUTION : testlogging')

-- Now I write some messages during execution
insert log_proc_message(log_procedure_id,message) values(@log_procedure_id,'testlogging is executing')
declare @t char(8)
set @t = '00:00:0' + cast(2+datepart(s,getdate())%6 as char(1)) -- wait 2-7 secs
waitfor delay @t
insert log_proc_message(log_procedure_id,message) values(@log_procedure_id,'testlogging inserted n rows into table x')
insert log_proc_message(log_procedure_id,message) values(@log_procedure_id,'testlogging has completed its job successfully')

-- Now I am finished
insert log_proc_message(log_procedure_id,message) values(@log_procedure_id,'END EXECUTION : testlogging')
update log_procedure set finish = getdate(), errorcode = 0
where id = @log_procedure_id

return 0
go


-- execute the proc a few times
exec testlogging
exec testlogging
exec testlogging

-- view the log...
select
left(p.proc_name,20) as proc_name,
lpm.msg_date,
case when lpm.message like('START%') then lpm.message
when lpm.message like('END%') then lpm.message + ' EXECTIME: ' + ltrim(datediff(second,start,finish)) + ' seconds'
else char(9) + lpm.message end as message
from
procedures p
join log_procedure lp on p.id = lp.procedure_id
join log_proc_message lpm on lp.id = lpm.log_procedure_id
order by
lpm.id desc

select * from procedures
select * from log_procedure order by id
select * from log_proc_message order by id

-- cleanup
drop table log_proc_message
drop table log_procedure
drop table procedures
drop procedure testlogging



rockmoose

Kristen
Test

22859 Posts

Posted - 2004-11-08 : 22:14:24
"having all the procs calling other logging procs several times during each execution seemed like overhead"

Absolutely agree, but we've changed the code for the logging more than once, and it's a lot of INSERTs to fix otherwise!

Also we anticipated that if the server got very busy we could make the insert SProc "do nothing".

I also plan for the UPDATE SProc to be passed all the parameters that the CREATE SProc used - that way the CREATE SProc can be disabled, by changing the UPDATE SProc to do an INSERT instead of an UPDATE. This would mean we would lose any log info on SProcs that started and then died - syntax error / deadlock etc.

Do you rely on log_procedure.errorcode being NULL if an SProc failed? We set ours to an "impossible" value - I'm not sure if we might not accidentally UPDATE log_procedure.errorcode to NULL on exit from an SProc sometimes otherwise - mind you, I suppose that would be an error that needed looking into!

I also worry that our Update is expensive - we store up to 4 "special" values in the UPDATE SProc - as varchar(nnn) - which probably means the record won't fit into the original space and has to be moved.

I also plan to have an SProc to be called on Rollback. This would put all log table entries for the current SProc (i.e. it and all the sub-SProcs it EXEC'd) into an @TempTable, perform the ROLLBACK (which will remove the log entries, but not those in the @TempTable), and then re-insert any now-missing entries into the log table (with a suitable "I got rolled back" flag).
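One way that plan could be sketched: table variables are not undone by ROLLBACK in SQL 2000, so rows copied into one beforehand survive the rollback. This uses rockmoose's log_procedure table; the -999 "got rolled back" marker and the id-range filter are purely hypothetical, and it assumes @log_procedure_id for the current SProc is in scope:

```sql
DECLARE @saved TABLE (procedure_id int, start datetime, finish datetime)

-- Capture this SProc's log rows (and, assuming identity ids increase,
-- those of the sub-SProcs it EXEC'd) before rolling back.
INSERT @saved (procedure_id, start, finish)
SELECT procedure_id, start, finish
FROM log_procedure
WHERE id >= @log_procedure_id

ROLLBACK TRANSACTION   -- removes the rows from log_procedure, but not from @saved

-- Re-insert the lost entries, flagged with a hypothetical sentinel errorcode.
INSERT log_procedure (procedure_id, start, finish, errorcode)
SELECT procedure_id, start, finish, -999
FROM @saved
```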

All our SProcs have a "Session ID" as the first parameter - thus we can extract from the logs all the SProcs that were called by a given user. Also, when we hit an error in the application [whilst in debugging mode] the application displays all the logs for the current session that are "greater" than the start point of the current operation - so we get a list of all the sprocs that the application called for this particular operation, and all the child SProcs etc.

But I do worry about the overhead! Mind you, the time taken to diagnose a problem in the Real World is equally important, and we can normally do that very quickly with all this information at our finger tips.

Kristen

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2004-11-10 : 18:04:29
"Absolutely agree, but we've changed the code for the logging more than once, and it's a lot of INSERTs to fix otherwise!"

We decided early on which logging model to use, and agreed that if it ever changes, rewriting the INSERTs (search & replace, mostly) would be acceptable. It hasn't happened yet.

But I certainly agree that sprocs are more flexible.


"Do you rely on log_procedure.errorcode being NULL if an SProc failed?"

No, we log the errorcode for the proc; of course the ones that fail because of a syntax error etc. will get NULL.
Only successful sprocs get errorcode 0, all the other ones get errorcode <> 0; NULL is "unknown" error :)


"I also worry that our Update is expensive - we store up to 4 "special" values in the UPDATE SProc - as varchar(nnn)"

We made the logging as simple as we possibly could :)


"I also plan to have an SProc to be called on Rollback."

Good idea - one of the problems that we have run into as well. Also solved by a temporary table, but we have not sprocified! it.


"All our SProcs have a "Session ID" as the first parameter"

We don't bind them together like that, but it is one of the things we could do better.
We have batches running and if the batches run simultaneously we have to sort the logs out visually or by sql.
In the log_procedure table we have a column with suser_sname() as default, and any web apps have to provide the user that calls the sproc.

"But I do worry about the overhead!"
Maybe have a global logging flag?
if @global_logging > 0 -- if no logging then don't even call the logging sproc
exec spLogThis ....

rockmoose

Kristen
Test

22859 Posts

Posted - 2004-11-11 : 03:39:22
"Maybe have a global logging flag?"

I was thinking that we could allow the SP_LogMe to be called, but have it just return. I can't think of an easy way to get

if @global_logging > 0 EXEC SP_LogMe

into the existing Sprocs. Where would the @global_logging come from?

Kristen

rockmoose
SQL Natt Alfen

3279 Posts

Posted - 2004-11-11 : 03:48:18
"Where would the @global_logging come from?"

create proc x as
.
.
declare @global_logging int
set @global_logging = ( select log_level from dbo.log_settings ) /* log_settings has one row */
.
.
if @global_logging > 0 EXEC SP_LogMe -- you put them in master ?
.
.

Well, you could just make spLogMe return as well.
Function calls (sp) are more expensive than variable declarations,
but I really can't see this making much of a performance gain,
unless spLogMe gets called ridiculously many times

rockmoose

Kristen
Test

22859 Posts

Posted - 2004-11-11 : 10:44:13
Nah, they're not called "SP_..." nor stored in master - just my stupid poxy test example!

The "select log_level from dbo.log_settings" has a cost too :-(

I need a pre-processor directive!

"unless the spLogMe gets called ridiculously many times"

It does :-(

Kristen