Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 Script Library
 Demo Performance Penalty of User Defined Functions

Author  Topic 

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2007-02-05 : 18:20:01
SQL Server User defined functions can be a powerful tool, but they can also create a substantial performance penalty in a query where they are called a large number of times. Sometimes it is something that must be accepted to get the job done, but there is often an alternative of putting the code from the function “in-line” in a SQL query. That also has a penalty in development time, so judgment about which way to go is called for.

I did some testing on three different methods of converting combinations of integer values of Year, Month, Day, Hour, Minute, and Second to Datetime values and compared the runtime of each. In the first method, I did the conversion in-line in the query. In the second method, I used a UDF to do the conversion using the same algorithm as the in-line query. In the third method, I used a UDF that called two more UDFs to do the conversion.

To perform the test, I loaded a table with 3,999,969 randomly generated date/times, along with the matching year, month, day, hour, minute, and seconds, in the range of 1753-01-01 to 9999-12-31. I re-indexed the table with fill factor of 100 to make the physical size as small as possible.

For the actual test, I ran queries that converted the year, month, day, hour, minute, and second on each row to a datatime, and compared it to the datetime from that row. I ran the query using the in-line conversion, single UDF (DateTime1), and with the UDF (DateTime2) that called two more UDFs (Date and Time). I ran the tests several times, and saw only minor variations in run time. The single UDF took over 8 times as long to run as the in-line conversion. The test with the UDF that called other UDFs took over 36 times as long to run as the in-line conversion, and took over 4 times as long to run as the single UDF.

These results show that there can be a substantial performance penalty for using a UDF in place of in-line code, and that UDFs that call other UDFs can also have a substantial performance penalty compared to a UDF that does not call other UDFs.




Code to load table with test data. The functions used in the script to load the test data can be found on these links:
Random Datetime Function:
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=69499
Number Table Function:
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=47685


create table T_DATE (
[DATE] datetime not null primary key clustered ,
[YEAR] smallint not null ,
[MONTH] tinyint not null ,
[DAY] tinyint not null ,
[HOUR] tinyint not null ,
[MINUTE] tinyint not null ,
[SECOND] tinyint not null
)

insert into T_DATE
select distinct top 100 percent
[DATE] = dateadd(ms,-datepart(ms,a.[DATE]),a.[DATE]),
[YEAR] = year(a.[DATE]),
[MONTH] = month(a.[DATE]),
[DAY] = day(a.[DATE]),
[HOUR] = datepart(hour,a.[DATE]),
[MINUTE] = datepart(minute,a.[DATE]),
[SECOND] = datepart(second,a.[DATE])
from
(
select top 100 percent
[DATE] =
[dbo].[F_RANDOM_DATETIME]( '17530101', '99991231',newid() )
from
f_table_number_range(1,4000000) aa
order by
1
) a
order by
a.[DATE]

dbcc dbreindex(T_DATE,'',100)

exec sp_spaceused 'T_DATE','true'

select count(*) from T_DATE



Code to create functions used in the test. These functions are based on functions that Jeff posted in his blog on this link, modified with some suggestions of mine:
http://weblogs.sqlteam.com/jeffs/archive/2007/01/02/56079.aspx


create function DateTime1
(@Year int, @Month int, @Day int, @Hour int, @Minute int, @Second int)
-- returns a dateTime value for the date and time specified.
returns datetime
as
begin
return dateadd(month,((@Year-1900)*12)+@Month-1,
dateadd(ss,(@Hour*3600)+(@Minute*60)+@Second,@Day-1))
end
go
create function Date(@Year int, @Month int, @Day int)
-- returns a datetime value for the specified year, month and day
returns datetime
as
begin
return dateadd(month,((@Year-1900)*12)+@Month-1,@Day-1)
end
go

create function Time(@Hour int, @Minute int, @Second int)
-- Returns a datetime value for the specified time at the "base" date (1/1/1900)
returns datetime
as
begin
return dateadd(ss,(@Hour*3600)+(@Minute*60)+@Second,0)
end
go
create function DateTime2
(@Year int, @Month int, @Day int, @Hour int, @Minute int, @Second int)
-- returns a dateTime value for the date and time specified.
returns datetime
as
begin
return dbo.Date(@Year,@Month,@Day) + dbo.Time(@Hour, @Minute,@Second)
end
go



Test code:


set nocount on
go
select [T_DATE Rowcount] = count(*) from T_DATE
go
declare @count int
declare @st datetime
select @st = getdate()

select
@count = count(*)
from
T_DATE a
where
a.[DATE] <> 0+a.[DATE]

select [MS No Action] = datediff(ms,0,getdate()-@st)
go
declare @count int
declare @st datetime
select @st = getdate()

select
@count = count(*)
from
T_DATE a
where
a.[DATE] <>
dateadd(month,((a.YEAR-1900)*12)+a.MONTH-1,
dateadd(ss,(a.HOUR*3600)+(a.MINUTE*60)+a.SECOND,a.DAY-1))

select [MS No Function] = datediff(ms,0,getdate()-@st)
go
declare @count int
declare @st datetime
select @st = getdate()

select
@count = count(*)
from
T_DATE a
where
a.[DATE] <> dbo.DateTime1(a.YEAR,a.MONTH,a.DAY,a.HOUR,a.MINUTE,a.SECOND)

select [MS DateTime1] = datediff(ms,0,getdate()-@st)
go
declare @count int
declare @st datetime
select @st = getdate()

select
@count = count(*)
from
T_DATE a
where
a.[DATE] <> dbo.DateTime2(a.YEAR,a.MONTH,a.DAY,a.HOUR,a.MINUTE,a.SECOND)

select [MS DateTime2] = datediff(ms,0,getdate()-@st)
go



Sample test results:


T_DATE Rowcount
---------------
3999969

MS No Action
------------
1773

MS No Function
--------------
9923

MS DateTime1
------------
82213

MS DateTime2
------------
357683






CODO ERGO SUM

Kristen
Test

22859 Posts

Posted - 2007-02-06 : 10:42:40
MVJ: Very useful, if somewhat scary!

I'm interested to know the effect in the SELECT, rather than the WHERE. Could you run a test something like this please?

declare @MyDate datetime
declare @st datetime
select @st = getdate()

select
@MyDate = dbo.DateTime2(a.YEAR,a.MONTH,a.DAY,a.HOUR,a.MINUTE,a.SECOND)
from
T_DATE a

select [Test Label] = datediff(ms,0,getdate()-@st)
go

Thanks

Kristen
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2007-02-06 : 22:53:37
Here is the test code and the results of two test runs. This testing was conducted on my fairly old home computer.

As you can see, it didn't make much difference in the relative runtime.

I'm sure that a fast, modern server could do much better with the UDFs, but I thought this was a good illustration of the potential impact. The results would certainly make me think twice before using a UDF in a 10 million row ETL process. I would want to at least test the in-line alternative. And definitely watch out for functions that call other functions or are recursive.



set nocount on
go
select [T_DATE Rowcount] = count(*) from T_DATE
go
declare @MyDate datetime
declare @st datetime
select @st = getdate()

select
@MyDate = 0+a.[DATE]
from
T_DATE a

select [MS No Action] = datediff(ms,0,getdate()-@st)
go
declare @MyDate datetime
declare @st datetime
select @st = getdate()

select
@MyDate =
dateadd(month,((a.YEAR-1900)*12)+a.MONTH-1,
dateadd(ss,(a.HOUR*3600)+(a.MINUTE*60)+a.SECOND,a.DAY-1))
from
T_DATE a

select [MS No Function] = datediff(ms,0,getdate()-@st)
go
declare @MyDate datetime
declare @st datetime
select @st = getdate()

select
@MyDate = dbo.DateTime1(a.YEAR,a.MONTH,a.DAY,a.HOUR,a.MINUTE,a.SECOND)
from
T_DATE a

select [MS DateTime1] = datediff(ms,0,getdate()-@st)
go
declare @MyDate datetime
declare @st datetime
select @st = getdate()

select
@MyDate = dbo.DateTime2(a.YEAR,a.MONTH,a.DAY,a.HOUR,a.MINUTE,a.SECOND)
from
T_DATE a

select [MS DateTime2] = datediff(ms,0,getdate()-@st)
go


Test results:

Test #1

T_DATE Rowcount
---------------
3999962

MS No Action
------------
3086

MS No Function
--------------
13366

MS DateTime1
------------
80890

MS DateTime2
------------
338186



Test #2

T_DATE Rowcount
---------------
3999962

MS No Action
------------
3083

MS No Function
--------------
13060

MS DateTime1
------------
80830

MS DateTime2
------------
337833






CODO ERGO SUM
Go to Top of Page

Jeff Moden
Aged Yak Warrior

652 Posts

Posted - 2007-02-07 : 00:53:52
I'll have to dig out the examples I did in an "exchange of ideas" with another... on a modern server, many of the UDF's beat the inline code depending, of course, on how the UDF was written and some made it worse.

--Jeff Moden
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-02-07 : 02:29:05
Thanks for that MVJ

"As you can see, it didn't make much difference in the relative runtime"

Am I reading it wrongly? Looked like a big difference to me.

Kristen
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-02-07 : 02:49:53
[code]Test #1 Test #2 Diff in percent

T_DATE Rowcount T_DATE Rowcount
--------------- --------------- ---------------
3999962 3999962 0.00%

MS No Action MS No Action
------------ ------------ ---------------
3086 3083 0.09%

MS No Function MS No Function
-------------- ------------ ---------------
13366 13060 2.29%

MS DateTime1 MS DateTime1
------------ ------------ ---------------
80890 80830 0.07%

MS DateTime2 MS DateTime2
------------ ------------ ---------------
338186 337833 0.10%[/code]

Peter Larsson
Helsingborg, Sweden
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-02-07 : 03:58:42
Ah, thanks Peso.

I was comparing [MS No Function] and [MS DateTime1] and deciding that this type of function in the SELECT was rubbish!

I had already decided it was Rubbish in the WHERE clause ...

So, UDFs are a bad idea then ... except for the ones that Jeff mentions.

We've centralised quite a lot of critical code in UDFs, I can see I need to go review that decision. Blast!

Kristen
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2007-02-07 : 14:17:11
I would not go as far as to say that UDFs are bad ideas, only that there are situations where the performance penalty may justify using inline code. Also, there may be situations where it is just not practical to use inline code.

By the way, I think this really only applies to scalar UDFs, not table valued UDFs.




CODO ERGO SUM
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2007-02-07 : 17:01:42
In order to see if the server made much difference, I ran the test script below on a more powerful system: Dual 3.4 GHz Xeon CPU, 3.5 GB of RAM, Windows Server 2003 SP1, SQL Server 2000 Enterprise Edition, SP4.

I increased the size of the test table to 9,999,792 rows to give it more data to work with. These tests were run on an idle system. The tests all ran much faster; no surprise since the CPU and other resources are much faster. I ran the test 3 times, and there was very little variation in the results.

The results show that there was an even larger difference between the run time of the inline code, compared to the two functions. The run time using the DateTime1 function was over 26 times as long as the inline code, and run time using the DateTime2 function was over 106 times as long as the inline code. The run time using DateTime2, the function that calls two more functions, was almost 4 times as long a DateTime1.

This test confirms the results of my first series of tests on my desktop computer, and shows an even greater performance penalty for the functions compared to the inline code.


Test Script:

set nocount on
go
select [T_DATE Rowcount] = count(*) from T_DATE
go
declare @count int
declare @st datetime
select @st = getdate()

select
@count = count(*)
from
T_DATE a
where
a.[DATE] <> 0+a.[DATE]

select [MS No Action] = datediff(ms,0,getdate()-@st), [Error Count] = @count
go
declare @count int
declare @st datetime
select @st = getdate()

select
@count = count(*)
from
T_DATE a
where
a.[DATE] <>
dateadd(month,((a.YEAR-1900)*12)+a.MONTH-1,
dateadd(ss,(a.HOUR*3600)+(a.MINUTE*60)+a.SECOND,a.DAY-1))

select [MS No Function] = datediff(ms,0,getdate()-@st), [Error Count] = @count
go
declare @count int
declare @st datetime
select @st = getdate()

select
@count = count(*)
from
T_DATE a
where
a.[DATE] <> dbo.DateTime1(a.YEAR,a.MONTH,a.DAY,a.HOUR,a.MINUTE,a.SECOND)

select [MS DateTime1] = datediff(ms,0,getdate()-@st), [Error Count] = @count
go
declare @count int
declare @st datetime
select @st = getdate()

select
@count = count(*)
from
T_DATE a
where
a.[DATE] <> dbo.DateTime2(a.YEAR,a.MONTH,a.DAY,a.HOUR,a.MINUTE,a.SECOND)

select [MS DateTime2] = datediff(ms,0,getdate()-@st), [Error Count] = @count
go



Test Results:

Test #1

T_DATE Rowcount
---------------
9999792

MS No Action Error Count
------------ -----------
593 0

MS No Function Error Count
-------------- -----------
2860 0

MS DateTime1 Error Count
------------ -----------
77186 0

MS DateTime2 Error Count
------------ -----------
304840 0



Test #2

T_DATE Rowcount
---------------
9999792

MS No Action Error Count
------------ -----------
580 0

MS No Function Error Count
-------------- -----------
2856 0

MS DateTime1 Error Count
------------ -----------
78720 0

MS DateTime2 Error Count
------------ -----------
304353 0



Test #3

T_DATE Rowcount
---------------
9999792

MS No Action Error Count
------------ -----------
580 0

MS No Function Error Count
-------------- -----------
2873 0

MS DateTime1 Error Count
------------ -----------
77203 0

MS DateTime2 Error Count
------------ -----------
305573 0


--Calculate relative runtimes:
select [Inline to Datetime1] = 77186./2860.,
[Inline to Datetime2] = 304840./2860.,
[Datetime1 to Datetime2] = 304840./77186. union all
select 78720./2856., 304353./2856., 304353./78720. union all
select 77203./2873., 305573./2873., 305573./77203.


Inline to Datetime1 Inline to Datetime2 Datetime1 to Datetime2
------------------- ------------------- ----------------------
26.988111 106.587412 3.949420
27.563025 106.566176 3.866272
26.871910 106.360250 3.958045






CODO ERGO SUM
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2007-02-07 : 17:32:41
quote:
Originally posted by Jeff Moden

I'll have to dig out the examples I did in an "exchange of ideas" with another... on a modern server, many of the UDF's beat the inline code depending, of course, on how the UDF was written and some made it worse.

--Jeff Moden


I think I may know the thread you are talking about; at least I saw something like it on SQLServerCentral a while ago. I don't have the link to it, so I can't look back at it right now.

If that is the one you are talking about, I was unconvinced from the examples that UDFs could beat the inline code in a controlled test. Most of the examples that I saw on that thread seemed to be tests using live data on a live system, so it was hard to eliminate the impact of other factors. I had run some quick tests before, and saw a big difference in run time, so I decided to do my own investigation.

That is why I tried to eliminate any other factors in these tests. Nothing else was running on the systems. The SQL was just a straight table scan with all the data cached in memory and returning only a single row result set into a local variable. The only variation in the SQL code was the use of the inline code vs. the functions.

I would be interested in seeing a test that shows that a UDF can perform as well as identical inline code.

I would also be interested in seeing this test done under SQL Server 2005. I don’t have a SQL 2005 server available that I can “idle” out for a test, and the load of the test data table takes substantial resources, so I prefer not to have angry users after me.






CODO ERGO SUM
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2007-02-07 : 23:10:24
I tested again with SQL Server 2005 Developer Edition on a desktop computer. This test showed the least difference between inline code and the 2 functions, but the runtime with function Datetime1 was still 5 times as long as the inline code, and the runtime with function Datetime2 that calls 2 other functions was 26 times as long as the inline code.

It is still a very substantial performance difference. With the inline code, it was able to process 935,000 rows/second, using function Datetime1 dropped the speed to 185,000 rows/second, and using function Datetime2 dropped the speed to 35,000 rows/second.


Test Results Analysis:

select
Test,
[Inline to Datetime1] = D1/NF,
[Inline to Datetime2] = D2/NF,
[Datetime1 to Datetime2] =D2/D1
from
(
select Test= '1',NF=5346.0,D1=27046.0,D2=143450.0 union all
select '2',5266.0,26766.0,142296.0 union all
select '3',5283.0,26843.0,142703.0 union all
select '4',5280.0,26826.0,142420.0
) a


Test Inline to Datetime1 Inline to Datetime2 Datetime1 to Datetime2
---- ------------------- ------------------- ----------------------
1 5.0591096 26.8331462 5.30392664
2 5.0827952 27.0216483 5.31629679
3 5.0810145 27.0117357 5.31620906
4 5.0806818 26.9734848 5.30902855



Test Results:

Test 1
T_DATE Rowcount
---------------
4999950

MS No Action Error Count
------------ -----------
1140 0

MS No Function Error Count
-------------- -----------
5346 0

MS DateTime1 Error Count
------------ -----------
27046 0

MS DateTime2 Error Count
------------ -----------
143450 0

Test 2
T_DATE Rowcount
---------------
4999950

MS No Action Error Count
------------ -----------
1123 0

MS No Function Error Count
-------------- -----------
5266 0

MS DateTime1 Error Count
------------ -----------
26766 0

MS DateTime2 Error Count
------------ -----------
142296 0

Test 3
T_DATE Rowcount
---------------
4999950

MS No Action Error Count
------------ -----------
1123 0

MS No Function Error Count
-------------- -----------
5283 0

MS DateTime1 Error Count
------------ -----------
26843 0

MS DateTime2 Error Count
------------ -----------
142703 0

Test 4
T_DATE Rowcount
---------------
4999950

MS No Action Error Count
------------ -----------
1126 0

MS No Function Error Count
-------------- -----------
5280 0

MS DateTime1 Error Count
------------ -----------
26826 0

MS DateTime2 Error Count
------------ -----------
142420 0


Test script:

set nocount on
go
print 'Test 1'
go
select [T_DATE Rowcount] = count(*) from T_DATE
go
declare @count int
declare @st datetime
select @st = getdate()

select
@count = count(*)
from
T_DATE a
where
a.[DATE] <> 0+a.[DATE]

select [MS No Action] = datediff(ms,0,getdate()-@st), [Error Count] = @count
go
declare @count int
declare @st datetime
select @st = getdate()

select
@count = count(*)
from
T_DATE a
where
a.[DATE] <>
dateadd(month,((a.YEAR-1900)*12)+a.MONTH-1,
dateadd(ss,(a.HOUR*3600)+(a.MINUTE*60)+a.SECOND,a.DAY-1))

select [MS No Function] = datediff(ms,0,getdate()-@st), [Error Count] = @count
go
declare @count int
declare @st datetime
select @st = getdate()

select
@count = count(*)
from
T_DATE a
where
a.[DATE] <> dbo.DateTime1(a.YEAR,a.MONTH,a.DAY,a.HOUR,a.MINUTE,a.SECOND)

select [MS DateTime1] = datediff(ms,0,getdate()-@st), [Error Count] = @count
go
declare @count int
declare @st datetime
select @st = getdate()

select
@count = count(*)
from
T_DATE a
where
a.[DATE] <> dbo.DateTime2(a.YEAR,a.MONTH,a.DAY,a.HOUR,a.MINUTE,a.SECOND)

select [MS DateTime2] = datediff(ms,0,getdate()-@st), [Error Count] = @count
go


CODO ERGO SUM
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-02-08 : 07:44:32
Very useful post MVJ, thanks.

Kristen
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2007-02-08 : 08:57:04
quote:
Originally posted by Kristen

Very useful post MVJ, thanks.

Kristen



So, are you redoing all your systems to replace functions with inline code?

CODO ERGO SUM
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-02-08 : 09:02:36
Well ... that part of your findings was NOT so helpful [:-(]
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2007-02-08 : 11:32:42
quote:
Originally posted by Kristen

Well ... that part of your findings was NOT so helpful [:-(]


Can’t you just assign that to one of your lackeys?



CODO ERGO SUM
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-02-08 : 11:59:22
I'm now worried about the number of CPU cycles that will generate. So I rather feel I may have to fix the SQL before I use the Lackey Assignment Tool any more ...
Go to Top of Page
   

- Advertisement -