Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 UDF vs subquery

Author  Topic 

pixelwiz
Starting Member

25 Posts

Posted - 2010-12-03 : 16:00:22
Hi All,

I've been trying to learn how to properly use stored procedures and functions. My understanding was that using udf functions inside my SQL code should speed things up because they become pre-compiled code and can execute faster.

However, I've recently ran into a scenario where it is the exact opposite. If I have a calculated column determined by a sub-query it runs in a split second. However, if I use a UDF, it takes 8 seconds to do the exact same thing.

I'll try to paste the code below for both cases.

This runs real fast
      SELECT g.*, c.course_name,
CASE WHEN
(
SELECT COUNT(*)
FROM groups g2
INNER JOIN group_players gp
ON gp.group_id = g2.id
INNER JOIN shots s
ON s.player_id = gp.player_id
AND s.round = g2.round
AND s.hole_id IN (SELECT id FROM holes WHERE event_id = g2.event_id AND course_id = g2.course_id)
WHERE g2.id = g.id
) > 0
OR
(
SELECT COUNT(*)
FROM groups g2
INNER JOIN group_players gp
ON gp.group_id = g2.id
INNER JOIN player_holes ph
ON ph.player_id = gp.player_id
AND ph.round = g2.round
AND ph.hole_id IN (SELECT id FROM holes WHERE event_id = g2.event_id AND course_id = g2.course_id)
WHERE g2.id = g.id
) > 0
THEN 1
ELSE 0
END group_started

FROM groups g
INNER JOIN courses c
ON c.id = g.course_id
WHERE g.event_id = 5003
AND g.round = 1
ORDER BY g.group_number


This is the way I'd like to do it to add functions and code-re-usability, but it's very slow for some reason. Am I missing something?

SELECT g.*, c.course_name, dbo.udf_checkGroupStarted(g.id) group_started
FROM groups g
INNER JOIN courses c
ON c.id = g.course_id
WHERE g.event_id = 5003
AND g.round = 1
ORDER BY g.group_number


Here is the actual UDF:

USE [scoring_dev]
GO
/****** Object: UserDefinedFunction [dbo].[udf_checkGroupStarted] Script Date: 12/03/2010 15:58:51 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO


-- =============================================
-- Author: <Author,,Name>
-- Create date: <Create Date, ,>
-- Description: <Description, ,>
-- =============================================
ALTER FUNCTION [dbo].[udf_checkGroupStarted]
(
-- Add the parameters for the function here
@groupid int
)
RETURNS bit
AS
BEGIN
-- Declare the return variable here
DECLARE @started bit

-- Add the T-SQL statements to compute the return value here

DECLARE @group_shots int
DECLARE @group_holes int

SET @group_shots = (
SELECT COUNT(*)
FROM groups g
INNER JOIN group_players gp
ON gp.group_id = g.id
INNER JOIN shots s
ON s.player_id = gp.player_id
AND s.round = g.round
AND s.hole_id IN (SELECT id FROM holes WHERE event_id = g.event_id AND course_id = g.course_id)
WHERE g.id = @groupid
)

SET @group_holes = (
SELECT COUNT(*)
FROM groups g
INNER JOIN group_players gp
ON gp.group_id = g.id
INNER JOIN player_holes ph
ON ph.player_id = gp.player_id
AND ph.round = g.round
AND ph.hole_id IN (SELECT id FROM holes WHERE event_id = g.event_id AND course_id = g.course_id)
WHERE g.id = @groupid
)

IF (@group_shots > 0)
SET @started = 1
ELSE IF (@group_holes > 0)
SET @started = 1
ELSE SET @started = 0
--SELECT @started

-- Return the result of the function
RETURN @started

END

nigelrivett
Master Smack Fu Yak Hacker

3385 Posts

Posted - 2010-12-03 : 16:09:23
The function has to be called for each row whereas in line cofe can be optimised with the rest of the query.
In genera functions are bad for performance.

They also mean that it's more dificult to see what the code is doing and to diagnose issues.

Saying that your function isn't very efficienly coded.
Instead of
if (select count(*) from ...) > 0
it is better to
if exists (select * from ...)

This can stop as soon as it finds a row. The former has to process the whole resultset to get the count.
Also if your first count > 0 there is no need to calculate the second.


==========================================
Cursors are useful if you don't know sql.
SSIS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

pixelwiz
Starting Member

25 Posts

Posted - 2010-12-03 : 16:19:29
Wow, I re-wrote the udf per your suggestion and it's lightning fast now. Amazing.

Thanks a lot, there are definitely some smart people on here.
Go to Top of Page

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2010-12-03 : 18:36:09
quote:
Originally posted by pixelwiz

Hi All,

I've been trying to learn how to properly use stored procedures and functions. My understanding was that using udf functions inside my SQL code should speed things up because they become pre-compiled code and can execute faster.
<snip>

It depnds on what you are doing. But, I'd say, in general your preconceptions about performance are incorrect.

As you found out, executing a function for every row is very slow. If you are accessing tables/views inside the function then the optimizer still has to check statistics and such to try and come up with the best plan it can. The down side is that there are no statistics maintained about functions (that I know of), so that can drastically affect the execution plan.

Try doing some searching as there are different kinds of functions (Scalar, Table-valued (inline and multi-statement), CLR ...) and each has it's own performance implications.
Go to Top of Page
   

- Advertisement -