UDF vs subquery - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2008 Forums

Transact-SQL (2008)

UDF vs subquery

Author

Topic

pixelwiz
Starting Member

25 Posts

Posted - 2010-12-03 : 16:00:22

Hi All,

I've been trying to learn how to properly use stored procedures and functions. My understanding was that using udf functions inside my SQL code should speed things up because they become pre-compiled code and can execute faster.

However, I've recently ran into a scenario where it is the exact opposite. If I have a calculated column determined by a sub-query it runs in a split second. However, if I use a UDF, it takes 8 seconds to do the exact same thing.

I'll try to paste the code below for both cases.

This runs real fast

      SELECT g.*, c.course_name,
			CASE WHEN 
				(
					SELECT COUNT(*)
					FROM groups g2
					INNER JOIN group_players gp
					ON gp.group_id = g2.id
					INNER JOIN shots s
					ON s.player_id = gp.player_id
					AND s.round = g2.round
					AND s.hole_id IN (SELECT id FROM holes WHERE event_id = g2.event_id AND course_id = g2.course_id)
					WHERE g2.id = g.id
				) > 0 
				OR 
				(
					SELECT COUNT(*)
					FROM groups g2
					INNER JOIN group_players gp
					ON gp.group_id = g2.id
					INNER JOIN player_holes ph
					ON ph.player_id = gp.player_id
					AND ph.round = g2.round
					AND ph.hole_id IN (SELECT id FROM holes WHERE event_id = g2.event_id AND course_id = g2.course_id)
					WHERE g2.id = g.id
				) > 0 
			THEN 1
			ELSE 0
			END group_started
			
			FROM groups g
			INNER JOIN courses c
			ON c.id = g.course_id
     	WHERE g.event_id = 5003
      AND g.round = 1 
      ORDER BY g.group_number

This is the way I'd like to do it to add functions and code-re-usability, but it's very slow for some reason. Am I missing something?


      SELECT g.*, c.course_name, dbo.udf_checkGroupStarted(g.id) group_started
			FROM groups g
			INNER JOIN courses c
			ON c.id = g.course_id
     	WHERE g.event_id = 5003
      AND g.round = 1 
      ORDER BY g.group_number

Here is the actual UDF:


USE [scoring_dev]
GO
/****** Object:  UserDefinedFunction [dbo].[udf_checkGroupStarted]    Script Date: 12/03/2010 15:58:51 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO


-- =============================================
-- Author:		<Author,,Name>
-- Create date: <Create Date, ,>
-- Description:	<Description, ,>
-- =============================================
ALTER FUNCTION [dbo].[udf_checkGroupStarted] 
(
	-- Add the parameters for the function here
	@groupid int
)
RETURNS bit
AS
BEGIN
	-- Declare the return variable here
	DECLARE @started bit

	-- Add the T-SQL statements to compute the return value here
	
	DECLARE @group_shots int
	DECLARE @group_holes int
	
	SET @group_shots = (
		SELECT COUNT(*)
		FROM groups g
		INNER JOIN group_players gp
		ON gp.group_id = g.id
		INNER JOIN shots s
		ON s.player_id = gp.player_id
		AND s.round = g.round
		AND s.hole_id IN (SELECT id FROM holes WHERE event_id = g.event_id AND course_id = g.course_id)
		WHERE g.id = @groupid
	)

	SET @group_holes = (
			SELECT COUNT(*)
			FROM groups g
			INNER JOIN group_players gp
			ON gp.group_id = g.id
			INNER JOIN player_holes ph
			ON ph.player_id = gp.player_id
			AND ph.round = g.round
			AND ph.hole_id IN (SELECT id FROM holes WHERE event_id = g.event_id AND course_id = g.course_id)
			WHERE g.id = @groupid
	)

	IF (@group_shots > 0) 
		SET @started = 1
	ELSE IF (@group_holes > 0)
		SET @started = 1
	ELSE SET @started = 0
	--SELECT @started
	
	-- Return the result of the function
	RETURN @started

END

nigelrivett
Master Smack Fu Yak Hacker

3385 Posts

Posted - 2010-12-03 : 16:09:23

The function has to be called for each row whereas in line cofe can be optimised with the rest of the query.
In genera functions are bad for performance.

They also mean that it's more dificult to see what the code is doing and to diagnose issues.

Saying that your function isn't very efficienly coded.
Instead of
if (select count(*) from ...) > 0
it is better to
if exists (select * from ...)

This can stop as soon as it finds a row. The former has to process the whole resultset to get the count.
Also if your first count > 0 there is no need to calculate the second.

==========================================
Cursors are useful if you don't know sql.
SSIS can be used in a similar way.
Beer is not cold and it isn't fizzy.

pixelwiz
Starting Member

25 Posts

Posted - 2010-12-03 : 16:19:29

Wow, I re-wrote the udf per your suggestion and it's lightning fast now. Amazing.

Thanks a lot, there are definitely some smart people on here.

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2010-12-03 : 18:36:09

quote:
Originally posted by pixelwiz

Hi All,

I've been trying to learn how to properly use stored procedures and functions. My understanding was that using udf functions inside my SQL code should speed things up because they become pre-compiled code and can execute faster.
<snip>

It depnds on what you are doing. But, I'd say, in general your preconceptions about performance are incorrect.

As you found out, executing a function for every row is very slow. If you are accessing tables/views inside the function then the optimizer still has to check statistics and such to try and come up with the best plan it can. The down side is that there are no statistics maintained about functions (that I know of), so that can drastically affect the execution plan.

Try doing some searching as there are different kinds of functions (Scalar, Table-valued (inline and multi-statement), CLR ...) and each has it's own performance implications.

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources