Convert while loop to CTE? - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2008 Forums

Transact-SQL (2008)

Convert while loop to CTE?

Author

Topic

ssunny
Posting Yak Master

133 Posts

Posted - 2012-09-28 : 15:18:59

Hello,
I was wondering if I could convert below query to CTE.

Here's my table structure with data:

-- table 1

test
-------------
aid pid
100 null
200 100
300 200
400 300
500 400
600 500

-- table 2

test1
--------------
aid aname
100 abc
200 abcd

-- query

declare @aid int,@aname varchar (100),@bid int
declate @temp table (aid int,aname varchar(100))
set @aid = 500
swt @bid = @aid

set @aname = 'null'

while @aname is null and @aid is not null
begin
set @aname = (select aname from test1 where aid = @aid)
if @aname is null
begin

set @aid = (select pid from test where aid = @aid)
end
end
insert into @temp
select @bid,@name

select * from @temp
go

Expected output for above case:

@temp
-------------
500 abcd

So idea here is to go up the hierarchy (using pid in "test" table) until you get a "aname" in test1 table for a given "aid".

Is there a better way of doing this? May be using a recursive CTE?

Thanks for your help!!

lazerath
Constraint Violating Yak Guru

343 Posts

Posted - 2012-09-28 : 15:34:17

[code]-- **********************************************************************
-- Initialization of sample data
-- **********************************************************************

DECLARE @test TABLE (aid INT,pid INT);
DECLARE @test1 TABLE (aid INT, aname VARCHAR(50));

INSERT @test VALUES(100,NULL);
INSERT @test VALUES(200,100);
INSERT @test VALUES(300,200);
INSERT @test VALUES(400,300);
INSERT @test VALUES(500,400);
INSERT @test VALUES(600,500);

INSERT @test1 VALUES(100,'abc');
INSERT @test1 VALUES(200,'abcd');

-- **********************************************************************
-- CTE Example
-- **********************************************************************

DECLARE @param_example_aid INT;
SET @param_example_aid = 500;

WITH cte1
AS
(
-- anchor
SELECT t.aid Root, 0 AS LEVEL, t.aid, t.pid
FROM @test AS t
WHERE t.pid IS NULL
UNION ALL
SELECT c.root, c.Level+1 AS LEVEL, t.aid, t.pid
FROM @test AS t
JOIN cte1 AS c
ON t.pid = c.aid
)
, cte2
AS
(
SELECT c2.aid, n.aname, ROW_NUMBER() OVER(PARTITION BY c2.Root ORDER BY c2.Level DESC) AS Rn
FROM cte1 AS c1
JOIN cte1 AS c2
JOIN @test1 AS n ON c2.aid = n.aid
ON c1.Root = c2.root
WHERE c1.aid = @param_example_aid
)
SELECT z.aid, z.aname
FROM cte2 AS z
WHERE z.rn = 1[/code]

ssunny
Posting Yak Master

133 Posts

Posted - 2012-09-28 : 16:39:21

Thank you for your reply lazerath. Honestly I didn't fully understand your code. But I am trying. Will let you know.

Thank You again.

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-09-28 : 17:00:01

solution using single CTE


declare @test table
(
aid int,
pid int
)
insert @test
values(100, null),
(200, 100),
(300, 200),
(400, 300),
(500, 400),
(600, 500)

-- table 2

declare @test1 table
(
aid int,
aname varchar(100)
)
insert @test1
values(100, 'abc'),
(200, 'abcd')

declare @ID int
set @ID = 500
;With CTE
AS
(
SELECT t.aid,t.pid,CAST(t.aid AS varchar(max)) AS [Path]
FROM @test t
where t.aid = @ID

union all

select t.aid,t.pid,c.Path + ',' + CAST(t.aid AS varchar(max))
from CTE c
inner join @test t
ON t.aid = c.pid
)


SELECT LEFT(Path,COALESCE(NULLIF(CHARINDEX(',',Path),0),LEN(Path)+1)-1) AS aid,
aname
FROM
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY LEFT(Path,COALESCE(NULLIF(CHARINDEX(',',Path),0),LEN(Path)+1)-1) ORDER BY LEN(Path)) AS Rn
FROM CTE c
CROSS APPLY (SELECT aname
            FROM @test1
			WHERE aid = c.aid
			)c1
)t
WHERE Rn=1


output
----------------------------------------
aid	aname
500	abcd

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

lazerath
Constraint Violating Yak Guru

343 Posts

Posted - 2012-09-28 : 17:18:22

Here is my solution with more detailed comments to help explain my code:

-- **********************************************************************
-- Initialization of sample data
-- **********************************************************************

DECLARE @test TABLE (aid INT,pid INT);
DECLARE @test1 TABLE (aid INT, aname VARCHAR(50));

INSERT @test VALUES(100,NULL);
INSERT @test VALUES(200,100);
INSERT @test VALUES(300,200);
INSERT @test VALUES(400,300);
INSERT @test VALUES(500,400);
INSERT @test VALUES(600,500);

INSERT @test1 VALUES(100,'abc');
INSERT @test1 VALUES(200,'abcd');



-- **********************************************************************
-- CTE Example
-- **********************************************************************

DECLARE @param_example_aid INT;
SET @param_example_aid = 500;

/* -- cteHierarchy
The Hierarchy CTE returns the all nodes below a root. The "Root" is the AID of
the original node. The root is provided so that your table can accomodate 
multiple hierarchies. An incrementing "Level" identifier is included as well
that indicates how far from the root a given node is.  This is helpful when
identifying 

Results:
Root LEVEL aid  pid
100  0     100  NULL  -- Our root node, notice the aid matches the "root" column
100  1     200  100
100  2     300  200
100  3     400  300
100  4     500  400
100  5     600  500

*/
WITH cteHierarchy
AS
(
-- Root / Anchor Level - Only entries with NULL parent.  Starts the recursion.
SELECT t.aid Root, 0 AS LEVEL, t.aid, t.pid
FROM @test AS t
WHERE t.pid IS NULL 
UNION ALL
-- The recursive query gets children of all current rows for each iteration
SELECT  c.root, c.Level+1 AS LEVEL, t.aid, t.pid
FROM @test AS t
JOIN cteHierarchy AS c  -- referring back to the cte causes recursion
ON t.pid = c.aid
)

/* -- cteReverseOrderedHierarchyPath
The Reverse Ordered Hierarchy Path CTE is a "chained cte" that uses cteHierarchy
to return all nodes in a hierarchy path starting from a given "node": @param_example_aid.
We identify our node as "c1" and get the entire hierarchy as "c2" via the "root".
This is necessary in case there are multiple hierarchies.   

Results of intermediary stage - left join to @test1:

C1 - Original Node      C2 - Hierarchy of C1    @test1
---------------------   ---------------------   -----------
Root  LEVEL  aid  pid | Root  LEVEL aid  pid  | aid   aname
----  -----  ---  --- | ----  ----- ---  ---  | ---   -----
100   4      500  400 | 100   0     100  NULL | 100   abc
100   4      500  400 | 100   1     200  100  | 200   abcd
100   4      500  400 | 100   2     300  200  | NULL  NULL
100   4      500  400 | 100   3     400  300  | NULL  NULL

-- other rows excluded from the resultset with "and c2.Level <= c1.Level"
100   4      500  400 | 100   4     500  400  | NULL  NULL
100   4      500  400 | 100   5     600  500  | NULL  NULL


Once we have the hierarchy, we inner join it to @test1, which reduces the 
resultset to only entries that have a name ("aname").  Then, we apply the
ROW_NUMBER() windowing function to "reverse order" the entries by level,
so that the first entry encountered UP the hierarchy will always have a
Row Number of 1.

Results of final stage - inner join to @test1:

C1 - Original Node      C2 - Hierarchy of C1    @test1        Reverse Order
---------------------   ---------------------   -----------   -------------
Root  LEVEL  aid  pid | Root  LEVEL aid  pid  | aid   aname | rn
----  -----  ---  --- | ----  ----- ---  ---  | ---   ----- | -------------
100   4      500  400 | 100   0     100  NULL | 100   abc   | 2
100   4      500  400 | 100   1     200  100  | 200   abcd  | 1
*/

, cteReverseOrderedHierarchyPath
AS
(
SELECT c2.aid, n.aname, ROW_NUMBER() OVER(PARTITION BY c2.Root ORDER BY c2.Level DESC) AS Rn
FROM cteHierarchy AS c1
JOIN cteHierarchy AS c2 ON c1.Root = c2.Root
JOIN @test1 AS n ON c2.aid = n.aid
WHERE c1.aid = @param_example_aid 
AND c2.Level <= c1.Level
)
/* 
Finally, we select the entry that has a rn = 1 and get our "first encountered"
aname. Note that aid of the row found will be returned.  In your example, you
simply returned the @param_example_aid.  Easy enough to change if needed, but any
code passing in @param_example_aid should already have it.

Results:
aid  aname
200  abcd
*/
SELECT z.aid, z.aname
FROM cteReverseOrderedHierarchyPath AS z
WHERE z.rn = 1

lazerath
Constraint Violating Yak Guru

343 Posts

Posted - 2012-09-28 : 17:35:03

visakh16 has an interesting solution, but it is mixing derived tables and ctes, which is a design pattern I try to avoid. Once you become familiar with CTEs, they are easier to read and maintain because it's a top down "flow" vs the nesting behavior in derived tables. There is no performance difference between the two.

There is a problem in this solution you should be aware of. Currently, it relies on the length of "PATH", which in your current sample data is fixed. However, if you begin having AIDs with differing lengths (such as 1, 10, 100, 1000, 10000), the PATH length may not correspond to level depending on the hierarchy structure. I try to avoid putting any dependency on externalalities such as surrogate key length and sequence so as to avoid inflexibility in my solutions and hard to diagnose bugs that crop up after data changes.

Additionally, the CROSS APPLY is superfluous. I'd simply use a JOIN. The only reason it exists is so that the solution can use SELECT * without creating ambiguous column names.

lazerath
Constraint Violating Yak Guru

343 Posts

Posted - 2012-09-28 : 17:50:06

Here's a solution that combines the best parts of visakh16's solution with my own. His solution starts from the passed in @ID, whereas mine generates a hierarchy for all of the data and then filters it. That part of his solution will certainly be more efficient than my own (however, you could use my original code as a starting point for finding the "aname" for all rows in the table in a set based fashion).

Going this route allows me to simply my code significantly:

-- **********************************************************************
-- Initialization of sample data
-- **********************************************************************

DECLARE @test TABLE (aid INT,pid INT);
DECLARE @test1 TABLE (aid INT, aname VARCHAR(50));

INSERT @test VALUES(100,NULL);
INSERT @test VALUES(200,100);
INSERT @test VALUES(300,200);
INSERT @test VALUES(400,300);
INSERT @test VALUES(500,400);
INSERT @test VALUES(600,500);

INSERT @test1 VALUES(100,'abc');
INSERT @test1 VALUES(200,'abcd');



-- **********************************************************************
-- CTE Example
-- **********************************************************************

DECLARE @param_example_aid INT;
SET @param_example_aid = 500;

/* -- cteHierarchy
The Hierarchy CTE returns the all nodes above the param. The "Root" is the AID of
the original node. An incrementing "Level" identifier is included as well
that indicates how far from the root a given node is.  

Results:
Root LEVEL	aid	pid
500  0      500	400
500  1      400	300
500  2      300	200
500  3      200	100
500  4      100	NULL

*/
WITH cteHierarchy
AS
(
-- Root / Anchor Level - start with @param_example_aid.
SELECT t.aid Root, 0 AS LEVEL, t.aid, t.pid
FROM @test AS t
WHERE t.aid = @param_example_aid
UNION ALL
-- The recursive query gets parents of anchor 
SELECT  c.root, c.Level+1 AS LEVEL, t.aid, t.pid
FROM @test AS t
JOIN cteHierarchy AS c  -- referring back to the cte causes recursion
ON t.aid = c.pid
)

/* -- cteOrderedHierarchyPath
The Reverse Ordered Hierarchy Path CTE is a "chained cte" that uses cteHierarchy
to return all nodes in a hierarchy path starting from a given "node": @param_example_aid.
We identify our node as "c1" and get the entire hierarchy as "c2" via the "root".
This is necessary in case there are multiple hierarchies.   

*/
, cteOrderedHierarchyPath
AS
(
SELECT c1.aid, n.aname, ROW_NUMBER() OVER(PARTITION BY c1.Root ORDER BY c1.Level) AS Rn
FROM cteHierarchy AS c1
JOIN @test1 AS n ON c1.aid = n.aid
)
/* 
Finally, we select the entry that has a rn = 1 and get our "first encountered"
aname. Note that aid of the row found will be returned.  In your example, you
simply returned the @param_example_aid.  Easy enough to change if needed, but any
code passing in @param_example_aid should already have it.
*/
SELECT z.aid, z.aname
FROM cteOrderedHierarchyPath AS z
WHERE z.rn = 1

edit: It's worth noting that this solution eliminates the lengthy expressions required with the path. If you want to return 500, just use z.root or @param_example_aid directly. I still think 500 is already "known" information, so providing "200" gives additional information to the application that might be useful at some point. You could do both.

malpashaa
Constraint Violating Yak Guru

264 Posts

Posted - 2012-09-28 : 23:53:34

Try this query:


DECLARE @aid INT = 500;

WITH CTE AS
(
   SELECT @aid AS aid, (SELECT T1.aname FROM test1 AS T1 WHERE T1.aid = @aid) AS aname

    UNION ALL

   SELECT T.pid, (SELECT T1.aname FROM test1 AS T1 WHERE T1.aid = T.pid)
     FROM CTE AS P
          INNER JOIN
          test AS T
          ON T.aid = P.aid
    WHERE P.aname IS NULL
)
SELECT @aid AS aid, T.aname
  FROM CTE AS T
 WHERE T.aname IS NOT NULL;

It is simpler, faster, and it matches your original code.

For us, there is only the trying. The rest is not our business. ~T.S. Eliot

Muhammad Al Pasha

lazerath
Constraint Violating Yak Guru

343 Posts

Posted - 2012-09-29 : 03:22:36

No. Post code that can be replayed, like my own, else you arent even playing.

malpashaa
Constraint Violating Yak Guru

264 Posts

Posted - 2012-09-29 : 11:39:18

quote:
Originally posted by lazerath

No. Post code that can be replayed, like my own, else you arent even playing.


USE tempdb;

IF OBJECT_ID(N'tempdb..test', N'U') IS NULL
   SELECT T.aid, T.pid
     INTO tempdb..test
     FROM (VALUES(100, NULL), (200, 100), (300, 200), (400, 300), (500, 400), (600, 500)) AS T(aid, pid);

IF OBJECT_ID(N'tempdb..test1', N'U') IS NULL
   SELECT T.aid, T.aname
     INTO tempdb..test1
     FROM (VALUES(100, 'abc'), (200, 'abcd')) AS T(aid, aname);

DECLARE @aid INT = 500;

WITH CTE AS
(
   SELECT @aid AS aid, (SELECT T1.aname FROM test1 AS T1 WHERE T1.aid = @aid) AS aname

    UNION ALL

   SELECT T.pid, (SELECT T1.aname FROM test1 AS T1 WHERE T1.aid = T.pid)
     FROM CTE AS P
          INNER JOIN
          test AS T
          ON T.aid = P.aid
    WHERE P.aname IS NULL
)
SELECT @aid AS aid, T.aname
  FROM CTE AS T
 WHERE T.aname IS NOT NULL;

For us, there is only the trying. The rest is not our business. ~T.S. Eliot

Muhammad Al Pasha

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-10-01 : 10:40:27

quote:
Originally posted by lazerath

visakh16 has an interesting solution, but it is mixing derived tables and ctes, which is a design pattern I try to avoid. Once you become familiar with CTEs, they are easier to read and maintain because it's a top down "flow" vs the nesting behavior in derived tables. There is no performance difference between the two.

There is a problem in this solution you should be aware of. Currently, it relies on the length of "PATH", which in your current sample data is fixed. However, if you begin having AIDs with differing lengths (such as 1, 10, 100, 1000, 10000), the PATH length may not correspond to level depending on the hierarchy structure. I try to avoid putting any dependency on externalalities such as surrogate key length and sequence so as to avoid inflexibility in my solutions and hard to diagnose bugs that crop up after data changes.

Additionally, the CROSS APPLY is superfluous. I'd simply use a JOIN. The only reason it exists is so that the solution can use SELECT * without creating ambiguous column names.

I agree to length point. if you want to be more precise you could add a level field of type int in CTE and use it for ordering
Sorry didnt understand deal with CROSS APPLY though Can you explain?

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

ssunny
Posting Yak Master

133 Posts

Posted - 2012-10-01 : 12:04:54

Wow!! Many solutions to choose from. You guys rock.Thank you.

ssunny
Posting Yak Master

133 Posts

Posted - 2012-10-01 : 12:06:07

And super explanation with your solution lazerath. Really easy to understand.

lazerath
Constraint Violating Yak Guru

343 Posts

Posted - 2012-10-01 : 13:28:45

@visakh16: While using CROSS APPLY within the solution was technically possible, my point is that it over complicates the solution. In T-SQL, there are often many ways to do achieve the same result, so simplicity is a virtue. In this case, a simple JOIN could be used in it's place -- eliminating a SELECT * which is not a best practice -- and it would make the solution easier to understand. This, in turn, improves the effectiveness of your assistance and helps the poster on their road to mastery.

In general, I also see CROSS APPLY misunderstood a lot, so the more it can be used in situations where it is actually the best option, the better we can correct misunderstandings and improve the general knowledge of the community. As such, I try to avoid using it unless it actually is the best option.

I will grant you that this is more of a philosophical point, but at our level of the craft I consider it worth bringing up.

@malpashaa: Thanks for submitting your solution. It is definitely simpler and faster. It also gave me something to think about and showed me another way of working with recursive loops.

Secondly, my apologies for being curt with you. I could have worded my reply far better.

@ssunny: Glad to be of assistance!

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-10-01 : 13:43:37

quote:
Originally posted by lazerath

@visakh16: While using CROSS APPLY within the solution was technically possible, my point is that it over complicates the solution. In T-SQL, there are often many ways to do achieve the same result, so simplicity is a virtue. In this case, a simple JOIN could be used in it's place -- eliminating a SELECT * which is not a best practice -- and it would make the solution easier to understand. This, in turn, improves the effectiveness of your assistance and helps the poster on their road to mastery.

In general, I also see CROSS APPLY misunderstood a lot, so the more it can be used in situations where it is actually the best option, the better we can correct misunderstandings and improve the general knowledge of the community. As such, I try to avoid using it unless it actually is the best option.

I will grant you that this is more of a philosophical point, but at our level of the craft I consider it worth bringing up.

@malpashaa: Thanks for submitting your solution. It is definitely simpler and faster. It also gave me something to think about and showed me another way of working with recursive loops.

Secondly, my apologies for being curt with you. I could have worded my reply far better.

@ssunny: Glad to be of assistance!

I would say its personal preference
For people who're more comfortable with subquery approach they might favour APPLY over JOIN. In this case, using either of them would be equivalent performance wise.
But I agree with you that here JOIN can be used instead of APPLY to make query ANSI based. But anyways its really nice to know cool ways of doing things using new features as well. Mostly often I've heard people telling APPLY can be used only with UDFs because thats examples that most people have been taught with. Thats what happens when you dont know where all you can apply a particular feature on!
But if one of the way was more efficient I would have definitely recommended going for it though.

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources