| Author |
Topic  |
|
|
ssunny
Posting Yak Master
USA
130 Posts |
Posted - 09/28/2012 : 15:18:59
|
Hello, I was wondering if I could convert below query to CTE.
Here's my table structure with data:
-- table 1
test ------------- aid pid 100 null 200 100 300 200 400 300 500 400 600 500
-- table 2
test1 -------------- aid aname 100 abc 200 abcd
-- query
declare @aid int,@aname varchar (100),@bid int declate @temp table (aid int,aname varchar(100)) set @aid = 500 swt @bid = @aid
set @aname = 'null'
while @aname is null and @aid is not null begin set @aname = (select aname from test1 where aid = @aid) if @aname is null begin set @aid = (select pid from test where aid = @aid) end end insert into @temp select @bid,@name
select * from @temp go
Expected output for above case:
@temp ------------- 500 abcd
So idea here is to go up the hierarchy (using pid in "test" table) until you get a "aname" in test1 table for a given "aid".
Is there a better way of doing this? May be using a recursive CTE?
Thanks for your help!!
|
|
|
lazerath
Constraint Violating Yak Guru
USA
278 Posts |
Posted - 09/28/2012 : 15:34:17
|
-- **********************************************************************
-- Initialization of sample data
-- **********************************************************************
DECLARE @test TABLE (aid INT,pid INT);
DECLARE @test1 TABLE (aid INT, aname VARCHAR(50));
INSERT @test VALUES(100,NULL);
INSERT @test VALUES(200,100);
INSERT @test VALUES(300,200);
INSERT @test VALUES(400,300);
INSERT @test VALUES(500,400);
INSERT @test VALUES(600,500);
INSERT @test1 VALUES(100,'abc');
INSERT @test1 VALUES(200,'abcd');
-- **********************************************************************
-- CTE Example
-- **********************************************************************
DECLARE @param_example_aid INT;
SET @param_example_aid = 500;
WITH cte1
AS
(
-- anchor
SELECT t.aid Root, 0 AS LEVEL, t.aid, t.pid
FROM @test AS t
WHERE t.pid IS NULL
UNION ALL
SELECT c.root, c.Level+1 AS LEVEL, t.aid, t.pid
FROM @test AS t
JOIN cte1 AS c
ON t.pid = c.aid
)
, cte2
AS
(
SELECT c2.aid, n.aname, ROW_NUMBER() OVER(PARTITION BY c2.Root ORDER BY c2.Level DESC) AS Rn
FROM cte1 AS c1
JOIN cte1 AS c2
JOIN @test1 AS n ON c2.aid = n.aid
ON c1.Root = c2.root
WHERE c1.aid = @param_example_aid
)
SELECT z.aid, z.aname
FROM cte2 AS z
WHERE z.rn = 1 |
 |
|
|
ssunny
Posting Yak Master
USA
130 Posts |
Posted - 09/28/2012 : 16:39:21
|
Thank you for your reply lazerath. Honestly I didn't fully understand your code. But I am trying. Will let you know.
Thank You again. |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
India
47040 Posts |
Posted - 09/28/2012 : 17:00:01
|
solution using single CTE
declare @test table
(
aid int,
pid int
)
insert @test
values(100, null),
(200, 100),
(300, 200),
(400, 300),
(500, 400),
(600, 500)
-- table 2
declare @test1 table
(
aid int,
aname varchar(100)
)
insert @test1
values(100, 'abc'),
(200, 'abcd')
declare @ID int
set @ID = 500
;With CTE
AS
(
SELECT t.aid,t.pid,CAST(t.aid AS varchar(max)) AS [Path]
FROM @test t
where t.aid = @ID
union all
select t.aid,t.pid,c.Path + ',' + CAST(t.aid AS varchar(max))
from CTE c
inner join @test t
ON t.aid = c.pid
)
SELECT LEFT(Path,COALESCE(NULLIF(CHARINDEX(',',Path),0),LEN(Path)+1)-1) AS aid,
aname
FROM
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY LEFT(Path,COALESCE(NULLIF(CHARINDEX(',',Path),0),LEN(Path)+1)-1) ORDER BY LEN(Path)) AS Rn
FROM CTE c
CROSS APPLY (SELECT aname
FROM @test1
WHERE aid = c.aid
)c1
)t
WHERE Rn=1
output
----------------------------------------
aid aname
500 abcd
------------------------------------------------------------------------------------------------------ SQL Server MVP http://visakhm.blogspot.com/
|
 |
|
|
lazerath
Constraint Violating Yak Guru
USA
278 Posts |
Posted - 09/28/2012 : 17:18:22
|
Here is my solution with more detailed comments to help explain my code:
-- **********************************************************************
-- Initialization of sample data
-- **********************************************************************
DECLARE @test TABLE (aid INT,pid INT);
DECLARE @test1 TABLE (aid INT, aname VARCHAR(50));
INSERT @test VALUES(100,NULL);
INSERT @test VALUES(200,100);
INSERT @test VALUES(300,200);
INSERT @test VALUES(400,300);
INSERT @test VALUES(500,400);
INSERT @test VALUES(600,500);
INSERT @test1 VALUES(100,'abc');
INSERT @test1 VALUES(200,'abcd');
-- **********************************************************************
-- CTE Example
-- **********************************************************************
DECLARE @param_example_aid INT;
SET @param_example_aid = 500;
/* -- cteHierarchy
The Hierarchy CTE returns the all nodes below a root. The "Root" is the AID of
the original node. The root is provided so that your table can accomodate
multiple hierarchies. An incrementing "Level" identifier is included as well
that indicates how far from the root a given node is. This is helpful when
identifying
Results:
Root LEVEL aid pid
100 0 100 NULL -- Our root node, notice the aid matches the "root" column
100 1 200 100
100 2 300 200
100 3 400 300
100 4 500 400
100 5 600 500
*/
WITH cteHierarchy
AS
(
-- Root / Anchor Level - Only entries with NULL parent. Starts the recursion.
SELECT t.aid Root, 0 AS LEVEL, t.aid, t.pid
FROM @test AS t
WHERE t.pid IS NULL
UNION ALL
-- The recursive query gets children of all current rows for each iteration
SELECT c.root, c.Level+1 AS LEVEL, t.aid, t.pid
FROM @test AS t
JOIN cteHierarchy AS c -- referring back to the cte causes recursion
ON t.pid = c.aid
)
/* -- cteReverseOrderedHierarchyPath
The Reverse Ordered Hierarchy Path CTE is a "chained cte" that uses cteHierarchy
to return all nodes in a hierarchy path starting from a given "node": @param_example_aid.
We identify our node as "c1" and get the entire hierarchy as "c2" via the "root".
This is necessary in case there are multiple hierarchies.
Results of intermediary stage - left join to @test1:
C1 - Original Node C2 - Hierarchy of C1 @test1
--------------------- --------------------- -----------
Root LEVEL aid pid | Root LEVEL aid pid | aid aname
---- ----- --- --- | ---- ----- --- --- | --- -----
100 4 500 400 | 100 0 100 NULL | 100 abc
100 4 500 400 | 100 1 200 100 | 200 abcd
100 4 500 400 | 100 2 300 200 | NULL NULL
100 4 500 400 | 100 3 400 300 | NULL NULL
-- other rows excluded from the resultset with "and c2.Level <= c1.Level"
100 4 500 400 | 100 4 500 400 | NULL NULL
100 4 500 400 | 100 5 600 500 | NULL NULL
Once we have the hierarchy, we inner join it to @test1, which reduces the
resultset to only entries that have a name ("aname"). Then, we apply the
ROW_NUMBER() windowing function to "reverse order" the entries by level,
so that the first entry encountered UP the hierarchy will always have a
Row Number of 1.
Results of final stage - inner join to @test1:
C1 - Original Node C2 - Hierarchy of C1 @test1 Reverse Order
--------------------- --------------------- ----------- -------------
Root LEVEL aid pid | Root LEVEL aid pid | aid aname | rn
---- ----- --- --- | ---- ----- --- --- | --- ----- | -------------
100 4 500 400 | 100 0 100 NULL | 100 abc | 2
100 4 500 400 | 100 1 200 100 | 200 abcd | 1
*/
, cteReverseOrderedHierarchyPath
AS
(
SELECT c2.aid, n.aname, ROW_NUMBER() OVER(PARTITION BY c2.Root ORDER BY c2.Level DESC) AS Rn
FROM cteHierarchy AS c1
JOIN cteHierarchy AS c2 ON c1.Root = c2.Root
JOIN @test1 AS n ON c2.aid = n.aid
WHERE c1.aid = @param_example_aid
AND c2.Level <= c1.Level
)
/*
Finally, we select the entry that has a rn = 1 and get our "first encountered"
aname. Note that aid of the row found will be returned. In your example, you
simply returned the @param_example_aid. Easy enough to change if needed, but any
code passing in @param_example_aid should already have it.
Results:
aid aname
200 abcd
*/
SELECT z.aid, z.aname
FROM cteReverseOrderedHierarchyPath AS z
WHERE z.rn = 1 |
 |
|
|
lazerath
Constraint Violating Yak Guru
USA
278 Posts |
Posted - 09/28/2012 : 17:35:03
|
visakh16 has an interesting solution, but it is mixing derived tables and ctes, which is a design pattern I try to avoid. Once you become familiar with CTEs, they are easier to read and maintain because it's a top down "flow" vs the nesting behavior in derived tables. There is no performance difference between the two.
There is a problem in this solution you should be aware of. Currently, it relies on the length of "PATH", which in your current sample data is fixed. However, if you begin having AIDs with differing lengths (such as 1, 10, 100, 1000, 10000), the PATH length may not correspond to level depending on the hierarchy structure. I try to avoid putting any dependency on externalalities such as surrogate key length and sequence so as to avoid inflexibility in my solutions and hard to diagnose bugs that crop up after data changes.
Additionally, the CROSS APPLY is superfluous. I'd simply use a JOIN. The only reason it exists is so that the solution can use SELECT * without creating ambiguous column names. |
 |
|
|
lazerath
Constraint Violating Yak Guru
USA
278 Posts |
Posted - 09/28/2012 : 17:50:06
|
Here's a solution that combines the best parts of visakh16's solution with my own. His solution starts from the passed in @ID, whereas mine generates a hierarchy for all of the data and then filters it. That part of his solution will certainly be more efficient than my own (however, you could use my original code as a starting point for finding the "aname" for all rows in the table in a set based fashion).
Going this route allows me to simply my code significantly:
-- **********************************************************************
-- Initialization of sample data
-- **********************************************************************
DECLARE @test TABLE (aid INT,pid INT);
DECLARE @test1 TABLE (aid INT, aname VARCHAR(50));
INSERT @test VALUES(100,NULL);
INSERT @test VALUES(200,100);
INSERT @test VALUES(300,200);
INSERT @test VALUES(400,300);
INSERT @test VALUES(500,400);
INSERT @test VALUES(600,500);
INSERT @test1 VALUES(100,'abc');
INSERT @test1 VALUES(200,'abcd');
-- **********************************************************************
-- CTE Example
-- **********************************************************************
DECLARE @param_example_aid INT;
SET @param_example_aid = 500;
/* -- cteHierarchy
The Hierarchy CTE returns the all nodes above the param. The "Root" is the AID of
the original node. An incrementing "Level" identifier is included as well
that indicates how far from the root a given node is.
Results:
Root LEVEL aid pid
500 0 500 400
500 1 400 300
500 2 300 200
500 3 200 100
500 4 100 NULL
*/
WITH cteHierarchy
AS
(
-- Root / Anchor Level - start with @param_example_aid.
SELECT t.aid Root, 0 AS LEVEL, t.aid, t.pid
FROM @test AS t
WHERE t.aid = @param_example_aid
UNION ALL
-- The recursive query gets parents of anchor
SELECT c.root, c.Level+1 AS LEVEL, t.aid, t.pid
FROM @test AS t
JOIN cteHierarchy AS c -- referring back to the cte causes recursion
ON t.aid = c.pid
)
/* -- cteOrderedHierarchyPath
The Reverse Ordered Hierarchy Path CTE is a "chained cte" that uses cteHierarchy
to return all nodes in a hierarchy path starting from a given "node": @param_example_aid.
We identify our node as "c1" and get the entire hierarchy as "c2" via the "root".
This is necessary in case there are multiple hierarchies.
*/
, cteOrderedHierarchyPath
AS
(
SELECT c1.aid, n.aname, ROW_NUMBER() OVER(PARTITION BY c1.Root ORDER BY c1.Level) AS Rn
FROM cteHierarchy AS c1
JOIN @test1 AS n ON c1.aid = n.aid
)
/*
Finally, we select the entry that has a rn = 1 and get our "first encountered"
aname. Note that aid of the row found will be returned. In your example, you
simply returned the @param_example_aid. Easy enough to change if needed, but any
code passing in @param_example_aid should already have it.
*/
SELECT z.aid, z.aname
FROM cteOrderedHierarchyPath AS z
WHERE z.rn = 1
edit: It's worth noting that this solution eliminates the lengthy expressions required with the path. If you want to return 500, just use z.root or @param_example_aid directly. I still think 500 is already "known" information, so providing "200" gives additional information to the application that might be useful at some point. You could do both. |
Edited by - lazerath on 09/28/2012 17:54:47 |
 |
|
|
malpashaa
Constraint Violating Yak Guru
Saudi Arabia
257 Posts |
Posted - 09/28/2012 : 23:53:34
|
Try this query:
DECLARE @aid INT = 500;
WITH CTE AS
(
SELECT @aid AS aid, (SELECT T1.aname FROM test1 AS T1 WHERE T1.aid = @aid) AS aname
UNION ALL
SELECT T.pid, (SELECT T1.aname FROM test1 AS T1 WHERE T1.aid = T.pid)
FROM CTE AS P
INNER JOIN
test AS T
ON T.aid = P.aid
WHERE P.aname IS NULL
)
SELECT @aid AS aid, T.aname
FROM CTE AS T
WHERE T.aname IS NOT NULL;
It is simpler, faster, and it matches your original code.
For us, there is only the trying. The rest is not our business. ~T.S. Eliot
Muhammad Al Pasha |
 |
|
|
lazerath
Constraint Violating Yak Guru
USA
278 Posts |
Posted - 09/29/2012 : 03:22:36
|
| No. Post code that can be replayed, like my own, else you arent even playing. |
 |
|
|
malpashaa
Constraint Violating Yak Guru
Saudi Arabia
257 Posts |
Posted - 09/29/2012 : 11:39:18
|
quote: Originally posted by lazerath
No. Post code that can be replayed, like my own, else you arent even playing.
USE tempdb;
IF OBJECT_ID(N'tempdb..test', N'U') IS NULL
SELECT T.aid, T.pid
INTO tempdb..test
FROM (VALUES(100, NULL), (200, 100), (300, 200), (400, 300), (500, 400), (600, 500)) AS T(aid, pid);
IF OBJECT_ID(N'tempdb..test1', N'U') IS NULL
SELECT T.aid, T.aname
INTO tempdb..test1
FROM (VALUES(100, 'abc'), (200, 'abcd')) AS T(aid, aname);
DECLARE @aid INT = 500;
WITH CTE AS
(
SELECT @aid AS aid, (SELECT T1.aname FROM test1 AS T1 WHERE T1.aid = @aid) AS aname
UNION ALL
SELECT T.pid, (SELECT T1.aname FROM test1 AS T1 WHERE T1.aid = T.pid)
FROM CTE AS P
INNER JOIN
test AS T
ON T.aid = P.aid
WHERE P.aname IS NULL
)
SELECT @aid AS aid, T.aname
FROM CTE AS T
WHERE T.aname IS NOT NULL;
For us, there is only the trying. The rest is not our business. ~T.S. Eliot
Muhammad Al Pasha |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
India
47040 Posts |
Posted - 10/01/2012 : 10:40:27
|
quote: Originally posted by lazerath
visakh16 has an interesting solution, but it is mixing derived tables and ctes, which is a design pattern I try to avoid. Once you become familiar with CTEs, they are easier to read and maintain because it's a top down "flow" vs the nesting behavior in derived tables. There is no performance difference between the two.
There is a problem in this solution you should be aware of. Currently, it relies on the length of "PATH", which in your current sample data is fixed. However, if you begin having AIDs with differing lengths (such as 1, 10, 100, 1000, 10000), the PATH length may not correspond to level depending on the hierarchy structure. I try to avoid putting any dependency on externalalities such as surrogate key length and sequence so as to avoid inflexibility in my solutions and hard to diagnose bugs that crop up after data changes.
Additionally, the CROSS APPLY is superfluous. I'd simply use a JOIN. The only reason it exists is so that the solution can use SELECT * without creating ambiguous column names.
I agree to length point. if you want to be more precise you could add a level field of type int in CTE and use it for ordering Sorry didnt understand deal with CROSS APPLY though Can you explain?
------------------------------------------------------------------------------------------------------ SQL Server MVP http://visakhm.blogspot.com/
|
 |
|
|
ssunny
Posting Yak Master
USA
130 Posts |
Posted - 10/01/2012 : 12:04:54
|
| Wow!! Many solutions to choose from. You guys rock.Thank you. |
 |
|
|
ssunny
Posting Yak Master
USA
130 Posts |
Posted - 10/01/2012 : 12:06:07
|
| And super explanation with your solution lazerath. Really easy to understand. |
 |
|
|
lazerath
Constraint Violating Yak Guru
USA
278 Posts |
Posted - 10/01/2012 : 13:28:45
|
@visakh16: While using CROSS APPLY within the solution was technically possible, my point is that it over complicates the solution. In T-SQL, there are often many ways to do achieve the same result, so simplicity is a virtue. In this case, a simple JOIN could be used in it's place -- eliminating a SELECT * which is not a best practice -- and it would make the solution easier to understand. This, in turn, improves the effectiveness of your assistance and helps the poster on their road to mastery.
In general, I also see CROSS APPLY misunderstood a lot, so the more it can be used in situations where it is actually the best option, the better we can correct misunderstandings and improve the general knowledge of the community. As such, I try to avoid using it unless it actually is the best option.
I will grant you that this is more of a philosophical point, but at our level of the craft I consider it worth bringing up.
@malpashaa: Thanks for submitting your solution. It is definitely simpler and faster. It also gave me something to think about and showed me another way of working with recursive loops.
Secondly, my apologies for being curt with you. I could have worded my reply far better.
@ssunny: Glad to be of assistance! |
Edited by - lazerath on 10/01/2012 13:30:58 |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
India
47040 Posts |
Posted - 10/01/2012 : 13:43:37
|
quote: Originally posted by lazerath
@visakh16: While using CROSS APPLY within the solution was technically possible, my point is that it over complicates the solution. In T-SQL, there are often many ways to do achieve the same result, so simplicity is a virtue. In this case, a simple JOIN could be used in it's place -- eliminating a SELECT * which is not a best practice -- and it would make the solution easier to understand. This, in turn, improves the effectiveness of your assistance and helps the poster on their road to mastery.
In general, I also see CROSS APPLY misunderstood a lot, so the more it can be used in situations where it is actually the best option, the better we can correct misunderstandings and improve the general knowledge of the community. As such, I try to avoid using it unless it actually is the best option.
I will grant you that this is more of a philosophical point, but at our level of the craft I consider it worth bringing up.
@malpashaa: Thanks for submitting your solution. It is definitely simpler and faster. It also gave me something to think about and showed me another way of working with recursive loops.
Secondly, my apologies for being curt with you. I could have worded my reply far better.
@ssunny: Glad to be of assistance!
I would say its personal preference For people who're more comfortable with subquery approach they might favour APPLY over JOIN. In this case, using either of them would be equivalent performance wise. But I agree with you that here JOIN can be used instead of APPLY to make query ANSI based. But anyways its really nice to know cool ways of doing things using new features as well. Mostly often I've heard people telling APPLY can be used only with UDFs because thats examples that most people have been taught with. Thats what happens when you dont know where all you can apply a particular feature on! But if one of the way was more efficient I would have definitely recommended going for it though.
------------------------------------------------------------------------------------------------------ SQL Server MVP http://visakhm.blogspot.com/
|
 |
|
| |
Topic  |
|
|
|