Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Really Tricky! Group-Hierarchy relationship rank.

Author  Topic 

Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2011-09-21 : 00:17:31
OK, I am starting to brainstorm trying to accomplish the old
"If tom is taller than john and john is taller than Ann, is Tom taller than Ann?"

The only thing is I am trying to do this at a large scale with thousands of names.

I need to determine for each name if it is definite, if not enough information exists such as if all you knew was the first statement I made and then someone asked "Is Tom taller than Jake" Since we have no facts on Jake, it is UNKNOWN.


The data is going to be presented in 3 person group_increments indicated by a grouping id. from the 3 person group each person will have a rank of 1 2 or 3. 1 is the tallest (or whatever you want to call it.

I need to figure out a rank from 1 to XYZ of who the tallest personid is to the shortest out of ALL personID's, and who I do not have enough information to figure out.

Any method you can think of please share, here is some sample data to work with.

I am going to attack in the morning, but I can def use some help in tackling this, it's a tricky one.


select *
into #TMP
from
(
Select 33 as groupingid, 100000725863771 as personID, 1 as tallestinrank Union all
Select 33, 41601546, 2 Union all
Select 33, 605336931, 3 Union all
Select 35, 655428598, 2 Union all
Select 35, 1264828775, 1 Union all
Select 35, 500193282, 3 Union all
Select 36, 605336931, 3 Union all
Select 36, 1499985186, 2 Union all
Select 36, 1463627840, 1 Union all
Select 38, 720730362, 3 Union all
Select 38, 713151208, 1 Union all
Select 38, 596476180, 2 Union all
Select 40, 1183632078, 3 Union all
Select 40, 590096420, 2 Union all
Select 40, 622417, 1 Union all
Select 42, 1430446161, 2 Union all
Select 42, 507058762, 1 Union all
Select 42, 1278733128, 3 Union all
Select 44, 1206490880, 3 Union all
Select 44, 1463627840, 1 Union all
Select 44, 1490754486, 2 Union all
Select 46, 100000058324851, 1 Union all
Select 46, 520640005, 3 Union all
Select 46, 554138433, 2 Union all
Select 48, 16813024, 2 Union all
Select 48, 652618967, 3 Union all
Select 48, 503883, 1 Union all
Select 50, 652618967, 3 Union all
Select 50, 1231428362, 2 Union all
Select 50, 537016020, 1 Union all
Select 52, 1231428362, 2 Union all
Select 52, 1293862037, 1 Union all
Select 52, 100000153324211, 3 Union all
Select 54, 5618049, 1 Union all
Select 54, 1738012857, 2 Union all
Select 54, 100000066863896, 3 Union all
Select 56, 100000172993707, 3 Union all
Select 56, 1503156516, 1 Union all
Select 56, 1463627840, 2 Union all
Select 58, 100000562156151, 2 Union all
Select 58, 520640005, 3 Union all
Select 58, 194301031, 1
) a





EDIT:
It's a tough one!! Again, if anyone is up for testing their knowledge, I can appreciate the help! I have one method right now I am going to try, but not sure how efficient it will be.


Success is 10% Intelligence, 70% Determination, and 22% Stupidity.
\_/ _/ _/\_/ _/\_/ _/ _/- 881

Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2011-09-21 : 14:58:26
A little more information Basically to simplify I want a results table with every personid and each name I can definitively rank above, below, or undetermined. Compared can have a value of 1,2,3 (1 shorter 2 taller 3 undetermined)

Create table #test(Personid float,personidcompare float,compared int)

I do not need to use this approach, basically I am looking for anyway to definitively know who ranks the tallest to the shortest based on the small group definitive s given in the sample data.


Success is 10% Intelligence, 70% Determination, and 22% Stupidity.
\_/ _/ _/\_/ _/\_/ _/ _/- 881
Go to Top of Page

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2011-09-21 : 16:17:26
This is impossible without knowing the heights.

There's no way to know if the tallest person in group 33 is shorter or taller than a person in group 35.
Go to Top of Page

Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2011-09-21 : 16:22:06
Russell, it is only impossible for only SOME records, not ALL. I am trying to find the ones where it is not impossible.

for instance

GROUP 33 John is taller than jack and jack is taller than jill

Group 35 Jack is taller than Mike and Mike is taller than peter

I know definitively John is taller than Mike, Peter,Jack and Jill.

This is what I want to find out but against thousands of names. I need to extract the names I know definitively and then rank them for each individual (So each person I can tell the entire lineage of who they are taller than). I think I have it figured out not impossible, but tricky.

I can use a hand though if you want to take a crack at it.

Thanks


Success is 10% Intelligence, 70% Determination, and 22% Stupidity.
\_/ _/ _/\_/ _/\_/ _/ _/- 881
Go to Top of Page

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2011-09-21 : 19:50:46
I am hoping another will come along with a fancy recursive CTE or recursive table udf for this, but here is an old style nested loop which gives all children for each parent. It is also not optimized in any way;

--create new container
SELECT
IDENTITY(int, 1,1) AS pk,
groupingid AS grp,
personID AS id,
tallestinrank AS [rank]
INTO #t
FROM #tmp

--input val (searching for all ids under this val)
DECLARE @id BIGINT;

--local vars
DECLARE @ct INT = 1;
DECLARE @oCt INT = @ct;
DECLARE @maxCt int; SELECT @maxCt = COUNT(id) FROM #t

--results container
CREATE TABLE #r (pk INT IDENTITY(1,1), nest INT, parentID BIGINT, id BIGINT, grp INT, [rank] INT)

WHILE (@oCt <= @maxCt)
BEGIN

--get the next personID as parent
SELECT @id = id FROM #t WHERE pk = @oCt

WHILE (@ct <= @maxCt)
BEGIN

IF (@ct = 1)
BEGIN

--init table vals with 1st set of siblings
INSERT #r (nest, parentID, id, grp, [rank])
SELECT @oCt, @id, t2.id, t2.grp, t2.[rank]
FROM #t t1
JOIN #t t2 ON t2.grp = t1.grp
WHERE t1.id = @id AND t1.id != t2.id AND t1.rank <= t2.rank

END

--get all ids which rank lower than the ones in #r for this nest
INSERT #r (nest, parentID, id, grp, rank)
SELECT DISTINCT @oCt, @id, t.id, t.grp, t.rank
FROM #t t
JOIN ( SELECT id, grp, rank FROM #t WHERE id IN (SELECT id FROM #r WHERE nest=@oCt) ) g ON g.grp = t.grp
WHERE g.rank < t.rank

--increment inner looop counter
SET @ct+=1

END

--increment outer loop counter
SELECT @oCt +=1, @ct=1

END

--results
SELECT DISTINCT parentID, id
FROM #r
ORDER BY parentID

--clean up
DROP TABLE #t, #r


Hope to see some other solutions.

Have a nice evening OP.
Go to Top of Page

Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2011-09-21 : 21:02:20
Thanks, that was the initial approach I was thinking, but I have a solution I am working on that I believe will be a large improvement on performance. Once I have tested I will post tomorrow. I am trying to write the first version without a CTE, so it is DB independent, but I will try to utilize a CTE or possibly remove a loop/recursive all together and only have the loop to do another run on the update query and not like it is now on each individual item (This method I am not 100% confident it will work as of yet, but I am not ruling it out until I test a little more). Either way I will post whatever I can come up with tomorrow, but I have 1 method for sure that I know will be a performance boost to this method.

Thanks again, although I have not gone through your results yet to ensure they are accurate, it appears to be from first glance, and I really appreciate you posting this!

I too would also like to see other approaches, so whoever thinks they can improve please attempt as well!


Success is 10% Intelligence, 70% Determination, and 22% Stupidity.
\_/ _/ _/\_/ _/\_/ _/ _/- 881
Go to Top of Page

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2011-09-21 : 21:21:06
quote:
Originally posted by Vinnie881

Thanks, that was the initial approach I was thinking, but I have a solution I am working on that I believe will be a large improvement on performance. Once I have tested I will post tomorrow. I am trying to write the first version without a CTE, so it is DB independent, but I will try to utilize a CTE or possibly remove a loop/recursive all together and only have the loop to do another run on the update query and not like it is now on each individual item (This method I am not 100% confident it will work as of yet, but I am not ruling it out until I test a little more). Either way I will post whatever I can come up with tomorrow, but I have 1 method for sure that I know will be a performance boost to this method.

Thanks again, although I have not gone through your results yet to ensure they are accurate, it appears to be from first glance, and I really appreciate you posting this!

I too would also like to see other approaches, so whoever thinks they can improve please attempt as well!


Success is 10% Intelligence, 70% Determination, and 22% Stupidity.
\_/ _/ _/\_/ _/\_/ _/ _/- 881



Sounds great. My solution is partial and needs some refinement. It serves as an idea more than a finished solution. I am sure there are much more efficient methods.

Best wishes.
Go to Top of Page

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2011-09-21 : 23:05:41
OK, I could not sleep until I rectified the poor 1st pass with the loop... lol...

So here is an option of a set-based approach using a udf. I believe this approach would meet your criteria (set-based and portable to ANSI). I am sure there are yet better approaches, but I am rubbing my old eyes this evening.


--create new base table container for udf use
SELECT
groupingid AS grp,
personID AS id,
tallestinrank AS [rank]
INTO dbo.t
FROM #tmp
GO

--create function for recursion
CREATE FUNCTION dbo.udfGetSiblings
(
@id bigint
)
RETURNS TABLE
AS
RETURN
(
SELECT DISTINCT t.id, t.grp, t.rank
FROM dbo.t t
JOIN
(
SELECT id, grp, rank
FROM dbo.t
WHERE id=@id
) d ON d.grp = t.grp AND t.rank > CASE WHEN t.rank = 3 THEN 2 ELSE d.rank END

)
GO

--results
SELECT DISTINCT
d.parentID,
d.id AS siblingID
(
--1st level
SELECT a.id parentID, b.id, b.grp
FROM dbo.t a
CROSS APPLY dbo.udfGetSiblings(a.id) b

UNION

--2nd level
SELECT t.parentID, u.id, u.grp
FROM
(
SELECT DISTINCT a.id parentID, b.id, b.grp
FROM dbo.t a
CROSS APPLY dbo.udfGetSiblings(a.id) b
) t
CROSS APPLY dbo.udfGetSiblings(t.id) u
) d
ORDER BY d.parentid
Go to Top of Page

Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2011-09-22 : 01:02:50
I just did a check and it does not work. Here is a new sample set using the udf to illustrate the issue.


--drop table #tmp
select *
into #TMP
from
(
Select 33 as groupingid, 11 as personID, 1 as tallestinrank Union all
Select 33, 22, 2 Union all
Select 33, 33, 3 Union all
Select 35, 11, 2 Union all
Select 35, 77, 1 Union all
Select 35, 88, 3 Union all
Select 36, 33333, 3 Union all
Select 36, 22222, 2 Union all
Select 36, 88, 1 Union all
Select 38, 99999999, 3 Union all
Select 38, 22222, 1 Union all
Select 38, 77777777, 2 Union all
Select 40, 87676623, 3 Union all
Select 40, 88, 1 Union all
Select 40, 1234567, 2
) a
/*
id 11 is taller than
22
33
88
22222
33333
99999999 --since 22222 is in the same group and marked with 1 as a rank and we know 11 is higher than 22222, we also know this is lower than 11
77777777 --since 22222 is in the same group and marked with 1 as a rank and we know 11 is higher than 22222, we also know this is lower than 11
1234567
87676623
*/
SELECT
groupingid AS grp,
personID AS id,
tallestinrank AS [rank]
INTO dbo.t
FROM #tmp
GO

--create function for recursion
--drop function dbo.udfGetSiblings
/*
CREATE FUNCTION dbo.udfGetSiblings
(
@id bigint
)
RETURNS TABLE
AS
RETURN
(
SELECT DISTINCT t.id, t.grp, t.rank
FROM dbo.t t
JOIN
(
SELECT id, grp, rank
FROM dbo.t
WHERE id=@id
) d ON d.grp = t.grp AND t.rank > CASE WHEN t.rank = 3 THEN 2 ELSE d.rank END

)
GO
*/
--results
SELECT DISTINCT
d.parentID,
d.id AS siblingID
from
(
--1st level
SELECT a.id parentid, b.id, b.grp
FROM dbo.t a
CROSS APPLY dbo.udfGetSiblings(a.id) b

UNION

--2nd level
SELECT t.parentID, u.id, u.grp
FROM
(
SELECT DISTINCT a.id parentID, b.id, b.grp
FROM dbo.t a
CROSS APPLY dbo.udfGetSiblings(a.id) b
) t
CROSS APPLY dbo.udfGetSiblings(t.id) u
) d
ORDER BY d.parentid

drop table #tmp,t



Success is 10% Intelligence, 70% Determination, and 22% Stupidity.
\_/ _/ _/\_/ _/\_/ _/ _/- 881
Go to Top of Page

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2011-09-22 : 08:02:43
Ah. I see.

SELECT DISTINCT 
d.parentID,
d.id AS siblingID
FROM
(
--last level
SELECT d3.parentID, u.id, u.grp
FROM
(
--3rd level
SELECT d2.parentID, u.id, u.grp
FROM
(
--2st level
SELECT d1.parentID, u.id, u.grp
FROM
(
--1st level
SELECT DISTINCT a.id parentID, b.id, b.grp
FROM dbo.t a
CROSS APPLY dbo.udfGetSiblings(a.id) b
) d1
CROSS APPLY dbo.udfGetSiblings(d1.id) u
) d2
CROSS APPLY dbo.udfGetSiblings(d2.id) u
) d3
CROSS APPLY dbo.udfGetSiblings(d3.id) u
) d
WHERE parentid!=id
ORDER BY d.parentid


Sadly, math was never my strong suit and this is a maths problem; How many max(iterations) are required? Answer that and one can derive an optimum solution. Since I am not so good at math, I just keep adding recursion depths (I believe two more get us from 1st to last in this 3 rank tree)... But I could be wrong. As some say, "guessing is no way to solve a problem". I keep on guessing... lol...

Have a good day OP.

EDIT: Added more depths to reach end. Cleaned up a bit. I think it is rather hack though. Would be nice to see a set-based solution to deal with n ranks.
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-09-22 : 09:12:42
Vinnie, haven't we done this before? As connected subgraphs?

(33) 100000725863771 -> 41601546 -> 605336931
(35) 1264828775 -> 655428598 -> 500193282
(36) 1463627840 -> 1499985186 -> 605336931

As you now can see, personid 605336931 exist in both (33) and (36). However we cannot say if 41601546 is taller than 1499985186, or vice versa.

100000725863771 -> 41601546 ->+ 605336931
1463627840 -> 1499985186 ->|

1264828775 -> 655428598 -> 500193282

I think this can be easy solved be normalizing the data to a two-column table {TallerPersonID, ShorterPersonID}.

100000725863771, 41601546
100000725863771, 605336931
41601546, 605336931
1264828775, 655428598
1264828775, 500193282
655428598, 500193282
1463627840, 1499985186
1463627840, 605336931
1499985186, 605336931

Now you easily can traverse the "tree" !


N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2011-09-22 : 09:53:16
quote:
Originally posted by SwePeso

Vinnie, haven't we done this before? As connected subgraphs?


N 56°04'39.26"
E 12°55'05.63"




Thanks, I will search for such a discussion.

Best wishes.

EDIT. Yes, a variant of expanding networks is the topic here as well. Your solution will work (also for n level trees). And looks fast for large sets.

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=89323&whichpage=1

Thanks again for your contributions.

Have a nice day guys.
Go to Top of Page

Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2011-09-22 : 10:02:00
quote:

Vinnie, haven't we done this before? As connected subgraphs?


I don't believe I have ever done an issue like this before, but who knows my memory matches that of a goldfish:). I'll do some searching of the forums though for what you mentioned, thanks!

I tried a method, of converting to a two column parent/child table, and doing a standard hierarchy on it, but i run into issues where I as of right now am unsuccessfully able to figure out a way where I can easily query the results to get a answer of everyone where it's possible to know the answer. I can only successfully get each tree in a column.

Thanks everyone for your posts!


Edit: I just saw that link ehorn put up, Awsome, Awsome, Awsome!!!

Success is 10% Intelligence, 70% Determination, and 22% Stupidity.
\_/ _/ _/\_/ _/\_/ _/ _/- 881
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-09-22 : 18:12:33
Here is a function to determine if first person is taller than second person.
CREATE FUNCTION dbo.fnIsTaller
(
@TallerPersonID BIGINT,
@ShorterPersonID BIGINT
)
RETURNS BIT
AS
BEGIN
DECLARE @IsTaller BIT

;WITH cteTree(PersonID, TallRank, TallPath)
AS (
SELECT d.PersonID,
CAST(1 AS TINYINT) AS TallRank,
CAST('\' + CAST(d.PersonID AS VARCHAR(32)) + '\' AS VARCHAR(MAX)) AS TallPath
FROM (
SELECT PersonID
FROM dbo.Sampledata
WHERE TallRank = 1
GROUP BY PersonID
) AS d
WHERE NOT EXISTS (SELECT * FROM dbo.Sampledata AS w WHERE w.PersonID = d.PersonID AND w.TallRank > 1)

UNION ALL

SELECT b.PersonID,
CAST(b.TallRank AS TINYINT) AS TallRank,
CAST(t.TallPath + '\' + CAST(b.PersonID AS VARCHAR(32)) + '\' AS VARCHAR(MAX)) AS TallPath
FROM cteTree AS t
INNER JOIN dbo.Sampledata AS a ON a.PersonID = t.PersonID
AND a.TallRank = t.TallRank
INNER JOIN dbo.Sampledata AS b ON b.GroupingID = a.GroupingID
AND b.TallRank = a.TallRank + 1
)
SELECT @IsTaller = COUNT(*)
FROM cteTree
WHERE TallPath LIKE '%\' + CAST(@TallerPersonID AS VARCHAR(32)) + '\%\' + CAST(@ShorterPersonID AS VARCHAR(32)) + '\%'

RETURN @IsTaller
END



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-09-22 : 18:59:45
Oh well... Here is a complete and fast solution.
CREATE FUNCTION dbo.fnIsTaller
(
@TallerPersonID BIGINT,
@ShorterPersonID BIGINT
)
RETURNS BIT
AS
BEGIN
DECLARE @Connected INT

DECLARE @Map TABLE
(
TallerPersonID BIGINT NOT NULL,
ShorterPersonID BIGINT NOT NULL,
PRIMARY KEY CLUSTERED
(
TallerPersonID,
ShorterPersonID
),
IsReverse TINYINT NOT NULL
)

INSERT @Map
(
TallerPersonID,
ShorterPersonID,
IsReverse
)
SELECT t.PersonID,
s.PersonID,
0 AS IsReverse
FROM dbo.SampleData AS t
INNER JOIN dbo.SampleData AS s ON s.GroupingID = t.GroupingID
AND s.TallRank = 2
WHERE t.TallRank = 1

UNION

SELECT t.PersonID,
s.PersonID,
0 AS IsReverse
FROM dbo.SampleData AS t
INNER JOIN dbo.SampleData AS s ON s.GroupingID = t.GroupingID
AND s.TallRank = 3
WHERE t.TallRank = 1

UNION

SELECT t.PersonID,
s.PersonID,
0 AS IsReverse
FROM dbo.SampleData AS t
INNER JOIN dbo.SampleData AS s ON s.GroupingID = t.GroupingID
AND s.TallRank = 3
WHERE t.TallRank = 2

UNION

SELECT PersonID,
PersonID,
2 AS IsReverse
FROM dbo.SampleData
WHERE TallRank = 1

INSERT @Map
(
TallerPersonID,
ShorterPersonID,
IsReverse
)
SELECT ShorterPersonID,
TallerPersonID,
1 AS IsReverse
FROM @Map

EXCEPT

SELECT TallerPersonID,
ShorterPersonID,
1 AS IsReverse
FROM @Map

WHILE @@ROWCOUNT > 0
INSERT @Map
(
TallerPersonID,
ShorterPersonID,
IsReverse
)
SELECT DISTINCT
m1.TallerPersonID,
m2.ShorterPersonID,
3 AS IsReverse
FROM @Map AS m1
INNER JOIN @Map AS m2 ON m2.TallerPersonID = m1.ShorterPersonID
WHERE NOT EXISTS (
SELECT *
FROM @Map AS x
WHERE x.TallerPersonID = m1.TallerPersonID
AND x.ShorterPersonID = m2.ShorterPersonID
)

;WITH cteConnected(PersonID, theGroup)
AS (
SELECT TallerPersonID,
MIN(ShorterPersonID)
FROM @Map
GROUP BY TallerPersonID
)
SELECT @Connected = COUNT(*)
FROM cteConnected
WHERE PersonID IN (@TallerPersonID, @ShorterPersonID)
GROUP BY theGroup
HAVING COUNT(*) = 2

IF @Connected IS NULL
RETURN NULL
ELSE
DELETE
FROM @Map
WHERE IsReverse > 0

;WITH cteTree(PersonID, TallPath)
AS (
SELECT m.ShorterPersonID AS PersonID,
CAST('\' + CAST(m.TallerPersonID AS VARCHAR(32)) + '\\' + CAST(m.ShorterPersonID AS VARCHAR(32)) + '\' AS VARCHAR(MAX)) AS TallPath
FROM @Map AS m
WHERE NOT EXISTS (SELECT * FROM @Map AS w WHERE w.ShorterPersonID = m.TallerPersonID)

UNION ALL

SELECT m.ShorterPersonID,
CAST(t.TallPath + '\' + CAST(m.ShorterPersonID AS VARCHAR(32)) + '\' AS VARCHAR(MAX)) AS TallPath
FROM cteTree AS t
INNER JOIN @Map AS m ON m.TallerPersonID = t.PersonID
)
SELECT @Connected = SIGN(COUNT(*))
FROM cteTree
WHERE TallPath LIKE '%\' + CAST(@TallerPersonID AS VARCHAR(32)) + '\%\' + CAST(@ShorterPersonID AS VARCHAR(32)) + '\%'

RETURN @Connected
END
GO

SELECT dbo.fnIsTaller(507058762, 1278733128)
-- NULL No solution possible
-- 1 TallerPersonID is taller than ShorterPersonID
-- 0 TallerPersonID is shorter than ShorterPersonID



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2011-09-22 : 22:24:30
SwePeso, if T-Sql was baseball I'd request you were tested for steroids! It's not normal how good you are!


Success is 10% Intelligence, 70% Determination, and 22% Stupidity.
\_/ _/ _/\_/ _/\_/ _/ _/- 881
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-09-23 : 06:45:33
Thank you.

Does the function work out as yout thought? And fast enough?


N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2011-09-23 : 15:10:48
Yes, it's fast, correct, and flipping GREAT!! The only modifications I did to it were minor tweaks to better use in my exact scenario(Having the table data rather than a bit was a little better for me), but as far as the method, and queries go it is perfect!!!!

Again thanks!

There is one small issue not with the query, but with the data. If there is a entry error with data, and one hierarchy shows the person as taller, yet another shows it as smaller, the function then does an inifinite loop.

It's a easy fix since all I need to do is run the function on the values before inserting them, but is that the best approach?

Ideally I'd like a way where the function could identify this as value 3 in the taller column returned and didn't do an endless loop/recursion in those scenarios, but again it's minor since I can just run the function on the data prior to inserting each set and correct before it ever reaches a table.

Thanks.

here is a sample that will cause a endless loop. The value 9999999 is both higher and lower than 11.


drop table SampleData
select *
into SampleData
from
(
Select 33 as groupingid, 11 as personID, 1 as tallrank Union all
Select 33, 22, 2 Union all
Select 33, 33, 3 Union all
Select 35, 11, 2 Union all
Select 35, 77, 1 Union all
Select 35, 88, 3 Union all
Select 36, 33333, 3 Union all
Select 36, 22222, 2 Union all
Select 36, 88, 1 Union all
Select 38, 99999999, 3 Union all
Select 38, 22222, 1 Union all
Select 38, 77777777, 2 Union all
Select 39, 99999999, 2 Union all
Select 39, 22222994, 1 Union all
Select 39, 98765432, 3 Union all
Select 40, 87676623, 3 Union all
Select 40, 88, 1 Union all
Select 40, 1234567, 2

Union all
Select 42, 99999999, 1 Union all
Select 42, 11, 2 Union all
Select 42, 13219323, 3

) a
go

drop function dbo.fnIsTallertable
go
CREATE FUNCTION dbo.fnIsTallertable
(
@TallerPersonID BIGINT,
@ShorterPersonID BIGINT
)
RETURNS @t table (PersonID int, TallerPath varchar(max),taller int,depth int)
AS
BEGIN
DECLARE @Connected INT

DECLARE @Map TABLE
(
TallerPersonID BIGINT NOT NULL,
ShorterPersonID BIGINT NOT NULL,
PRIMARY KEY CLUSTERED
(
TallerPersonID,
ShorterPersonID
),
IsReverse TINYINT NOT NULL
)

INSERT @Map
(
TallerPersonID,
ShorterPersonID,
IsReverse
)
SELECT t.PersonID,
s.PersonID,
0 AS IsReverse
FROM dbo.SampleData AS t
INNER JOIN dbo.SampleData AS s ON s.GroupingID = t.GroupingID
AND s.TallRank = 2
WHERE t.TallRank = 1

UNION

SELECT t.PersonID,
s.PersonID,
0 AS IsReverse
FROM dbo.SampleData AS t
INNER JOIN dbo.SampleData AS s ON s.GroupingID = t.GroupingID
AND s.TallRank = 3
WHERE t.TallRank = 1

UNION

SELECT t.PersonID,
s.PersonID,
0 AS IsReverse
FROM dbo.SampleData AS t
INNER JOIN dbo.SampleData AS s ON s.GroupingID = t.GroupingID
AND s.TallRank = 3
WHERE t.TallRank = 2

UNION

SELECT PersonID,
PersonID,
2 AS IsReverse
FROM dbo.SampleData
WHERE TallRank = 1

INSERT @Map
(
TallerPersonID,
ShorterPersonID,
IsReverse
)
SELECT ShorterPersonID,
TallerPersonID,
1 AS IsReverse
FROM @Map

EXCEPT

SELECT TallerPersonID,
ShorterPersonID,
1 AS IsReverse
FROM @Map

WHILE @@ROWCOUNT > 0
INSERT @Map
(
TallerPersonID,
ShorterPersonID,
IsReverse
)
SELECT DISTINCT
m1.TallerPersonID,
m2.ShorterPersonID,
3 AS IsReverse
FROM @Map AS m1
INNER JOIN @Map AS m2 ON m2.TallerPersonID = m1.ShorterPersonID
WHERE NOT EXISTS (
SELECT *
FROM @Map AS x
WHERE x.TallerPersonID = m1.TallerPersonID
AND x.ShorterPersonID = m2.ShorterPersonID
)

;WITH cteConnected(PersonID, theGroup)
AS (
SELECT TallerPersonID,
MIN(ShorterPersonID)
FROM @Map
GROUP BY TallerPersonID
)
SELECT @Connected = COUNT(*)
FROM cteConnected
WHERE PersonID IN (@TallerPersonID, @ShorterPersonID)
GROUP BY theGroup
HAVING COUNT(*) = 2

IF @Connected IS NULL
RETURN
ELSE
DELETE
FROM @Map
WHERE IsReverse > 0

;WITH cteTree(PersonID, TallPath,depth)
AS (
SELECT m.ShorterPersonID AS PersonID,
CAST('\' + CAST(m.TallerPersonID AS VARCHAR(32)) + '\\' + CAST(m.ShorterPersonID AS VARCHAR(32)) + '\' AS VARCHAR(MAX)) AS TallPath

,0
FROM @Map AS m
WHERE NOT EXISTS (SELECT * FROM @Map AS w WHERE w.ShorterPersonID = m.TallerPersonID)

UNION ALL

SELECT m.ShorterPersonID,
CAST(t.TallPath + '\' + CAST(m.ShorterPersonID AS VARCHAR(32)) + '\' AS VARCHAR(MAX)) AS TallPath
,depth + 1
FROM cteTree AS t
INNER JOIN @Map AS m ON m.TallerPersonID = t.PersonID
)
insert into @t(Personid,tallerpath,taller ,depth )
SELECT PersonID,TallPath ,
case when TallPath LIKE '%\' + CAST(@TallerPersonID AS VARCHAR(32)) + '\%\' + CAST(@ShorterPersonID AS VARCHAR(32)) + '\%' then 1
else 0 end
,depth
FROM cteTree
WHERE TallPath LIKE '%\' + CAST(@TallerPersonID AS VARCHAR(32)) + '\%\' + CAST(@ShorterPersonID AS VARCHAR(32)) + '\%'
or TallPath LIKE '%\' + CAST(@ShorterPersonID AS VARCHAR(32)) + '\%\' + CAST(@TallerPersonID AS VARCHAR(32)) + '\%'

return
END

GO
--T * from dbo.fnIsTallertable(77, 11)
SELECT * from dbo.fnIsTallertable(11,99999999 )




Success is 10% Intelligence, 70% Determination, and 22% Stupidity.
\_/ _/ _/\_/ _/\_/ _/ _/- 881
Go to Top of Page

Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2011-09-25 : 20:02:35
One more thing, I can not figure out what the last Union does? Can you please explain why this is needed?


UNION
SELECT PersonID,
PersonID,
2 AS IsReverse
FROM dbo.SampleData
WHERE TallRank = 1



Success is 10% Intelligence, 70% Determination, and 22% Stupidity.
\_/ _/ _/\_/ _/\_/ _/ _/- 881
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-09-26 : 03:11:55
It is to make sure that there is a two-way relationsship. It is necessary to get the correct result for tallness comparison.
If there is not relationsship at all between the two persons, the function returns NULL which means it is not possible to tell because there isn't enough information.
Then we remove all "two-way" and reverse information to tell if the taller person really is taller than the shorter person. Or not.


N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2011-09-26 : 14:15:39
Thanks!

I made a couple modifications that are giving me slightly better performance on my data by eliminating that first check and only do so after the results are retrieved.
1.This should show a link in either direction
2.The taller column will show a 1 if it's taller
3.if nothing is returned there is no link.

I can see this query lagging on certain sets of data, but when I run in sample scenarios it is outperforming than when I include the check for the linking data prior to the loop from all the extra steps that are required to do a quick check.

Thanks again for all the help!


drop table dbo.SampleData
select *
into dbo.SampleData
from
(
Select 33 as groupingid, 11 as personID, 1 as tallrank Union all
Select 33, 22, 2 Union all
Select 33, 33, 3 Union all
Select 35, 11, 2 Union all
Select 35, 77, 1 Union all
Select 35, 88, 3 Union all
Select 36, 33333, 3 Union all
Select 36, 22222, 2 Union all
Select 36, 88, 1 Union all
Select 99, 99999999,1 Union all
Select 99, 11, 2 Union all
Select 99, 1454545454, 3 Union all
Select 40, 87676623, 3 Union all
Select 40, 88, 1 Union all
Select 40, 1234567, 2

) a
go


drop function dbo.fnIsTallertable
go


go
CREATE FUNCTION dbo.fnIsTallertable
(
@TallerPersonID BIGINT,
@ShorterPersonID BIGINT
)
RETURNS @t table (PersonID bigint, personid2 bigint,TallPath varchar(max),taller int,cost int)
AS
BEGIN
DECLARE @Connected INT

DECLARE @Map TABLE
(
TallerPersonID BIGINT NOT NULL,
ShorterPersonID BIGINT NOT NULL,
PRIMARY KEY CLUSTERED
(
TallerPersonID,
ShorterPersonID
),
IsReverse TINYINT NOT NULL
)

INSERT @Map
(
TallerPersonID,
ShorterPersonID,
IsReverse
)
SELECT t.PersonID,
s.PersonID,
0 AS IsReverse
FROM dbo.SampleData AS t
INNER JOIN dbo.SampleData AS s ON s.GroupingID = t.GroupingID
AND s.TallRank = 2
WHERE t.TallRank = 1

UNION

SELECT t.PersonID,
s.PersonID,
0 AS IsReverse
FROM dbo.SampleData AS t
INNER JOIN dbo.SampleData AS s ON s.GroupingID = t.GroupingID
AND s.TallRank = 3
WHERE t.TallRank = 1

UNION

SELECT t.PersonID,
s.PersonID,
0 AS IsReverse
FROM dbo.SampleData AS t
INNER JOIN dbo.SampleData AS s ON s.GroupingID = t.GroupingID
AND s.TallRank = 3
WHERE t.TallRank = 2

;WITH cteTree(PersonID, personid2,TallPath,taller,cost)
AS (
SELECT m.ShorterPersonID AS PersonID
,m.tallerpersonid
,CAST('\' + CAST(m.TallerPersonID AS VARCHAR(32)) + '\\' + CAST(m.ShorterPersonID AS VARCHAR(32)) + '\' AS VARCHAR(MAX)) AS TallPath
,1
,0
FROM @Map AS m
WHERE NOT EXISTS (SELECT * FROM @Map AS w WHERE w.ShorterPersonID = m.TallerPersonID)
and m.Tallerpersonid in( @tallerPersonid,@shorterpersonid)

UNION ALL

SELECT m.ShorterPersonID
,m.tallerpersonid
,CAST(t.TallPath + '\' + CAST(m.ShorterPersonID AS VARCHAR(32)) + '\' AS VARCHAR(MAX)) AS TallPath
,1
,t.cost + 1
FROM cteTree AS t
INNER JOIN @Map AS m ON m.TallerPersonID = t.PersonID

)
insert into @t(PersonID, personid2,TallPath,taller,cost)
SELECT PersonID,personid2,TallPath,case when TallPath LIKE '%\' + CAST(@TallerPersonID AS VARCHAR(32)) + '\%\' + CAST(@ShorterPersonID AS VARCHAR(32)) + '\%' then 1 else 0 end
,cost
FROM cteTree
WHERE TallPath LIKE '%\' + CAST(@TallerPersonID AS VARCHAR(32)) + '\%\' + CAST(@ShorterPersonID AS VARCHAR(32)) + '\%'
or TallPath LIKE '%\' + CAST(@ShorterPersonID AS VARCHAR(32)) + '\%\' + CAST(@TallerPersonID AS VARCHAR(32)) + '\%'
option(maxrecursion 100)
return
END
GO
SELECT * from dbo.fnIsTallertable(99999999,33333)



Success is 10% Intelligence, 70% Determination, and 22% Stupidity.
\_/ _/ _/\_/ _/\_/ _/ _/- 881
Go to Top of Page
    Next Page

- Advertisement -