Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Extracting Circular Relationships from a table

Author  Topic 

mchandrav
Starting Member

7 Posts

Posted - 2011-05-24 : 14:36:54
Hi All

I am working on a SQL query to extract relationships such as A --> B --> C from a table and need help with the logic.

General example:

Table 1 stores application names with unique identifiers

Record # Name GUID
1 A1.Name A1.GUID
2 A2.Name A2.GUID
3 A3.Name A3.GUID
4 A4.Name A4.GUID
5 A5.Name A5.GUID

Table 2 stores relations between two applications using GUIDs

Record # Source Destination
1 A1.GUID A2.GUID
2 A2.GUID A3.GUID
3 A3.GUID A1.GUID
4 A2.GUID A3.GUID

Now, we need a SQL Query that can extract the data in the following format.

Record # Source Destination SourceGUID Destination GUID
1 A1.Name A2.Name A1.GUID A2.GUID
2 A2.Name A3.Name A2.GUID A3.GUID
3 A3.Name A1.Name A3.GUID A1.GUID

Any help is greatly appreciated.

Thanks in advance
Mahesh

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-05-24 : 15:27:51
Don't know if the fact that there are circular relationships in the logic matters, but would this work for you?

SELECT
t2.RecordNum,
t1a.Name AS Source,
t1b.Name AS Destination,
t2.Source AS SourceGUID,
t2.Destination AS DestinationGUID
FROM
Table2 t2
INNER JOIN Table1 t1A
ON t1A.GUID = t2.Source
INNER JOIN Table1 t1B
ON t1B.GUID = t2.Destination
Go to Top of Page

mchandrav
Starting Member

7 Posts

Posted - 2011-05-24 : 18:59:23
Hi Sunita

Thanks for the reply. The above query is not going to work as it returns all rows.

Mahesh
Go to Top of Page

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-05-24 : 20:30:04
Ah, I see what you meant by circular relationships. If it were guaranteed that there would be only one path from beginning to end, it would have been relatively simple. But the way you have it where there are multiple paths, I can't think of an easy way to do this.

Even multiple paths would have been easy enough, but if there are branches going off - for example, if you insert additional rows into your table2 like the 3 shown below, I can't think of a reliable way to identify the loops

5 A2.GUID A8.GUID
6 A8.GUID A9.GUID
5 A9.GUID A0.GUID
Are you able to specify any constraints on the paths that can exist?
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-05-25 : 03:50:37
Are you saying that the relationship A1.GUID-A2.GUID is the same as A2.GUID-A1.GUID and you only want distinct relationships?

- Lumbago
My blog-> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/
Go to Top of Page

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2011-05-25 : 11:18:07
I suspect that your description is over simplified, so perhaps more sample data is required. But expanding on Sunitabeck's example, you could just use a ranking function to get the expected output you posted. This is un tested as the sampe data is not in a format that I can easily consume, but maybe it'll get you going:
SELECT
t2.RecordNum,
t1a.Name AS Source,
t1b.Name AS Destination,
t2.Source AS SourceGUID,
t2.Destination AS DestinationGUID
FROM
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY Source, Destination ORDER BY RecordNum) AS RowNum
FROM
Table2
) AS t2
INNER JOIN Table1 t1A
ON t1A.GUID = t2.Source
INNER JOIN Table1 t1B
ON t1B.GUID = t2.Destination
WHERE
RowNum = 1
Go to Top of Page

mchandrav
Starting Member

7 Posts

Posted - 2011-05-25 : 12:25:52
Firstly, Thanks to everyone for your feedback and valuable input.

Each row in the relationship table stores relations between two items (Example: A - B, B - C, C- D, D - E, E - F). Now we need a query that would extract relations which are circular. example A - B, B - C and C - A are circular. We have a utility that leverages this data and that simply crashes as it loops over and over because of these circular relations.

After reading more about SQL Server, I came to know that this is possible using recursive CTE but cannot get it working.
Go to Top of Page

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2011-05-25 : 12:34:02
You might want to search on Circular Refernce to see what you find. The basic gist is you need to create a path (Materialized Path? not 100% sure if thats the term). Then you can use that path to determine if you are looping back on yourself.

If I have time today I might be able to code something up, but here is a link that might help:
http://www.sqlservercentral.com/articles/CTE/72192/
Go to Top of Page

mchandrav
Starting Member

7 Posts

Posted - 2011-05-26 : 16:29:44
I tried the query mentioned in the below URL but the results seem to be way too different than what I want.

http://www.sqlservercentral.com/articles/CTE/72192/

Go to Top of Page

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2011-05-26 : 17:51:43
I don't think your data is modeled correctly. But, here is some code based on the article I provided above that traverses the "hierarchy" and identifies circular relations:
DECLARE @Table2 TABLE (RecordNum INT, Source VARCHAR(10), Destination VARCHAR(10))

INSERT @Table2 (RecordNum, Source, Destination)
VALUES
(1, 'A1.GUID', 'A2.GUID'),
(2, 'A2.GUID', 'A3.GUID'),
(3, 'A3.GUID', 'A1.GUID'),
(4, 'A2.GUID', 'A3.GUID'),
(5, 'A3.GUID', 'A4.GUID'),
(6, 'A3.GUID', 'A2.GUID')
;
WITH Cte
AS
(
-- Anchor member
SELECT
RecordNum, Source, Destination, CAST(Source + '\' + Destination AS VARCHAR(MAX)) AS HierarchyPath, 0 AS Level
FROM
@Table2 AS A
WHERE
RecordNum = 1

UNION ALL

-- Recursive member
SELECT
A.RecordNum, A.Source, A.Destination, CAST(B.HierarchyPath + '\' + A.Destination AS VARCHAR(MAX)), Level + 1
FROM
@Table2 AS A
INNER JOIN
Cte AS B
ON A.Source = B.Destination
WHERE
B.HierarchyPath NOT LIKE '%' + A.Source + '\%'
)

SELECT
*,
CASE
WHEN HierarchyPath LIKE '%' + Destination + '\%'
THEN 1
ELSE 0
END AS CircularReference
FROM Cte
Probably not very helpfull as there is no real way to identify the Anchor members to start the hierarchy other than hard-coding it.
Go to Top of Page

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-05-26 : 17:54:47
I tried to work on it when you posted this, but then moved on to other things. This is not complete, but it does identify the correct loops. It identifies them multiple times - as many times as there are nodes. All of that can be fixed and the output generated in the form you asked for.

This would have been much simpler if some constraints could be specified, for example, that there would be no parallel paths, or that there would not be duplicate source-destination pairs.

In any case, here is my half-baked attempt. You can copy the code and run it to see what it is doing.
create table dbo.#tmpTable1(record int, [name] varchar(32), [guid] varchar(32));
create table dbo.#tmpTable2(record int, [source] varchar(32), [destination] varchar(32));

insert into #tmpTable1 values ('1',' A1.Name',' A1.GUID');
insert into #tmpTable1 values ('2',' A2.Name',' A2.GUID');
insert into #tmpTable1 values ('3',' A3.Name',' A3.GUID');
insert into #tmpTable1 values ('4',' A4.Name',' A4.GUID');
insert into #tmpTable1 values ('5',' A5.Name',' A5.GUID');
insert into #tmpTable1 values ('6',' A8.Name',' A8.GUID');
insert into #tmpTable1 values ('7',' A9.Name',' A9.GUID');
insert into #tmpTable1 values ('8',' A0.Name',' A0.GUID');
insert into #tmpTable1 values ('9',' A11.Name',' A11.GUID');
insert into #tmpTable1 values ('10',' A12.Name',' A12.GUID');


insert into #tmpTable2 values ('1',' A1.GUID',' A2.GUID');
insert into #tmpTable2 values ('2',' A2.GUID',' A3.GUID');
insert into #tmpTable2 values ('3',' A3.GUID',' A1.GUID');
insert into #tmpTable2 values ('4',' A2.GUID',' A3.GUID');
insert into #tmpTable2 values ('6',' A2.GUID',' A8.GUID');
insert into #tmpTable2 values ('7',' A8.GUID',' A9.GUID');
insert into #tmpTable2 values ('8',' A9.GUID',' A0.GUID');

-- A parallel path. If such things could be ruled out, the code
-- could have been much simpler.
insert into #tmpTable2 values ('9',' A2.GUID',' A11.GUID');
insert into #tmpTable2 values ('10',' A11.GUID',' A12.GUID');
insert into #tmpTable2 values ('11',' A12.GUID',' A3.GUID');


with cte1 AS -- cte1: remove all duplicates
(
select
min(record) as record,
[source],
[destination]
from
#tmpTable2
group by
[source],[destination]
),
cte2(record,gstart,gend,cstart,found,pathid,path) as -- cte2: find all paths
(
select
record,
[source],
[destination],
[source],
0,
row_number() over( order by (select null)),
cast(record as varchar(max))
from
cte1

union all

select
t2.record,
t2.[source],
t2.[destination],
c1.cstart,
case when c1.cstart = t2.destination then 1 else 0 end,
c1.pathid,
cast(t2.record as varchar(31))+':'+c1.path +
case when c1.cstart = t2.destination then ':'+cast(t2.record as varchar(31)) else '' END
--cast(t2.record as varchar(31))+':'+c1.path
from
cte1 t2
inner join cte2 c1 on
c1.gend = t2.[source]
where
c1.found = 0
and ':'+path+':' not like '%:'+cast(t2.record as varchar(31))+':%'
),
cte3 as -- pick only closed loops
(
select
*
from
cte2 a
where exists
(
select *
from cte2 b
where b.found = 1 and b.path like '%'+a.path+'%'
)
)
,
cte4 as -- pick distinct paths
(
select *
from cte3 ca
where not exists (select * from cte3 cb where cb.path LIKE ca.path+'%' AND ca.path <> cb.path)
)
select path from cte4 -- but there are still more duplicate paths, which can be eliminated.

drop table #tmpTable1;
drop table #tmpTable2;
Go to Top of Page

mchandrav
Starting Member

7 Posts

Posted - 2011-05-27 : 02:28:39
Hi Sunitha and Lamprey

I really thank you for your valuable input.

Lamprey - I am still playing around with your query as I have some syntax issues (Using SQL Server).

Sunitha - I think we are close, just need a little tweak to the query. So, this is the result of your query.

1:3:2:1
10:9:1:3:11:10
3:2:1:3
3:11:10:9:1:3
9:1:3:11:10:9
1:3:11:10:9:1
2:1:3:2
11:10:9:1:3:11

It pulls the record int from the relations table. Is it possible the get the results in this way.

1. A1.Name --> A2.Name : A3.Name --> A1.Name : A2.Name --> A3.Name : A1.Name --> A2.Name
2. A11.Name --> A12.Name : A2.Name --> A11.Name : A1.Name --> A2.Name : A3.Name --> A1.Name : A12.Name --> A2.Name : A11.Name --> A12.Name

Basically interpreting the records from tmptable2 with the actual names stored in tmptable1.

Thanks
Mahesh
Go to Top of Page

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-05-27 : 07:40:07
If you are not concerned about the duplicate paths, getting the relationships as names instead of id's is only a minor change. See below.

create table dbo.#tmpTable1(record int, [name] varchar(32), [guid] varchar(32));
create table dbo.#tmpTable2(record int, [source] varchar(32), [destination] varchar(32));

insert into #tmpTable1 values ('1',' A1.Name',' A1.GUID');
insert into #tmpTable1 values ('2',' A2.Name',' A2.GUID');
insert into #tmpTable1 values ('3',' A3.Name',' A3.GUID');
insert into #tmpTable1 values ('4',' A4.Name',' A4.GUID');
insert into #tmpTable1 values ('5',' A5.Name',' A5.GUID');
insert into #tmpTable1 values ('6',' A8.Name',' A8.GUID');
insert into #tmpTable1 values ('7',' A9.Name',' A9.GUID');
insert into #tmpTable1 values ('8',' A0.Name',' A0.GUID');
insert into #tmpTable1 values ('9',' A11.Name',' A11.GUID');
insert into #tmpTable1 values ('10',' A12.Name',' A12.GUID');


insert into #tmpTable2 values ('1',' A1.GUID',' A2.GUID');
insert into #tmpTable2 values ('2',' A2.GUID',' A3.GUID');
insert into #tmpTable2 values ('3',' A3.GUID',' A1.GUID');
insert into #tmpTable2 values ('4',' A2.GUID',' A3.GUID');
insert into #tmpTable2 values ('6',' A2.GUID',' A8.GUID');
insert into #tmpTable2 values ('7',' A8.GUID',' A9.GUID');
insert into #tmpTable2 values ('8',' A9.GUID',' A0.GUID');

-- A parallel path. If such things could be ruled out, the code
-- could have been much simpler.
insert into #tmpTable2 values ('9',' A2.GUID',' A11.GUID');
insert into #tmpTable2 values ('10',' A11.GUID',' A12.GUID');
insert into #tmpTable2 values ('11',' A12.GUID',' A3.GUID');


with cte1 AS -- cte1: remove all duplicates
(
select
min(record) as record,
[source],
[destination]
from
#tmpTable2
group by
[source],[destination]
),
cte2(record,gstart,gend,cstart,found,pathid,path) as -- cte2: find all paths
(
select
c1.record,
[source],
[destination],
[source],
0,
row_number() over( order by (select null)),
cast(t1.Name as varchar(max))
from
cte1 c1
INNER JOIN #tmpTable1 t1 ON t1.[guid] = c1.[source]

union all

select
t2.record,
t2.[source],
t2.[destination],
c1.cstart,
case when c1.cstart = t2.destination then 1 else 0 end,
c1.pathid,
cast(t1.Name as varchar(31))+'->'+c1.path +
case when c1.cstart = t2.destination then '->'+cast(t1.Name as varchar(31)) else '' END
--cast(t2.record as varchar(31))+'->'+c1.path
from
cte1 t2
inner join cte2 c1 on
c1.gend = t2.[source]
INNER JOIN #tmpTable1 t1 ON t1.[guid] = c1.gend
where
c1.found = 0
and '->'+path+'->' not like '%->'+cast(t1.Name as varchar(31))+'->%'
),
cte3 as -- pick only closed loops
(
select
*
from
cte2 a
where exists
(
select *
from cte2 b
where b.found = 1 and b.path like '%'+a.path+'%'
)
)
,
cte4 as -- pick distinct paths
(
select *
from cte3 ca
where not exists (select * from cte3 cb where cb.path LIKE ca.path+'%' AND ca.path <> cb.path)
)
select path from cte4 -- but there are still more duplicate paths, which can be eliminated.

drop table #tmpTable1;
drop table #tmpTable2;
Go to Top of Page

mchandrav
Starting Member

7 Posts

Posted - 2011-05-27 : 14:20:25
Hi Sunita

Thanks a lot. That query worked.


Regards
Mahesh
Go to Top of Page
   

- Advertisement -