Extracting Circular Relationships from a table

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2008 Forums

Transact-SQL (2008)

Extracting Circular Relationships from a table

Author

Topic

mchandrav
Starting Member

7 Posts

Posted - 2011-05-24 : 14:36:54

Hi All

I am working on a SQL query to extract relationships such as A --> B --> C from a table and need help with the logic.

General example:

Table 1 stores application names with unique identifiers

Record # Name GUID
1 A1.Name A1.GUID
2 A2.Name A2.GUID
3 A3.Name A3.GUID
4 A4.Name A4.GUID
5 A5.Name A5.GUID

Table 2 stores relations between two applications using GUIDs

Record # Source Destination
1 A1.GUID A2.GUID
2 A2.GUID A3.GUID
3 A3.GUID A1.GUID
4 A2.GUID A3.GUID

Now, we need a SQL Query that can extract the data in the following format.

Record # Source Destination SourceGUID Destination GUID
1 A1.Name A2.Name A1.GUID A2.GUID
2 A2.Name A3.Name A2.GUID A3.GUID
3 A3.Name A1.Name A3.GUID A1.GUID

Any help is greatly appreciated.

Thanks in advance
Mahesh

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-05-24 : 15:27:51

Don't know if the fact that there are circular relationships in the logic matters, but would this work for you?


SELECT
	t2.RecordNum,
	t1a.Name AS Source,
	t1b.Name AS Destination,
	t2.Source AS SourceGUID,
	t2.Destination AS DestinationGUID
FROM
	Table2 t2
	INNER JOIN Table1 t1A
		ON t1A.GUID = t2.Source
	INNER JOIN Table1 t1B
		ON t1B.GUID = t2.Destination

mchandrav
Starting Member

7 Posts

Posted - 2011-05-24 : 18:59:23

Hi Sunita

Thanks for the reply. The above query is not going to work as it returns all rows.

Mahesh

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-05-24 : 20:30:04

Ah, I see what you meant by circular relationships. If it were guaranteed that there would be only one path from beginning to end, it would have been relatively simple. But the way you have it where there are multiple paths, I can't think of an easy way to do this.

Even multiple paths would have been easy enough, but if there are branches going off - for example, if you insert additional rows into your table2 like the 3 shown below, I can't think of a reliable way to identify the loops


5   A2.GUID   A8.GUID 
6   A8.GUID   A9.GUID 
5   A9.GUID   A0.GUID

Are you able to specify any constraints on the paths that can exist?

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-05-25 : 03:50:37

Are you saying that the relationship A1.GUID-A2.GUID is the same as A2.GUID-A1.GUID and you only want distinct relationships?

- Lumbago
My blog-> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2011-05-25 : 11:18:07

I suspect that your description is over simplified, so perhaps more sample data is required. But expanding on Sunitabeck's example, you could just use a ranking function to get the expected output you posted. This is un tested as the sampe data is not in a format that I can easily consume, but maybe it'll get you going:

SELECT
	t2.RecordNum,
	t1a.Name AS Source,
	t1b.Name AS Destination,
	t2.Source AS SourceGUID,
	t2.Destination AS DestinationGUID
FROM
(
    SELECT 
        *, 
        ROW_NUMBER() OVER (PARTITION BY Source, Destination ORDER BY RecordNum) AS RowNum
    FROM
        Table2
) AS t2
	INNER JOIN Table1 t1A
		ON t1A.GUID = t2.Source
	INNER JOIN Table1 t1B
		ON t1B.GUID = t2.Destination
WHERE
    RowNum = 1

mchandrav
Starting Member

7 Posts

Posted - 2011-05-25 : 12:25:52

Firstly, Thanks to everyone for your feedback and valuable input.

Each row in the relationship table stores relations between two items (Example: A - B, B - C, C- D, D - E, E - F). Now we need a query that would extract relations which are circular. example A - B, B - C and C - A are circular. We have a utility that leverages this data and that simply crashes as it loops over and over because of these circular relations.

After reading more about SQL Server, I came to know that this is possible using recursive CTE but cannot get it working.

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2011-05-25 : 12:34:02

You might want to search on Circular Refernce to see what you find. The basic gist is you need to create a path (Materialized Path? not 100% sure if thats the term). Then you can use that path to determine if you are looping back on yourself.

If I have time today I might be able to code something up, but here is a link that might help:
http://www.sqlservercentral.com/articles/CTE/72192/

mchandrav
Starting Member

7 Posts

Posted - 2011-05-26 : 16:29:44

I tried the query mentioned in the below URL but the results seem to be way too different than what I want.

http://www.sqlservercentral.com/articles/CTE/72192/

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2011-05-26 : 17:51:43

I don't think your data is modeled correctly. But, here is some code based on the article I provided above that traverses the "hierarchy" and identifies circular relations:

DECLARE @Table2 TABLE (RecordNum INT, Source VARCHAR(10), Destination VARCHAR(10))

INSERT @Table2 (RecordNum, Source, Destination)
VALUES
(1, 'A1.GUID', 'A2.GUID'),
(2, 'A2.GUID', 'A3.GUID'),
(3, 'A3.GUID', 'A1.GUID'),
(4, 'A2.GUID', 'A3.GUID'),
(5, 'A3.GUID', 'A4.GUID'),
(6, 'A3.GUID', 'A2.GUID')
;
WITH Cte
AS
(
    -- Anchor member
    SELECT
        RecordNum, Source, Destination,  CAST(Source + '\' + Destination AS VARCHAR(MAX)) AS HierarchyPath, 0 AS Level
    FROM
        @Table2 AS A
    WHERE
        RecordNum = 1       

    UNION ALL

    -- Recursive member
    SELECT
        A.RecordNum, A.Source, A.Destination, CAST(B.HierarchyPath + '\' + A.Destination AS VARCHAR(MAX)), Level + 1
    FROM
        @Table2 AS A
    INNER JOIN  
        Cte AS B
        ON A.Source = B.Destination        
    WHERE
        B.HierarchyPath NOT LIKE '%' + A.Source + '\%'
)

SELECT 
    *, 
    CASE 
        WHEN HierarchyPath LIKE '%' + Destination + '\%' 
            THEN 1 
        ELSE 0 
    END AS CircularReference
FROM Cte

Probably not very helpfull as there is no real way to identify the Anchor members to start the hierarchy other than hard-coding it.

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-05-26 : 17:54:47

I tried to work on it when you posted this, but then moved on to other things. This is not complete, but it does identify the correct loops. It identifies them multiple times - as many times as there are nodes. All of that can be fixed and the output generated in the form you asked for.

This would have been much simpler if some constraints could be specified, for example, that there would be no parallel paths, or that there would not be duplicate source-destination pairs.

In any case, here is my half-baked attempt. You can copy the code and run it to see what it is doing.

create table dbo.#tmpTable1(record int, [name] varchar(32), [guid] varchar(32));
create table dbo.#tmpTable2(record int, [source] varchar(32), [destination] varchar(32));

insert into #tmpTable1 values ('1',' A1.Name',' A1.GUID');
insert into #tmpTable1 values ('2',' A2.Name',' A2.GUID');
insert into #tmpTable1 values ('3',' A3.Name',' A3.GUID');
insert into #tmpTable1 values ('4',' A4.Name',' A4.GUID');
insert into #tmpTable1 values ('5',' A5.Name',' A5.GUID');
insert into #tmpTable1 values ('6',' A8.Name',' A8.GUID');
insert into #tmpTable1 values ('7',' A9.Name',' A9.GUID');
insert into #tmpTable1 values ('8',' A0.Name',' A0.GUID');
insert into #tmpTable1 values ('9',' A11.Name',' A11.GUID');
insert into #tmpTable1 values ('10',' A12.Name',' A12.GUID');


insert into #tmpTable2 values ('1',' A1.GUID',' A2.GUID');
insert into #tmpTable2 values ('2',' A2.GUID',' A3.GUID');
insert into #tmpTable2 values ('3',' A3.GUID',' A1.GUID');
insert into #tmpTable2 values ('4',' A2.GUID',' A3.GUID');
insert into #tmpTable2 values ('6',' A2.GUID',' A8.GUID');
insert into #tmpTable2 values ('7',' A8.GUID',' A9.GUID');
insert into #tmpTable2 values ('8',' A9.GUID',' A0.GUID');

-- A parallel path. If such things could be ruled out, the code
-- could have been much simpler.
insert into #tmpTable2 values ('9',' A2.GUID',' A11.GUID');
insert into #tmpTable2 values ('10',' A11.GUID',' A12.GUID');
insert into #tmpTable2 values ('11',' A12.GUID',' A3.GUID');   


with cte1 AS -- cte1: remove all duplicates 
(
   select
      min(record) as record, 
      [source],
      [destination] 
   from 
      #tmpTable2
   group by 
      [source],[destination]
),
cte2(record,gstart,gend,cstart,found,pathid,path) as -- cte2: find all paths
(
   select
      record,
      [source],
      [destination],
      [source],
      0,
      row_number() over( order by (select null)),
      cast(record as varchar(max))
   from
      cte1
      
   union all
   
   select
      t2.record,
      t2.[source],
      t2.[destination],
      c1.cstart,
      case when c1.cstart = t2.destination then 1 else 0 end,
      c1.pathid,
      cast(t2.record as varchar(31))+':'+c1.path +
         case when c1.cstart = t2.destination then ':'+cast(t2.record as varchar(31)) else '' END
      --cast(t2.record as varchar(31))+':'+c1.path
   from
      cte1 t2
      inner join cte2 c1 on
         c1.gend = t2.[source]
   where
      c1.found = 0 
      and ':'+path+':' not like '%:'+cast(t2.record as varchar(31))+':%'
),
cte3 as -- pick only closed loops
(
   select
      *
   from
      cte2 a
   where exists 
   (
      select * 
      from cte2 b 
      where b.found = 1 and b.path like '%'+a.path+'%'
   )
)
,
cte4 as -- pick distinct paths
(
   select * 
   from cte3 ca
   where not exists (select * from cte3 cb where cb.path LIKE ca.path+'%' AND ca.path <> cb.path)
) 
select path from cte4 -- but there are still more duplicate paths, which can be eliminated.

drop table #tmpTable1;
drop table #tmpTable2;

mchandrav
Starting Member

7 Posts

Posted - 2011-05-27 : 02:28:39

Hi Sunitha and Lamprey

I really thank you for your valuable input.

Lamprey - I am still playing around with your query as I have some syntax issues (Using SQL Server).

Sunitha - I think we are close, just need a little tweak to the query. So, this is the result of your query.

1:3:2:1
10:9:1:3:11:10
3:2:1:3
3:11:10:9:1:3
9:1:3:11:10:9
1:3:11:10:9:1
2:1:3:2
11:10:9:1:3:11

It pulls the record int from the relations table. Is it possible the get the results in this way.

1. A1.Name --> A2.Name : A3.Name --> A1.Name : A2.Name --> A3.Name : A1.Name --> A2.Name
2. A11.Name --> A12.Name : A2.Name --> A11.Name : A1.Name --> A2.Name : A3.Name --> A1.Name : A12.Name --> A2.Name : A11.Name --> A12.Name

Basically interpreting the records from tmptable2 with the actual names stored in tmptable1.

Thanks
Mahesh

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-05-27 : 07:40:07

If you are not concerned about the duplicate paths, getting the relationships as names instead of id's is only a minor change. See below.


create table dbo.#tmpTable1(record int, [name] varchar(32), [guid] varchar(32));
create table dbo.#tmpTable2(record int, [source] varchar(32), [destination] varchar(32));

insert into #tmpTable1 values ('1',' A1.Name',' A1.GUID');
insert into #tmpTable1 values ('2',' A2.Name',' A2.GUID');
insert into #tmpTable1 values ('3',' A3.Name',' A3.GUID');
insert into #tmpTable1 values ('4',' A4.Name',' A4.GUID');
insert into #tmpTable1 values ('5',' A5.Name',' A5.GUID');
insert into #tmpTable1 values ('6',' A8.Name',' A8.GUID');
insert into #tmpTable1 values ('7',' A9.Name',' A9.GUID');
insert into #tmpTable1 values ('8',' A0.Name',' A0.GUID');
insert into #tmpTable1 values ('9',' A11.Name',' A11.GUID');
insert into #tmpTable1 values ('10',' A12.Name',' A12.GUID');


insert into #tmpTable2 values ('1',' A1.GUID',' A2.GUID');
insert into #tmpTable2 values ('2',' A2.GUID',' A3.GUID');
insert into #tmpTable2 values ('3',' A3.GUID',' A1.GUID');
insert into #tmpTable2 values ('4',' A2.GUID',' A3.GUID');
insert into #tmpTable2 values ('6',' A2.GUID',' A8.GUID');
insert into #tmpTable2 values ('7',' A8.GUID',' A9.GUID');
insert into #tmpTable2 values ('8',' A9.GUID',' A0.GUID');

-- A parallel path. If such things could be ruled out, the code
-- could have been much simpler.
insert into #tmpTable2 values ('9',' A2.GUID',' A11.GUID');
insert into #tmpTable2 values ('10',' A11.GUID',' A12.GUID');
insert into #tmpTable2 values ('11',' A12.GUID',' A3.GUID');   


with cte1 AS -- cte1: remove all duplicates 
(
   select
      min(record) as record, 
      [source],
      [destination] 
   from 
      #tmpTable2
   group by 
      [source],[destination]
),
cte2(record,gstart,gend,cstart,found,pathid,path) as -- cte2: find all paths
(
   select
      c1.record,
      [source],
      [destination],
      [source],
      0,
      row_number() over( order by (select null)),
      cast(t1.Name as varchar(max))
   from
      cte1 c1
      INNER JOIN #tmpTable1 t1 ON t1.[guid] = c1.[source]
      
   union all
   
   select
      t2.record,
      t2.[source],
      t2.[destination],
      c1.cstart,
      case when c1.cstart = t2.destination then 1 else 0 end,
      c1.pathid,
      cast(t1.Name as varchar(31))+'->'+c1.path +
         case when c1.cstart = t2.destination then '->'+cast(t1.Name as varchar(31)) else '' END
      --cast(t2.record as varchar(31))+'->'+c1.path
   from
      cte1 t2
      inner join cte2 c1 on
         c1.gend = t2.[source]
      INNER JOIN #tmpTable1 t1 ON t1.[guid] = c1.gend
   where
      c1.found = 0 
      and '->'+path+'->' not like '%->'+cast(t1.Name as varchar(31))+'->%'
),
cte3 as -- pick only closed loops
(
   select
      *
   from
      cte2 a
   where exists 
   (
      select * 
      from cte2 b 
      where b.found = 1 and b.path like '%'+a.path+'%'
   )
)
,
cte4 as -- pick distinct paths
(
   select * 
   from cte3 ca
   where not exists (select * from cte3 cb where cb.path LIKE ca.path+'%' AND ca.path <> cb.path)
) 
select path from cte4 -- but there are still more duplicate paths, which can be eliminated.

drop table #tmpTable1;
drop table #tmpTable2;

mchandrav
Starting Member

7 Posts

Posted - 2011-05-27 : 14:20:25

Hi Sunita

Thanks a lot. That query worked.

Regards
Mahesh

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources