| Author |
Topic  |
|
|
GarDebs2010
Starting Member
4 Posts |
Posted - 09/27/2012 : 15:25:10
|
This one has had me stumped for the last few hours and at this stage I think I need some help...
I need to compare multiple groups from a single table and to identify where items listed in col B match. For example: -
Col A...............Col B John................Apple John................Orange John................Banana Mary................Orange Mary................Strawberry David...............Apple David...............Orange David...............Banana I want 'John' and 'David' returned because their items in col B match. Hope this makes sense! Thanks in advance! G |
|
|
sunitabeck
Flowing Fount of Yak Knowledge
5152 Posts |
Posted - 09/27/2012 : 15:43:47
|
Probably not the most elegant or most efficient, (and probably a little silly) but here is one way:CREATE TABLE #tmp (cola VARCHAR(32), colb VARCHAR(32))
INSERT INTO #tmp VALUES
('John','Apple'),
('John','Orange'),
('John','Banana'),
('Mary','Orange'),
('Mary','Strawberry'),
('David','Apple'),
('David','Orange'),
('David','Banana');
;WITH cte AS
(
SELECT
a.cola,
b.colbs
FROM
(SELECT DISTINCT cola FROM #tmp) a
CROSS APPLY
(
SELECT colb AS [text()] FROM #tmp c
WHERE c.cola = a.cola
ORDER BY c.colb
FOR XML PATH('')
) b(colbs)
)
SELECT
a.cola,
b.cola
FROM
cte a INNER JOIN cte b ON a.colbs = b.colbs AND a.cola <> b.cola
AND a.cola < b.cola;
DROP TABLE #tmp; |
Edited by - sunitabeck on 09/27/2012 15:52:27 |
 |
|
|
bitsmed
Yak Posting Veteran
Denmark
98 Posts |
Posted - 09/27/2012 : 16:34:49
|
Alternative to sunitabeck's solution
select a.col_a
from tablename as a
left outer join tablename as b
on b.col_b=a.col_b
and b.col_a<>a.col_a
group by a.col_a
having sum(case when b.col_b is null then 1 else 0 end)=0
Edit: my above sql attempt, did only work with the supplied test data, so I have rewritten the sql to fit all scenarios (I hope)
select col_a
from (select col_a
,count(col_b) col_b_count
from tablename
group by col_a
) as a
inner join (select a.col_a as col_a1
,b.col_a as col_a2
,count(b.col_b) col_b_count
from tablename as a
inner join tablename as b
on b.col_a<>a.col_a
and b.col_b=a.col_b
group by a.col_a
,b.col_a
) as b
on b.col_a1=a.col_a
where a.col_b_count=b.col_b_count |
Edited by - bitsmed on 09/27/2012 17:29:21 |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
India
47081 Posts |
Posted - 09/27/2012 : 17:37:45
|
quote: Originally posted by GarDebs2010
This one has had me stumped for the last few hours and at this stage I think I need some help...
I need to compare multiple groups from a single table and to identify where items listed in col B match. For example: -
Col A...............Col B John................Apple John................Orange John................Banana Mary................Orange Mary................Strawberry David...............Apple David...............Orange David...............Banana I want 'John' and 'David' returned because their items in col B match. Hope this makes sense! Thanks in advance! G
declare @test table
(
ColA varchar(100),
ColB varchar(100)
)
insert @test
values('John','Apple'),
('John','Orange'),
('John','Banana'),
('Mary','Orange'),
('Mary','Strawberry'),
('David','Apple'),
('David','Orange'),
('David','Banana')
;With Temp
AS
(
SELECT ColA,
ColB,
STUFF ((SELECT ','+ ColB FROM @test WHERE ColA = t.ColA ORDER BY ColB FOR XML PATH('')),1,1,'') AS ColBList
FROM @test t
)
SELECT t.ColA,t.ColB
FROM Temp t
INNER JOIN (SELECT ColBList
FROM Temp
GROUP BY ColBList
HAVING COUNT(DISTINCT ColA) > 1
)t1
ON t1.ColBList = t.ColBList
output
----------------------------
ColA ColB
----------------------------
John Apple
John Orange
John Banana
David Apple
David Orange
David Banana
------------------------------------------------------------------------------------------------------ SQL Server MVP http://visakhm.blogspot.com/
|
Edited by - visakh16 on 09/27/2012 17:49:40 |
 |
|
|
sunitabeck
Flowing Fount of Yak Knowledge
5152 Posts |
Posted - 09/28/2012 : 07:34:04
|
If you can put up with the slight possibility (albeit very very slight) that binary_checksums may be the same for different strings or that lightning might strike and two different sets of binary_checksums might add up to the same sum, the following may be a little bit faster:
CREATE TABLE #tmp (cola VARCHAR(32), colb VARCHAR(32))
INSERT INTO #tmp VALUES
('John','Apple'),
('John','Orange'),
('John','Banana'),
('Mary','Orange'),
('Mary','Strawberry'),
('David','Apple'),
('David','Orange'),
('David','Banana');
;WITH cte as
(
SELECT
cola,
SUM(BINARY_CHECKSUM(colb)) AS bcs
FROM
#tmp
GROUP BY cola
)
SELECT
a.cola
FROM
cte A
WHERE EXISTS
(
SELECT 1 FROM cte b
WHERE b.bcs = a.bcs
GROUP BY bcs HAVING COUNT(*) > 1
)
|
 |
|
| |
Topic  |
|
|
|