Author |
Topic |
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2008-06-05 : 22:02:59
|
ok folks, I need help.Here's the table and some sample data:declare @t table ( segment int, subsegment int, primary key (segment,subsegment))insert @t select 1,33 union all select 1,22 union allselect 2,33 union allselect 2,22 union allselect 3,33 union allselect 3,22 union allselect 3,44 What I want is to find all segments that are in some sense complete duplicates of other segments. a segment is made up of subsegments. a subsegment is not a segment - it's a completely different entity. this table is not hierarchical. So in the sample data above, segments 1 and 2 are dupes because they share exactly the same subsegments: 22 and 33. Segment 3 is not a dupe because it has a third subsegment the other two don't have: 44.when a duped segment is found, I need to know which other segment it duplicates. so an acceptable result set for the above sample data would be:segment partner------- -------1 2 this would also be fine:segment partner------- -------1 22 1 ps: i already posted this on dbforums - just broadening the audience a little. elsasoft.org |
|
dshelton
Yak Posting Veteran
73 Posts |
Posted - 2008-06-05 : 23:41:09
|
This works for me:SELECT a.segment, b.segment FROM @t aJOIN @t bON a.subsegment = b.subsegmentAND a.segment <> b.segmentJOIN (SELECT segment, COUNT(*) AS total FROM @t GROUP BY segment) AS cON a.segment = c.segmentJOIN (SELECT segment, COUNT(*) AS total FROM @t GROUP BY segment) AS dON b.segment = d.segmentGROUP BY a.segment, b.segment, c.total, d.totalHAVING c.total = COUNT(DISTINCT a.subsegment)AND d.total = COUNT(DISTINCT b.subsegment) |
|
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2008-06-06 : 00:37:56
|
perfect, now that is impressive!In my case, @t is a permanent table with 4535149 rows and your query finishes in only 17 sec! what's more it appears to do the right thing!If you want I'll happily give you a free license to SqlSpec for that awesome query. lemme know. elsasoft.org |
|
|
raja_saminathan
Starting Member
12 Posts |
Posted - 2008-06-06 : 03:00:54
|
Try this alsoDeclare @r intselect @r = segment from #t where subsegment =(select subsegment from #tgroup by subsegmenthaving count(subsegment )=1)SELECT Distinct a.segment, b.segment FROM #t aJOIN #t bON a.subsegment = b.subsegmentAND a.segment <> b.segmentwhere a.segment !=@R and b.segment !=@RRajesh |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-06-06 : 03:55:46
|
quote: Originally posted by jezemine just broadening the audience a little.
The audience is listening...This is my stab at it;WITH Yak(Segment, SubSegment)AS ( SELECT s.Segment, g.SubSegment FROM ( SELECT Segment FROM @t GROUP BY Segment ) AS s CROSS APPLY ( SELECT DISTINCT ',' + CAST(y.SubSegment AS VARCHAR(12)) FROM @t AS y WHERE y.Segment = s.Segment ORDER BY ',' + CAST(y.SubSegment AS VARCHAR(12)) FOR XML PATH('') ) AS g(Subsegment))SELECT y1.Segment, y2.Segment AS PartnerFROM Yak AS y1INNER JOIN Yak AS y2 ON y2.SubSegment = y1.SubSegmentWHERE y1.Segment < y2.Segment EDIT: No need for DISTINCT since subsegment is part of primary key. E 12°55'05.25"N 56°04'39.16" |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-06-06 : 04:17:43
|
A slight variation of Shelton'sSELECT t1.Segment, t2.Segment AS [Partner]FROM @t AS t1INNER JOIN @t AS t2 ON t2.SubSegment = t1.SubSegmentINNER JOIN ( SELECT Segment, COUNT(*) AS Items FROM @t GROUP BY Segment ) AS x ON x.Segment IN (t1.Segment, t2.Segment)WHERE t1.Segment < t2.SegmentGROUP BY t1.Segment, t2.SegmentHAVING MIN(x.Items) = MAX(x.Items) E 12°55'05.25"N 56°04'39.16" |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-06-06 : 04:21:42
|
Another variant:-SELECT MIN(t.segment),MAX(t.segment),LEFT(b.list,LEN(b.list)-1) as sslist FROM (select distinct segment from @t) tCROSS APPLY(SELECT cast(subsegment as varchar(5)) + ',' as [text()] FROM @t WHERE segment=t.segment FOR XML PATH(''))b(list)GROUP BY LEFT(b.list,LEN(b.list)-1)HAVING MAX(t.segment)<>MIN(t.segment) |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-06-06 : 04:39:17
|
Visakh, add these sample dataselect 4,33 union all select 4,22 union all E 12°55'05.25"N 56°04'39.16" |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-06-06 : 05:01:21
|
Ok. Added and solution amended;With CTE (segment,list) AS(SELECT t.segment,LEFT(b.list,LEN(b.list)-1) as sslist FROM (select distinct segment from @t) tCROSS APPLY(SELECT cast(subsegment as varchar(5)) + ',' as [text()] FROM @t WHERE segment=t.segment ORDER BY subsegment FOR XML PATH(''))b(list))SELECT * FROM CTE c1INNER JOIN CTE c2ON c2.list=c1.listand c2.segment <> c1.segment |
|
|
ayamas
Aged Yak Warrior
552 Posts |
Posted - 2008-06-06 : 05:25:14
|
Here is my try at this one.;with rollup(segment,subsegment)as( select segment,(select cast(subsegment as varchar(20))+ ',' from @t t1 where t1.segment=t.segment for xml path(''))as subsegmentfrom @t t group by segment)select t.segment,b.segment as partner,subsegment from rollup t cross apply( select segment,row_number() over(order by subsegment desc)as rowid from rollup where subsegment=t.subsegment)b where b.segment<>t.segment |
|
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2008-06-06 : 10:28:14
|
you guys are great. I'd use this for an interview question, but I think it would take the whole hour. Plus, it's not fair to use it if I couldn't do it myself, right? elsasoft.org |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-06-06 : 10:31:38
|
quote: Originally posted by jezemine you guys are great. I'd use this for an interview question, but I think it would take the whole hour. Plus, it's not fair to use it if I couldn't do it myself, right? elsasoft.org
No probs. Most interviewers ask only those questions for which they have flunked themselves |
|
|
Page47
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2008-06-06 : 11:15:51
|
Fun. How does this version stack up for performance?EDIT: blah, nevermind ... mine is pretty same as Peso's, but use different methods to materialize the list. (I wonder which is better? ... off to test now).declare @t2 table( s int, l varchar(max), primary key (s));with yak( s, ls, l)as( select segment as 's', min(subsegment) as 'ls', '.' + convert(varchar(max), min(subsegment)) as 'l' from @t group by segment union all select t.segment, t.subsegment as 'ls', y.l+'.'+convert(varchar(max),t.subsegment) as 'l' from @t t inner join yak y on t.segment = y.s and t.subsegment > y.ls and not exists ( select 1 from @t t1 where t.segment = t1.segment and t1.subsegment > y.ls and t.subsegment > t1.subsegment))insert into @t2select s, max(l)from yakgroup by sselect t1.s as 'segment', t2.s as 'partner'from @t2 t1 inner join @t2 t2 on t1.l = t2.l and t1.s <> t2.s Jayto here knows when |
|
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2008-06-06 : 11:22:56
|
I don't have time right now to compare all these solutions for perf on my larger table, but I promise I will in the next day or two! elsasoft.org |
|
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2008-06-06 : 12:12:57
|
Short and sweet:SELECT A.segment, B.segmentFROM @t AS AINNER JOIN @t AS B ON A.segment <> B.segmentGROUP BY A.segment, B.segmentHAVING POWER(COUNT(CASE WHEN A.subsegment = B.subsegment THEN 1 END),2) = COUNT(*) (and also very, very slow!) |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-06-06 : 12:43:36
|
quote: Originally posted by Arnold Fribble Short and sweet:SELECT A.segment, B.segmentFROM @t AS AINNER JOIN @t AS B ON A.segment <> B.segmentGROUP BY A.segment, B.segmentHAVING POWER(COUNT(CASE WHEN A.subsegment = B.subsegment THEN 1 END),2) = COUNT(*) (and also very, very slow!)
I like that |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-06-10 : 03:34:40
|
Doesn't really work with this sample dataselect 6,1 union all select 5,1 union all E 12°55'05.25"N 56°04'39.16" |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-06-10 : 03:39:54
|
Here is another approach with "partnergroups" instead of single partner.create table #t( segment int, subsegment int, primary key (segment,subsegment))insert #t select 6,1 union all select 5,1 union allselect 4,33 union all select 4,22 union all select 1,33 union all select 1,22 union allselect 2,33 union allselect 2,22 union allselect 3,33 union allselect 3,22 union allselect 3,44declare @sql varchar(max)set @sql = ''select @sql = @sql + ',' + quotename(subsegment)from ( select subsegment from #t group by subsegment ) AS qset @sql = stuff(@sql, 1, 1, '')set @sql = '-- peso 2select segment, partnergroupfrom ( select segment, partnergroup, count(*) over (partition by partnergroup) AS t from ( SELECT p.segment, rank() over (order by ' + @sql + ') as partnergroup FROM #t PIVOT ( MAX(subsegment) for subsegment in (' + @sql + ') ) AS p ) as r ) AS ywhere t > 1'exec(@sql)drop table #t E 12°55'05.25"N 56°04'39.16" |
|
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2008-06-10 : 05:19:09
|
how about this;:declare @t table ( segment int, subsegment int, primary key (segment,subsegment))insert @t select 1,33 union all select 1,22 union allselect 2,33 union allselect 2,22 union allselect 3,33 union allselect 3,22 union allselect 3,44 union allselect 4,33 union all select 4,22 union all select 5,30 union all select 5,25 union all select 6,33 union all select 6,22 declare @t1 table (segment int, prodSum int)insert into @t1select segment, sum(RowNum * subsegment) as prodSumfrom( SELECT ROW_NUMBER() OVER (PARTITION BY segment ORDER BY segment) as RowNum, segment, subsegment FROM @t) t1group by segmentselect t2.segment from ( select prodSum from @t1 group by prodSum having COUNT(*) > 1 ) t1 join @t1 t2 on t1.prodSum = t2.prodSumorder by t2.segment _______________________________________________Causing trouble since 1980Blog: http://weblogs.sqlteam.com/mladenpSpeed up SSMS development: www.ssmstoolspack.com <- version 1.0 out! |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-06-10 : 05:36:49
|
ORDER BY Subsegment to insure same order.But both approaches will fail. Include this sample data!select -1,88 union all E 12°55'05.25"N 56°04'39.16" |
|
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2008-06-10 : 05:48:24
|
-1? who on earth uses negative numbes for keys? point taken though_______________________________________________Causing trouble since 1980Blog: http://weblogs.sqlteam.com/mladenpSpeed up SSMS development: www.ssmstoolspack.com <- version 1.0 out! |
|
|
Next Page
|