SQL Server Forums
Profile | Register | Active Topics | Members | Search | Forum FAQ
 
Register Now and get your question answered!
Username:
Password:
Save Password
Forgot your Password?

 All Forums
 SQL Server 2012 Forums
 Transact-SQL (2012)
 Deduplicating a List
 New Topic  Reply to Topic
 Printer Friendly
Author Previous Topic Topic Next Topic  

newnoise
Starting Member

3 Posts

Posted - 06/24/2013 :  19:20:00  Show Profile  Reply with Quote
I have a table with 3 columns: unique ID, spouse ID, and a market rating. I'm trying to figure out how to deduplicate the list so that only one spouse per married couple appears in the list. Here's an example:

ID SpouseID Rating
1 NULL High
2 5 Low
3 4 NULL
4 3 Low
5 2 NULL
6 NULL NULL
7 10 High
8 15 Low
9 22 NULL
10 7 NULL


I want to come up with code that will select only one member per couple included in the list but to privilege the spouse that has a value in the rating column. One possible fix I came up with (that doesn't work) is the following:

select distinct
case
when ID > spouseID then ID
when spouseID > ID then spouseID
else ID
end
from table


If you look at the original data, you'll see that the problem with this code is that it will end up keeping ID 5 and 10, even though those cases have a NULL value in the Rating column. If I were to switch the sign of the code from "greater than" (>) to "less than" (<), the code would keep ID 3, even though it has a NULL value in the Rating column.

My question, then: How do I get this list of 10 cases to return the following IDs: 1, 2, 4, 6, 7, 8, and 9?

SQL Server 2012

MuMu88
Aged Yak Warrior

547 Posts

Posted - 06/24/2013 :  20:49:36  Show Profile  Reply with Quote
Just add a where clause

select distinct
case
when ID > spouseID then ID
when spouseID > ID then spouseID
else ID
end
from table
WHERE Rating IS NOT NULL


Edited by - MuMu88 on 06/24/2013 20:50:03
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

India
52325 Posts

Posted - 06/25/2013 :  01:20:08  Show Profile  Reply with Quote

SELECT t.ID,t.SpouseID,COALESCE(t.Rating,t1.Rating) AS Rating
FROM table t
OUTER APPLY (SELECT COUNT(*) AS Cnt
             FROM table 
             WHERE ID = t.SpouseID
             AND SpouseID = t.ID
             AND Rating IS NOT NULL
            )t1
WHERE COALESCE(Cnt,0)=0



------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page

MuMu88
Aged Yak Warrior

547 Posts

Posted - 06/25/2013 :  13:24:04  Show Profile  Reply with Quote
Is this what you want:


; WITH CTE AS 
(SELECT ID FROM @Temp where Rating is not NULL)
SELECT ID FROM @Temp T WHERE ID NOT IN(SELECT ID FROM  CTE) and 
	COALESCE(SpouseID, 0) NOT IN (SELECT ID FROM  CTE)
UNION SELECT ID from CTE;

Go to Top of Page

James K
Flowing Fount of Yak Knowledge

3719 Posts

Posted - 06/25/2013 :  14:01:15  Show Profile  Reply with Quote
quote:
Originally posted by visakh16


SELECT t.ID,t.SpouseID,COALESCE(t.Rating,t1.Rating) AS Rating
FROM table t
OUTER APPLY (SELECT COUNT(*) AS Cnt
             FROM table 
             WHERE ID = t.SpouseID
             AND SpouseID = t.ID
             AND Rating IS NOT NULL
            )t1
WHERE COALESCE(Cnt,0)=0



------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs


SELECT t.ID,t.SpouseID,COALESCE(t.Rating,t1.Rating) AS Rating
....
t1 does not have a rating column, and even if I add that to the inner query, the logic doesn't seem right.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

India
52325 Posts

Posted - 06/26/2013 :  02:52:42  Show Profile  Reply with Quote
ah...sorry...couldnt test it then


declare @temp table
(
ID int,
SpouseID int,
Rating varchar(10)
)

insert @temp
values(1, NULL, 'High'),
(2, 5, 'Low'),
(3, 4, NULL),
(4, 3, 'Low'),
(5, 2, NULL),
(6, NULL, NULL),
(7, 10, 'High'),
(8, 15, 'Low'),
(9, 22, NULL),
(10, 7, NULL),
(22,9,NULL)

SELECT t.ID,t.SpouseID,t.Rating
FROM @temp t
OUTER APPLY (SELECT COUNT(CASE WHEN Rating IS NOT NULL THEN 1 END) AS RCnt,
             COUNT(*) AS Cnt
             FROM @temp 
             WHERE ID = t.SpouseID
             AND SpouseID = t.ID
            )t1
WHERE (Rating IS NOT NULL AND COALESCE(RCnt,0)=0)
OR (ID > COALESCE(SpouseID,-1) AND COALESCE(Cnt,0) = 1)



----------------------------------
ID	SpouseID	Rating
----------------------------------
1	NULL	        High
2	5	        Low
4	3	        Low
5	2	        NULL
6	NULL	        NULL
7	10	        High
8	15	        Low
10	7	        NULL
22	9	        NULL



------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page

James K
Flowing Fount of Yak Knowledge

3719 Posts

Posted - 06/26/2013 :  09:20:05  Show Profile  Reply with Quote
quote:
Originally posted by visakh16

ah...sorry...couldnt test it then


declare @temp table
(
ID int,
SpouseID int,
Rating varchar(10)
)

insert @temp
values(1, NULL, 'High'),
(2, 5, 'Low'),
(3, 4, NULL),
(4, 3, 'Low'),
(5, 2, NULL),
(6, NULL, NULL),
(7, 10, 'High'),
(8, 15, 'Low'),
(9, 22, NULL),
(10, 7, NULL),
(22,9,NULL)

SELECT t.ID,t.SpouseID,t.Rating
FROM @temp t
OUTER APPLY (SELECT COUNT(CASE WHEN Rating IS NOT NULL THEN 1 END) AS RCnt,
             COUNT(*) AS Cnt
             FROM @temp 
             WHERE ID = t.SpouseID
             AND SpouseID = t.ID
            )t1
WHERE (Rating IS NOT NULL AND COALESCE(RCnt,0)=0)
OR (ID > COALESCE(SpouseID,-1) AND COALESCE(Cnt,0) = 1)



----------------------------------
ID	SpouseID	Rating
----------------------------------
1	NULL	        High
2	5	        Low
4	3	        Low
5	2	        NULL
6	NULL	        NULL
7	10	        High
8	15	        Low
10	7	        NULL
22	9	        NULL



------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs


Something doesn't seem right in that solution - for example, the (5,2) pair is repeated (as 5,2 and 2,5).

BTW, newnoise, MuMu's solution produces the right results without any duplicates.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

India
52325 Posts

Posted - 06/26/2013 :  09:34:16  Show Profile  Reply with Quote
quote:
Originally posted by James K

quote:
Originally posted by visakh16

ah...sorry...couldnt test it then


declare @temp table
(
ID int,
SpouseID int,
Rating varchar(10)
)

insert @temp
values(1, NULL, 'High'),
(2, 5, 'Low'),
(3, 4, NULL),
(4, 3, 'Low'),
(5, 2, NULL),
(6, NULL, NULL),
(7, 10, 'High'),
(8, 15, 'Low'),
(9, 22, NULL),
(10, 7, NULL),
(22,9,NULL)

SELECT t.ID,t.SpouseID,t.Rating
FROM @temp t
OUTER APPLY (SELECT COUNT(CASE WHEN Rating IS NOT NULL THEN 1 END) AS RCnt,
             COUNT(*) AS Cnt
             FROM @temp 
             WHERE ID = t.SpouseID
             AND SpouseID = t.ID
            )t1
WHERE (Rating IS NOT NULL AND COALESCE(RCnt,0)=0)
OR (ID > COALESCE(SpouseID,-1) AND COALESCE(Cnt,0) = 1)



----------------------------------
ID	SpouseID	Rating
----------------------------------
1	NULL	        High
2	5	        Low
4	3	        Low
5	2	        NULL
6	NULL	        NULL
7	10	        High
8	15	        Low
10	7	        NULL
22	9	        NULL



------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs


Something doesn't seem right in that solution - for example, the (5,2) pair is repeated (as 5,2 and 2,5).

BTW, newnoise, MuMu's solution produces the right results without any duplicates.



for the OP's sample data even this works

declare @temp table
(
ID int,
SpouseID int,
Rating varchar(10)
)

insert @temp
values(1, NULL, 'High'),
(2, 5, 'Low'),
(3, 4, NULL),
(4, 3, 'Low'),
(5, 2, NULL),
(6, NULL, NULL),
(7, 10, 'High'),
(8, 15, 'Low'),
(9, 22, NULL),
(10, 7, NULL)

SELECT t.ID,t.SpouseID,t.Rating
FROM @temp t
OUTER APPLY (SELECT COUNT(*) AS Cnt
             FROM @temp 
             WHERE ID = t.SpouseID
             AND SpouseID = t.ID
             AND Rating IS NOT NULL
            )t1
WHERE COALESCE(Cnt,0)=0


output
----------------------------------
ID	SpouseID	Rating
---------------------------------
1	NULL	High
2	5	Low
4	3	Low
6	NULL	NULL
7	10	High
8	15	Low
9	22	NULL






I was trying to include an additional possibility of duplicates with both NULLs

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

India
52325 Posts

Posted - 06/26/2013 :  10:02:09  Show Profile  Reply with Quote
This solution works for all scenarios ie including multiple NULL records as well as multiple ratings choosing one with highest rating


declare @temp table
(
ID int,
SpouseID int,
Rating varchar(10)
)

insert @temp
values(1, NULL, 'High'),
(2, 5, 'Low'),
(3, 4, NULL),
(4, 3, 'Low'),
(5, 2, NULL),
(6, NULL, NULL),
(7, 10, 'High'),
(8, 15, 'Low'),
(9, 22, NULL),
(10, 7, NULL),
(11,13,'Low'),
(13,11,'High'),
(22, 9, NULL)

SELECT ID,SpouseID,Rating
FROM (SELECT ROW_NUMBER() OVER (PARTITION BY CASE WHEN ID > SpouseID THEN ID ELSE SpouseID END,CASE WHEN ID > SpouseID THEN SpouseID ELSE ID END ORDER BY CASE Rating WHEN 'High' THEN 2 WHEN 'Low' THEN 1 ELSE 0 END  DESC) AS Seq,*
FROM @temp
)t
WHERE Seq=1
ORDER BY ID


output
---------------------------------
ID	SpouseID	Rating
---------------------------------
1	NULL	High
2	5	Low
4	3	Low
6	NULL	NULL
7	10	High
8	15	Low
9	22	NULL
13	11	High




------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page
  Previous Topic Topic Next Topic  
 New Topic  Reply to Topic
 Printer Friendly
Jump To:
SQL Server Forums © 2000-2009 SQLTeam Publishing, LLC Go To Top Of Page
This page was generated in 0.11 seconds. Powered By: Snitz Forums 2000