Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 Eliminating duplicates puzzle

Author  Topic 

Abu-Dina
Posting Yak Master

206 Posts

Posted - 2008-06-26 : 09:36:24
Greetings all,

I've been battling with this for the last couple of hours but to no avail. I have the following record set:


create table #Test (
Id int identity(1, 1)
,Address1 varchar(50)
,Address2 varchar(50)
,Address3 varchar(50)
,AddressFlag char(1)
,AddressScore tinyint
)

insert into #Test(Address1, Address2, Address3, AddressFlag, AddressScore)
select 'AAA', 'BBB', 'CCC', 'Y', 4 union all
select 'AAA', 'BBB', 'CCC', 'Y', 5 union all
select 'AAA', 'BBB', 'CCC', 'N', 5 union all
select 'AAA', 'BBB', 'CCC', 'X', 7 union all
select 'DDD', 'EEE', 'FFF', 'N', 5 union all
select 'DDD', 'EEE', 'FFF', 'N', 5 union all
select 'DDD', 'EEE', 'FFF', 'X', 6 union all
select 'DDD', 'EEE', 'FFF', 'B', 6 union all
select 'DDD', 'EEE', 'FFF', 'B', 7 union all
select 'HHH', 'III', 'JJJ', 'Y', 7 union all
select 'HHH', 'III', 'JJJ', 'Y', 7 union all
select 'HHH', 'III', 'JJJ', 'Y', 8


As you can see, records with ID = 1, 2, 3 and 4 have the the same Address1, Address2 and Address3. That constitutes a duplicate and so I want to remove two and keep one of the records.

The rule for removing the duplicates is dependant on the two remaining columns.

If AddressFlag = Y then keep that record and delete other ones.
If the AddressFlag is also the same then keep only the record with higherst AddressScore.

If AddressScor is also the same then keep only record with minimum Id.

Now if ou notice, record with Ids 5, 6, 7, 8 and 9 have no Y in the Address flag. In this case the one to keep is B because it takes precendence over N X. Basically the precedence for the AddressFag is Y then B then N then X.

Hope you can help me with this.

Regards.

khtan
In (Som, Ni, Yak)

17689 Posts

Posted - 2008-06-26 : 10:00:01
this is what you want ?
IF object_id('tempdb..#Test') IS NULL
BEGIN
CREATE TABLE #Test
(
Id int identity(1, 1)
,Address1 varchar(10)
,Address2 varchar(10)
,Address3 varchar(10)
,AddressFlag CHAR(1)
,AddressScore tinyint
)

truncate TABLE #Test
INSERT INTO #Test(Address1, Address2, Address3, AddressFlag, AddressScore)
SELECT 'AAA', 'BBB', 'CCC', 'Y', 4 UNION ALL
SELECT 'AAA', 'BBB', 'CCC', 'Y', 5 UNION ALL
SELECT 'AAA', 'BBB', 'CCC', 'N', 5 UNION ALL
SELECT 'AAA', 'BBB', 'CCC', 'X', 7 UNION ALL
SELECT 'DDD', 'EEE', 'FFF', 'N', 5 UNION ALL
SELECT 'DDD', 'EEE', 'FFF', 'N', 5 UNION ALL
SELECT 'DDD', 'EEE', 'FFF', 'X', 6 UNION ALL
SELECT 'DDD', 'EEE', 'FFF', 'B', 6 UNION ALL
SELECT 'DDD', 'EEE', 'FFF', 'B', 7 UNION ALL
SELECT 'HHH', 'III', 'JJJ', 'Y', 7 UNION ALL
SELECT 'HHH', 'III', 'JJJ', 'Y', 7 UNION ALL
SELECT 'HHH', 'III', 'JJJ', 'Y', 8
END

DELETE d
FROM (
SELECT *, row_no = row_number() OVER (PARTITION BY Address1, Address2, Address3
ORDER BY CASE AddressFlag
WHEN 'Y' THEN 1
WHEN 'B' THEN 2
WHEN 'N' THEN 3
WHEN 'X' THEN 4
END, AddressScore, Id)
FROM #Test
WHERE AddressFlag <> 'Y'
) d
WHERE row_no <> 1

SELECT *
FROM #Test

/*

Id Address1 Address2 Address3 AddressFlag AddressScore
----------- ---------- ---------- ---------- ----------- ------------
1 AAA BBB CCC Y 4
2 AAA BBB CCC Y 5
3 AAA BBB CCC N 5
8 DDD EEE FFF B 6
10 HHH III JJJ Y 7
11 HHH III JJJ Y 7
12 HHH III JJJ Y 8

(7 row(s) affected)
*/



KH
[spoiler]Time is always against us[/spoiler]

Go to Top of Page

khtan
In (Som, Ni, Yak)

17689 Posts

Posted - 2008-06-26 : 10:01:33
oh . . this thread is continue from http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=105524


KH
[spoiler]Time is always against us[/spoiler]

Go to Top of Page
   

- Advertisement -