Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 filtering out duplicate rows based on three column

Author  Topic 

anjali5
Posting Yak Master

121 Posts

Posted - 2014-11-25 : 13:46:21
Hi All,

I want to filter out the duplicate rows based on three columns. I got this quick query from Microsoft site to filter out the duplicate rows, but I am getting the result that filters out the non-duplicate one too. Below is the query


;With Temp as (
SELECT row_number() over (partition by [id],[p_date],
  order by [id],[p_date],[code]) as Row,
[id]
,[p_date]
,[code]

from
test
)
select * from Temp where Row = 1




I am getting the following results
[code]

1 NULL 2014-09-12 04:53:00 A
1 NULL 2014-09-12 05:26:00 P
1 NULL 2014-09-12 05:31:00 P

In the above case id is null, but in some rows id is not null . The above is obviously not duplicate. I am not sure what am I doing wrong.

any help will be appreciated.

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-11-25 : 14:12:06
I don't understand your post. It seems to contradict itself. Please show us sample data with the duplicates, and then using that same sample data show us what your query should output.

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

gbritton
Master Smack Fu Yak Hacker

2780 Posts

Posted - 2014-11-25 : 14:12:07
The query (which filters by two columns, not three) returns the unique rows, not the duplicates. Your results are the unique rows by id and p_date. It goes without saying that the rest of the rows in the table are duplicates of one of these three.
Go to Top of Page

anjali5
Posting Yak Master

121 Posts

Posted - 2014-11-25 : 14:27:57
below is the sample data



1 2014-09-03 04:48:00 NULL
2 2014-09-03 04:49:00 NULL
3 2014-09-03 04:50:00 NULL
4 2014-09-03 04:51:00 NULL
5 2014-09-03 04:52:00 NULL
NULL 2014-09-03 04:47:00 1234
NULL 2014-09-03 04:47:00 1234
NULL 2014-09-03 04:47:00 1234
NULL 2014-09-03 04:48:00 M


I only want these rows from above data

NULL 2014-09-03 04:47:00 1234
NULL 2014-09-03 04:47:00 1234
NULL 2014-09-03 04:47:00 1234


below is the insert statement

INSERT INTO [SportsStore].[dbo].[Test]
([ID]
,[p_date]
,[Code])
VALUES
(
NULL
,'2014-09-03 04:48:00'
,'M')


SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[Test](
[ID] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[p_date] [smalldatetime] NULL,
[Code] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]

GO
SET ANSI_PADDING OFF



Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-11-25 : 14:32:56
Since they are duplicates, do you need all 3 returned? If you just need 1, here you go:

select ID, p_date, Code, count(*)
from Test
group by ID, p_date, Code
having count(*) > 1

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

anjali5
Posting Yak Master

121 Posts

Posted - 2014-11-25 : 14:34:27
quote:
Originally posted by tkizer

Since they are duplicates, do you need all 3 returned? If you just need 1, here you go:

select ID, p_date, Code, count(*)
from Test
group by ID, p_date, Code
having count(*) > 1

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/



I need all three to be returned.
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-11-25 : 14:57:33
[code]
;With Temp
as
(
select ID, p_date, Code
from Test
group by ID, p_date, Code
having count(*) > 1
)
select Temp.*
from Temp
join Test
on COALESCE(Test.ID, '') = COALESCE(Temp.ID, '') and Temp.p_date = Test.p_date and Temp.Code = Test.Code

[/code]

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

anjali5
Posting Yak Master

121 Posts

Posted - 2014-11-26 : 01:28:19
quote:
Originally posted by tkizer


;With Temp
as
(
select ID, p_date, Code
from Test
group by ID, p_date, Code
having count(*) > 1
)
select Temp.*
from Temp
join Test
on COALESCE(Test.ID, '') = COALESCE(Temp.ID, '') and Temp.p_date = Test.p_date and Temp.Code = Test.Code



Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/



Thanks.
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-11-26 : 12:27:03


Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page
   

- Advertisement -