Finding Duplicates.. sort of - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2005 Forums

Transact-SQL (2005)

Finding Duplicates.. sort of

Author

Topic

ShooterJ07
Starting Member

17 Posts

Posted - 2010-01-12 : 09:26:40

For the sake of simplicity, I'll use a generic example.

I have one table - call it Students. Its fields are:
- StudentID (Unique, primary key)
- FirstName
- LastName
- DOB

I need to find the students that have the same FirstName, LastName, and DOB, but different StudentID's. Since StudentID is unique, I think I can use this:

SELECT DISTINCT(LastName), FirstName, DOB FROM Students GROUP BY Lastname, FirstName, DOB HAVING COUNT(*) > 1

However, I need StudentID to be included in my results. I know this should be very simple, but could anyone point me in the right direction?

--------------------------

Example Table:

StudentID, FirstName, LastName, DOB
1,John,Doe,1/1/2010
2,John,Doe,1/1/2010
3,John,Doe,5/5/2005
4,Jane,Smith,1/1/2010

Query should return:

StudentID, FirstName, LastName, DOB
1,John,Doe,1/1/2010
2,John,Doe,1/1/2010

DP978
Constraint Violating Yak Guru

269 Posts

Posted - 2010-01-12 : 09:44:35

[code]Declare @Students table (StudentID Int, FirstName Varchar(20), LastName Varchar(20), DOB Datetime)
Insert @Students

Select 1,'John','Doe','1/1/2010' Union ALL
Select 2,'John','Doe','1/1/2010' Union ALL
Select 3,'John','Doe','5/5/2005' Union ALL
Select 4,'Jane','Smith','1/1/2010'

Select a.* from @Students a Inner Join
( Select FirstName, LastName, DOB, Count(*) as cnt
from @Students
Group by
FirstName, LastName, DOB
) b
ON a.firstname = b.firstname and a.LastName = b.LastName and a.DOB = b.DOB
Where b.cnt > 1[/code]

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2010-01-12 : 09:44:43

select t1.* from Students as t1
inner join
(
SELECT LastName, FirstName, DOB FROM Students
GROUP BY Lastname, FirstName, DOB HAVING COUNT(*) > 1
) as t2
on t1.LastName=t2.LastName and t1.FirstName=t2.FirstName and t1.DOB=t2.DOB

Madhivanan

Failing to plan is Planning to fail

RickD
Slow But Sure Yak Herding Master

3608 Posts

Posted - 2010-01-12 : 09:46:39

[code]
SELECT b.StudentId, a.LastName, a.FirstName, a.DOB
FROM (
SELECT DISTINCT LastName, FirstName, DOB
FROM Students GROUP BY Lastname, FirstName, DOB HAVING COUNT(*) > 1) a
Join Students b
ON b.LastName = a.LastName
AND b.FirstName = a.FirstName
AND b.DOB = a.DOB
[/code]
Bah, I should really refresh before posting..

DP978
Constraint Violating Yak Guru

269 Posts

Posted - 2010-01-12 : 09:59:39

Rick, I do the same thing all the time (I'm usually beat by the regulars)! and grats Madi on 18,500 posts! thats crazy... heh

ShooterJ07
Starting Member

17 Posts

Posted - 2010-01-12 : 10:15:05

Makes complete sense. Thanks everyone.

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2010-01-13 : 01:07:12

quote:
Originally posted by DP978

Rick, I do the same thing all the time (I'm usually beat by the regulars)! and grats Madi on 18,500 posts! thats crazy... heh

. This time you were faster

Madhivanan

Failing to plan is Planning to fail

Kristen
Test

22859 Posts

Posted - 2010-01-13 : 02:38:29

"I should really refresh before posting"

nah, three subtly different answers here, always interesting to see how different people would tackle it.

DP978: "Where b.cnt > 1" - I think that's going to pull more rows from the inner SELECT (to then discard in the outer select), and thus take more time than the HAVING approach, but I aint tested it, is the optimiser smart enough to make them the same?

RickD: "SELECT DISTINCT " - I reckon the DISTINCT is redundant, given the GROUP BY. Or am I missing something?

Madhi: A+

Shocking formatting though, were you in a hurry?

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2010-01-13 : 03:50:39

quote:
Originally posted by Kristen

"I should really refresh before posting"

nah, three subtly different answers here, always interesting to see how different people would tackle it.

DP978: "Where b.cnt > 1" - I think that's going to pull more rows from the inner SELECT (to then discard in the outer select), and thus take more time than the HAVING approach, but I aint tested it, is the optimiser smart enough to make them the same?

RickD: "SELECT DISTINCT " - I reckon the DISTINCT is redundant, given the GROUP BY. Or am I missing something?

Madhi: A+ Shocking formatting though, were you in a hurry?

I format at the Front end

Madhivanan

Failing to plan is Planning to fail

RickD
Slow But Sure Yak Herding Master

3608 Posts

Posted - 2010-01-13 : 05:36:12