Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 Finding Duplicates.. sort of

Author  Topic 

ShooterJ07
Starting Member

17 Posts

Posted - 2010-01-12 : 09:26:40
For the sake of simplicity, I'll use a generic example.

I have one table - call it Students. Its fields are:
- StudentID (Unique, primary key)
- FirstName
- LastName
- DOB


I need to find the students that have the same FirstName, LastName, and DOB, but different StudentID's. Since StudentID is unique, I think I can use this:

SELECT DISTINCT(LastName), FirstName, DOB FROM Students GROUP BY Lastname, FirstName, DOB HAVING COUNT(*) > 1


However, I need StudentID to be included in my results. I know this should be very simple, but could anyone point me in the right direction?

--------------------------

Example Table:

StudentID, FirstName, LastName, DOB
1,John,Doe,1/1/2010
2,John,Doe,1/1/2010
3,John,Doe,5/5/2005
4,Jane,Smith,1/1/2010



Query should return:

StudentID, FirstName, LastName, DOB
1,John,Doe,1/1/2010
2,John,Doe,1/1/2010

DP978
Constraint Violating Yak Guru

269 Posts

Posted - 2010-01-12 : 09:44:35
[code]Declare @Students table (StudentID Int, FirstName Varchar(20), LastName Varchar(20), DOB Datetime)
Insert @Students

Select 1,'John','Doe','1/1/2010' Union ALL
Select 2,'John','Doe','1/1/2010' Union ALL
Select 3,'John','Doe','5/5/2005' Union ALL
Select 4,'Jane','Smith','1/1/2010'

Select a.* from @Students a Inner Join
( Select FirstName, LastName, DOB, Count(*) as cnt
from @Students
Group by
FirstName, LastName, DOB
) b
ON a.firstname = b.firstname and a.LastName = b.LastName and a.DOB = b.DOB
Where b.cnt > 1[/code]
Go to Top of Page

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2010-01-12 : 09:44:43
select t1.* from Students as t1
inner join
(
SELECT LastName, FirstName, DOB FROM Students
GROUP BY Lastname, FirstName, DOB HAVING COUNT(*) > 1
) as t2
on t1.LastName=t2.LastName and t1.FirstName=t2.FirstName and t1.DOB=t2.DOB


Madhivanan

Failing to plan is Planning to fail
Go to Top of Page

RickD
Slow But Sure Yak Herding Master

3608 Posts

Posted - 2010-01-12 : 09:46:39
[code]
SELECT b.StudentId, a.LastName, a.FirstName, a.DOB
FROM (
SELECT DISTINCT LastName, FirstName, DOB
FROM Students GROUP BY Lastname, FirstName, DOB HAVING COUNT(*) > 1) a
Join Students b
ON b.LastName = a.LastName
AND b.FirstName = a.FirstName
AND b.DOB = a.DOB
[/code]
Bah, I should really refresh before posting..
Go to Top of Page

DP978
Constraint Violating Yak Guru

269 Posts

Posted - 2010-01-12 : 09:59:39
Rick, I do the same thing all the time (I'm usually beat by the regulars)! and grats Madi on 18,500 posts! thats crazy... heh
Go to Top of Page

ShooterJ07
Starting Member

17 Posts

Posted - 2010-01-12 : 10:15:05
Makes complete sense. Thanks everyone.
Go to Top of Page

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2010-01-13 : 01:07:12
quote:
Originally posted by DP978

Rick, I do the same thing all the time (I'm usually beat by the regulars)! and grats Madi on 18,500 posts! thats crazy... heh


. This time you were faster

Madhivanan

Failing to plan is Planning to fail
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-01-13 : 02:38:29
"I should really refresh before posting"

nah, three subtly different answers here, always interesting to see how different people would tackle it.

DP978: "Where b.cnt > 1" - I think that's going to pull more rows from the inner SELECT (to then discard in the outer select), and thus take more time than the HAVING approach, but I aint tested it, is the optimiser smart enough to make them the same?

RickD: "SELECT DISTINCT " - I reckon the DISTINCT is redundant, given the GROUP BY. Or am I missing something?

Madhi: A+ Shocking formatting though, were you in a hurry?
Go to Top of Page

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2010-01-13 : 03:50:39
quote:
Originally posted by Kristen

"I should really refresh before posting"

nah, three subtly different answers here, always interesting to see how different people would tackle it.

DP978: "Where b.cnt > 1" - I think that's going to pull more rows from the inner SELECT (to then discard in the outer select), and thus take more time than the HAVING approach, but I aint tested it, is the optimiser smart enough to make them the same?

RickD: "SELECT DISTINCT " - I reckon the DISTINCT is redundant, given the GROUP BY. Or am I missing something?

Madhi: A+ Shocking formatting though, were you in a hurry?


I format at the Front end

Madhivanan

Failing to plan is Planning to fail
Go to Top of Page

RickD
Slow But Sure Yak Herding Master

3608 Posts

Posted - 2010-01-13 : 05:36:12
quote:
Originally posted by Kristen

RickD: "SELECT DISTINCT " - I reckon the DISTINCT is redundant, given the GROUP BY. Or am I missing something?


Nah, its not needed,,
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-01-13 : 08:13:03
"I format at the Front end "

Good answer. Top of class!!
Go to Top of Page
   

- Advertisement -