Duplicate Employee Rows - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

Transact-SQL (2000)

Duplicate Employee Rows

Author

Topic

Looper
Yak Posting Veteran

68 Posts

Posted - 2007-11-26 : 08:54:24

I am trying to structure a stored procedure to do the following:

I have an employee table and an employment table. Employees are linked to employments by employeeId from employee table. Employees can have more than one employment.

The problem is that we have duplicate employee records within Employee table.

I need to write a script that will determine which employee record has the highest Employee Number. For each of the other duplicate records change the employments to reflect the employeeId of the record with the highest Employee Number. Then remove the employee record(s) with the lower Employee Number(s).

Can anyone suggest a good way of doing this.

RickD
Slow But Sure Yak Herding Master

3608 Posts

Posted - 2007-11-26 : 12:42:50

[code]
select EmployeeNo, count(*)
from employee
group by EmployeeNo
having count(*) > 1
[/code]
This will give you employees with more than 1 record.

TG
Master Smack Fu Yak Hacker

6065 Posts

Posted - 2007-11-26 : 16:20:04

Looper, are you saying that you have multiple "John Smith"s with different employeeNos - or as RickD suggests, you somehow have duplicate employeeNos?

A more typical problem is the former. If that is your case then you first need to decide what columns constitute a "duplicate" row. Let's assume for a minute that it is [FirstName] + [LastName]. Of course a large database is likely to have 2 different people with the same LastName and FirstName so that is probably not a good candidate key. But for the sake of argument if it was a good cadidate for a logical unique key and both firstname and lastname do not allow nulls - you could:


THIS CODE IS UNTESTED - plus you don't want to assume that the same firstname+lastname is the same person.

--first update employement to the max employeeid for each person
update emp set
	emp.employeeid = d.MaxEmployeeID
from	(--get dupe employeess
	select isNull(Firstname,'') + isNull(LastName,''), max(employeeid) as MaxEmployeeid
	from	employee
	group by isNull(Firstname,'') + isNull(LastName,'')
	having count(*) > 1
	) as d
join	employee e
		on emp.firstname = d.firstname 
		and emp.lastname = d.lastname
join	employment emp 
		on emp.firstname = e.firstname 
		and emp.lastname = e.lastname
		and emp.Employeeid < e.employeeID

--get rid of the dupes from employee
delete e
from	(--get dupe employeess
	select isNull(Firstname,'') + isNull(LastName,''), max(employeeid) as MaxEmployeeid
	from	employee
	group by isNull(Firstname,'') + isNull(LastName,'')
	having count(*) > 1
	) as d
join	employee e 
		on e.firstname = d.firstname 
		and e.lastname = d.lastname
		and e.Employeeid < d.MaxEmployeeID

Be One with the Optimizer
TG

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources