Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 Update Duplicate Record Issue

Author  Topic 

qjam
Starting Member

19 Posts

Posted - 2007-12-11 : 00:50:42
Hi all,

I am trying to write an update statement which will merge two employee records into one. I am trying to run an update statement on one of the many tables attached to the employee table but the statement is stopping because it results in some duplicate records.

EmpCosts
EmpNum (PK)
CostDate (PK)
TotalCost

Sample Records
22, 12/12/2005, 45.00
15, 01/11/2005, 33.00
15, 12/12/2005, 23.00

If I run:

UPDATE EmpCosts
SET EmpNum = '22'
WHERE EmpNum = '15'

I get a primary key error because employee '22' already has an EmpCost record on the 12/12/2005.

This error stops all the other records from updating. Is there an option to allow all the non-duplicate records to be updated, and skip the duplicates?

As I will have to repeat this statement with over 50 tables, the simpler the option the better.

Thanks for your help!

nathans
Aged Yak Warrior

938 Posts

Posted - 2007-12-11 : 01:15:23
Are you doing this update in a loop / procedural operation? What do you mean by "This error stops all the other records from updating."

Is this a one-time 15 --> 22 update? Or is there another table that holds the source data? If so, show us the whole picture so we can show you a set-based solution to this problem.

With the given info Ill throw this out there:


declare @EmpCosts table (EmpNum int, CostDate datetime, TotalCost decimal(10,2))
insert into @EmpCosts
select 22, '12/12/2005', '45.00' union all
select 15, '01/11/2005', '33.00' union all
select 15, '12/12/2005', '23.00'

declare @EmpNum_from int,
@EmpNum_to int

select @EmpNum_from = 15,
@EmpNum_to = 22


select * from @EmpCosts

update a
set EmpNum = @EmpNum_to
from @EmpCosts a
left
join @EmpCosts b on
b.EmpNum = @EmpNum_to and
a.CostDate = b.CostDate
where a.EmpNum = @EmpNum_from and
b.EmpNum is null

select * from @EmpCosts


Nathan Skerl
Go to Top of Page

ayamas
Aged Yak Warrior

552 Posts

Posted - 2007-12-11 : 01:21:35
Of course it will give an error
With the sample data you provided the composite primary key of
22, 12/12/2005
15, 12/12/2005
will be duplicated as you are writing the update statement to update 22 with 15.
But dates for both the records are same.So in turn there will be duplicate records.I mean the
updated records (theoratically) after u run the update statement will be
22, 12/12/2005, 45.00
22, 12/12/2005, 23.00
which are duplicates.

Regards


Go to Top of Page

qjam
Starting Member

19 Posts

Posted - 2007-12-11 : 01:57:25
Basically, what I am trying to do is merge two employee records (which have other tables such as Costs, Sales, Training, etc linked to the employee table through the the EmpNum) into one record by changing the EmpNum in each table from the old employee's EmpNum to the new employee's EmpNum. Once all the old employee's cost, sales, etc records have been 'redirected' to the new employee, I will delete old employee record.

Ideally I would like to do something like this

For each employee related table
  UPDATE tablename
  SET EmpNum = 'NewEmpNum'
  WHERE EmpNum = 'OldEmpNum'
Loop

This would be easy as all I would have to do is specify the Old and New EmpNum, as well as a list of all the employee related tables which I need to change.

However in the possibility of a duplicate record existing, as shown in my first post, the entire update statement fails. To extend the previous example, the old employee could have 50 EmpCosts records with only one duplicate, but the other 49 non-duplicate records cannot be copied because of the update statement fails. It is more than fine for the duplicate to not come across because in the vast majority of cases, it will hold exactly the same information for both employees.

What I require is some way to ignore the duplicates or the error and make the update statement continue for all the other entries which can be updated.

Thanks for your help.



Go to Top of Page

ayamas
Aged Yak Warrior

552 Posts

Posted - 2007-12-11 : 02:34:18
You can set SET XACT_ABORT OFF
which will just rollback the record where the error occurs but will continue with the rest of transaction or
modify the update statement to
UPDATE tablename t1
SET EmpNum = 'NewEmpNum'
WHERE not exists (select Empnum from tablename t2 where t2.empnum=t1.empnum)
Go to Top of Page

nathans
Aged Yak Warrior

938 Posts

Posted - 2007-12-11 : 14:27:15
instead of processing through PK violations, why not only update the records that will not throw the error?



Nathan Skerl
Go to Top of Page

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2007-12-11 : 16:15:43
I'm not 100% what your goal is, but can you add a new coulmn (NewEmpID or something). Update that new column as appropriate, drop the PK and old column. Then rename the new column and recreate the index? Or even creat a whole new table, load it, drop the old table and rename the new table?
Go to Top of Page

qjam
Starting Member

19 Posts

Posted - 2007-12-13 : 00:30:18
SET XACT_ABORT OFF is close but it will rollback the entire update statement rather than the individual records which have primary key violations, and allowing the others records to be updated.

The other option I found was the IGNORE_DUP_KEY option on an index, which professes to skip any records with errors on a BULK INSERT, however I don't think you can set this on a primary key. Ideally I could temporarily set this option before running the update, then turn it off after, however I'm not sure how to do this correctly.

I would like to avoid doing anything that would involve changing the structure of the database such as dropping a table or primary key as I am unsure of the implications on other applications which use the database.

It's looking like this is not possible, instead I will have to write a where clause which only selects the non-duplicate records, however I would have to do this for each table (over 50 and growing), each with their own primary keys so the statement will have to be written specifically for that table. I suppose I'm trying to cheat!

Go to Top of Page
   

- Advertisement -