Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
Peter Dutch
Posting Yak Master
127 Posts |
Posted - 2003-10-23 : 09:15:36
|
| [code]CREATE TABLE ClockTimes ( Id INT IDENTITY(1,1), Badge int, ClockDate datetime, InOut CHAR(1))SET DATEFORMAT ymdINSERT ClockTimes (Badge, Clockdate, InOut) VALUES (1010, '2003-10-23 11:00:00','I')INSERT ClockTimes (Badge, Clockdate, InOut) VALUES (1010, '2003-10-23 11:01:00','I')INSERT ClockTimes (Badge, Clockdate, InOut) VALUES (1500, '2003-10-23 11:00:00','O')INSERT ClockTimes (Badge, Clockdate, InOut) VALUES (1610, '2003-10-23 11:15:00','O')INSERT ClockTimes (Badge, Clockdate, InOut) VALUES (1712, '2003-10-23 13:15:00','I')INSERT ClockTimes (Badge, Clockdate, InOut) VALUES (1712, '2003-10-23 13:15:00','O')INSERT ClockTimes (Badge, Clockdate, InOut) VALUES (1813, '2003-10-23 11:00:00','O')INSERT ClockTimes (Badge, Clockdate, InOut) VALUES (1912, '2003-10-23 11:15:00','O')INSERT ClockTimes (Badge, Clockdate, InOut) VALUES (1010, '2003-10-23 15:15:00','O')[/code]The above table is a simplified version of a table we have which registers work hours of our employees. It sometimes happens employees clock accidentally twice. I need to remove these 'duplicate records'. A duplicate record is defined as the same badgenumber clocks within a timeperiod of two minutes. In or out is not relevant. In the above sample data the first two records with bagde 1010 are duplicate (the last 1010 record isn't), and the records with badgenumber 1712 are duplicate.When I find a 'duplicate record' I need to remove the first record.I've been playing with this for a while now, and can't find a solution. Deleting duplicate records is easy, but that's not exactly what I need here.Anyone have some advice?Thanks in advance |
|
|
mohdowais
Sheikh of Yak Knowledge
1456 Posts |
Posted - 2003-10-23 : 10:29:20
|
Ok, I spent quite some time on this one, and the following solution seems to work *but* only with one duplicate. More than one duplicate and you will end up deleting the most recent records, but if you have only one duplicate it'll delete the oldest one.delete zfrom clocktimes z inner join( select id, badge, clockdate, inout, (select count(b.clockdate) from clocktimes b where a.badge = b.badge and a.id <> b.id and a.inout = b.inout and datediff(mi, a.clockdate, b.clockdate) between 0 and 2) as dups from clocktimes a) subqueryon z.id = subquery.idwhere subquery.dups = 1 If you run just the subquery, it will return the number of duplicates (within 2 minutes of that record), and the delete uses this subquery to delete all those records that have a duplicates. But you will see the problem with more than one duplicate - the first record will accurately show 2 duplicates, however, the other two records will show one duplicate each as well.Somebody else here could improvise upon this.Owais Make it idiot proof and someone will make a better idiot |
 |
|
|
Peter Dutch
Posting Yak Master
127 Posts |
Posted - 2003-10-24 : 03:11:55
|
| Woohoo!This seems to works great, and multiple duplicates is not really a problem for me.Thank you very much! |
 |
|
|
AndrewMurphy
Master Smack Fu Yak Hacker
2916 Posts |
Posted - 2003-10-24 : 10:43:36
|
| I presume however....you should be implementing 'specific rules' to cater for identifying the true duplicate...i.e...if persona clocks in @ 8.10am and @ 8.12am....are you legally obliged to always ensure you pay him/her from the 8.10am entry and not randomly from the 8.12am time?Likewise with the clock-out time....You've got a technological problem....but beware of the business/legal implications of any solution. |
 |
|
|
|
|
|
|
|