Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 HELP-Complicated Fixing Duplicate Problem

Author  Topic 

corey
Starting Member

13 Posts

Posted - 2004-12-16 : 18:36:18
I have a problem...the sql below works but doesn't do everything I want it to in one statement. Below, I describe what it really needs to do.

ANY and ALL help will be appreciated greatly.

This works, but instead of always giving precedence to the sequence field to the @master_id record, I would like to give precedence to the id that has the most recent timestamp in dbo.problemtable. Where it gets MORE complicated is that the dbo.problemtable has two records that need to be updated with the same sequence (seq)…the two records are almost identical except one record has a priority of 1 and the other has a priority of 2.

So, in the WHILE loop I need to…
1. find which id has the most recent time stamp in the problemtable and give the sequence precence to that record.
2. find the matching records in the problemtable and update them with the same sequence number
3. loop through all the records in the problemtable and update the id to the @master_id
4. then start the loop over for other records in the duplisttable.

Hopefully I have explained this well enough for someone to tell me how easy it is… I have thought about using FETCH and an ARRAY for the two records in the problemtable that need the same sequence…I just can’t get it to work.



BEGIN
-- declare needed variables
declare @total as int,
@counter as int,
@dup_id as varchar(10),
@master_id as varchar(10),
@max_dup_seq as int,
@max_master_seq as int,

-- set @counter to 0 for loop
SET @counter = 0

-- set the total to the number of records in fixduplist that failed due to the index
SET @total = (SELECT count(f.dup_id)
FROM dbo.duplisttable f
WHERE f.result = '10' and left(f.dup_id,1) = 'A')


-- start loop for all identified id’s
WHILE @counter < @total
BEGIN

SELECT @dup_id=f1.dup_id, @master_id=f1.master_id
FROM dbo.duplisttable f1
WHERE f1.result = '10' and left(f1.dup_id,1) = 'A'

UNION
SELECT f2.dup_id, f2.master_id
FROM dbo.fixdup f2
WHERE f2.result = '10' and left(f2.dup_id,1) = 'A' and
f2.dup_id = @dup_id and f2.master_id = @master_id


UNION
SELECT f2.dup_id, f2.master_id
FROM dbo. duplisttable f2
WHERE f2.result = '10' and left(f2.dup_id,1) = 'A' and
f2.dup_id = @dup_id and f2.master_id = @master_id


-- set @max_dup_seq variable to the max sequence number
SELECT @max_dup_seq=max(p.seq)
FROM dbo.problemtable p
WHERE p.id = @dup_id

SELECT @max_master_seq=max(p.seq)
FROM dbo.problemtable p
WHERE p.id = @master_id


-- this should never be null, if null then no update is needed
if (@max_dup_seq) is not null

BEGIN
-- we can go ahead and update
UPDATE dbo.problemtable
SET seq = seq + @max_dup_seq
WHERE id=@master_id

UPDATE dbo.problemtable
SET id = @master_id
WHERE id = @dup_id

END

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2004-12-17 : 07:32:33
Could you show us what the data looks like before you run this code, and what you want it to look like after...? I think it would be much easier for us to address the process with a better understanding of what you are working with.

Corey
Go to Top of Page

corey
Starting Member

13 Posts

Posted - 2004-12-17 : 11:50:59
Sure...and thank you for taking the time to really look at this code.

A000082 Records in ProblemTable need ID A000081, giving priority to A000082 to get highest sequence.

A000100 Records in ProblemTable need ID A000001, giving priority to A000001 to get highest sequence.

It DOES NOT MATTER if there are skips in the sequence field for records, just that the most recent based on timestamp gets the highest sequence number.

Once again Thank you and sorry...the formatting isn't quite working that well.

--FixDupList Table Would look much like this...(3 Records)
MasterID DupID MasterDesc DupDesc User DateSubmit UserSubmit Stamp Result
A000081 A000082 MasterDescription DupDescription Corey 12/16/2004 6:44:35 PM Corey 12/16/2004 6:44:43 PM 10
A000081 A000083 MasterDescription DupDescription Corey 12/16/2004 6:44:35 PM Corey 12/16/2004 6:44:43 PM 10
A000001 A000100 MasterDescription DupDescription Corey 12/16/2004 6:44:35 PM Corey 12/16/2004 6:44:43 PM 10


--The ProblemTable Might Look like This
ID Seq Info Priority UserID TimeStamp
A000081 1 <Binary> Detail corey 12/16/2004 6:41:06 PM
A000081 1 <Binary> Abbrev corey 12/16/2004 6:41:07 PM
A000082 1 <Binary> Detail corey 12/16/2004 6:44:11 PM
A000082 1 <Binary> Abbrev corey 12/16/2004 6:44:11 PM
A000001 2 <Binary> Detail corey 9/11/2003 11:01:35 AM
A000001 2 <Binary> Abbrev corey 9/11/2003 11:01:35 AM
A000100 1 <Binary> Detail corey 3/21/2003 2:29:59 PM
A000100 1 <Binary> Abbrev corey 3/21/2003 2:29:59 PM
A000100 2 <Binary> Detail corey 3/27/2003 4:32:53 PM
A000100 2 <Binary> Abbrev corey 3/27/2003 4:32:54 PM

Go to Top of Page

corey
Starting Member

13 Posts

Posted - 2004-12-17 : 12:06:58
After ...

As I stated...sequence number in problem table doesn't matter and doesn't have to fall in line with timestamp as long as most recent time stamp in records that update ID to MasterID from fixduplist get highest sequence number. And...I even though the stamps in the problem table with the same sequence have the same timestamp(they are often off very slightly), I cannot rely on that for the update hence a union / array or some sort?

--The ProblemTable Might Look like This
ID Seq Info Priority UserID TimeStamp
A000081 1 <Binary> Detail corey 12/16/2004 6:41:06 PM
A000081 1 <Binary> Abbrev corey 12/16/2004 6:41:07 PM
A000081 2 <Binary> Detail corey 12/16/2004 6:44:11 PM
A000081 2 <Binary> Abbrev corey 12/16/2004 6:44:11 PM
A000001 4 <Binary> Detail corey 9/11/2003 11:01:35 AM
A000001 4 <Binary> Abbrev corey 9/11/2003 11:01:35 AM
A000001 3 <Binary> Detail corey 3/21/2003 2:29:59 PM
A000001 3 <Binary> Abbrev corey 3/21/2003 2:29:59 PM
A000001 2 <Binary> Detail corey 3/27/2003 4:32:53 PM
A000001 2 <Binary> Abbrev corey 3/27/2003 4:32:54 PM
Go to Top of Page

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2004-12-23 : 12:05:16
Is this what you are looking for?


Declare @myTable table (id varchar(10), Seq int, Priority varchar(10), UserId varchar(10), TimeStamp datetime)
Insert Into @myTable
Select 'A000081', 1, 'Detail', 'corey', '12/16/2004 6:41:06 PM'
Union Select 'A000081', 1, 'Abbrev', 'corey', '12/16/2004 6:41:07 PM'
Union Select 'A000081', 1, 'Detail', 'corey', '12/16/2004 6:44:11 PM'
Union Select 'A000081', 1, 'Abbrev', 'corey', '12/16/2004 6:44:11 PM'
Union Select 'A000001', 1, 'Detail', 'corey', '9/11/2003 11:01:35 AM'
Union Select 'A000001', 1, 'Abbrev', 'corey', '9/11/2003 11:01:35 AM'
Union Select 'A000001', 1, 'Detail', 'corey', '3/21/2003 2:29:59 PM'
Union Select 'A000001', 1, 'Abbrev', 'corey', '3/21/2003 2:29:59 PM'
Union Select 'A000001', 1, 'Detail', 'corey', '3/27/2003 4:32:53 PM'
Union Select 'A000001', 1, 'Abbrev', 'corey', '3/27/2003 4:32:54 PM'

Select
id,
Seq,
NewSeq = (Select count(*) From @myTable Where id = A.id and Priority = A.Priority and TimeStamp <= A.TimeStamp),
Priority,
UserId,
TimeStamp
From @myTable A
Order By id, NewSeq Desc, Priority


Corey
Go to Top of Page

corey
Starting Member

13 Posts

Posted - 2005-03-08 : 16:20:54
I am sorry to bring this up again out of no where again but it seems as though the sql is needed and I am back at it looking for a workable solution in sql.

I am grabing information from one table that is designed to fix duplicate id's. The fix fails if each id has an image in the database. My goal is to search all the images for both the duplicate and the correct id in the image table and re-sequence them all to the correct id the user submitted via the front end. Here are few catchs...
1. Each image has a detail and a thumb image in the image table. They both have the same image sequence (not necc the same stamp)
2. I would like to search for the most recent image taken and assign both the thumb and detail image to the correct id and give them the highest sequence. They must have the same sequence number.

SUBMITTED DUPLICATE TABLE
correct_id duplicate_id
001 002


IMAGE TABLE
image_id image_seq image_info image_data stamp
002 1 Detail <binary> 1/1/01 13:32:02
002 1 Thumb <binary> 1/1/01 13:32:04
001 1 Detail <binary> 1/1/01 19:45:02
001 1 Thumb <binary> 1/1/01 19:45:03
001 2 Detail <binary> 2/23/04 09:59:17
001 2 Thumb <binary> 2/23/04 09:59:20
002 2 Detail <binary> 2/23/04 14:20:36
002 2 Thumb <binary> 2/23/04 14:20:36


My results should look like this...

IMAGE TABLE
image_id image_seq image_info image_data stamp
001 1 Detail <binary> 1/1/01 13:32:02
001 1 Thumb <binary> 1/1/01 13:32:04
001 2 Detail <binary> 1/1/01 19:45:02
001 2 Thumb <binary> 1/1/01 19:45:03
001 3 Detail <binary> 2/23/04 09:59:17
001 3 Thumb <binary> 2/23/04 09:59:20
001 4 Detail <binary> 2/23/04 14:20:36
001 4 Thumb <binary> 2/23/04 14:20:36


Notice how the record with the latest stamp has the highest image_seq for both the Detail and Thumb. I could really care less about the other images being in the right order. The application I am working with only really cares about the highest sequence and that is the image it uses.

Any help would REALLY REALLY be appreciated.
Go to Top of Page
   

- Advertisement -