Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 SQL Stored Procedure

Author  Topic 

Dhanalakshmi
Starting Member

12 Posts

Posted - 2011-08-18 : 05:20:53
Hi All,

How to compare records in one data table which contains different primary key value ,but all the other fields are same using sql stored procedure?

My sample table which contains duplicates and unique records:

jobid titleid JobDate
324773365 1507 2011-05-25 00:00:00.000
324773380 1505 2011-07-01 00:00:00.000
324773368 389 2011-07-01 00:00:00.000
324773366 1507 2011-07-28 00:00:00.000
324773362 1505 2011-08-01 00:00:00.000
324773367 389 2011-08-01 00:00:00.000

I have to update the duplicate records by checking TitleID and StateID,lenOfTitle of records based on jobdate.If Jobdate is greater than 60 days then consider the duplicate record as a new one.otherwise mark it is as a duplicate record.

Output like this:

jobid isduplicate duplicateof1
324773362 1 324773380
324773365 0 0
324773366 0 0
324773367 1 324773368
324773368 0 0
324773380 0 0


IsDuplicate is set to 1 if the record is duplicate of other record,otherwise 0.

Can you please help me?

Thanks,
DIP.

yadhu_cse
Constraint Violating Yak Guru

252 Posts

Posted - 2011-08-18 : 05:32:14
use this to find duplicate record
SELECT sid,
COUNT(sid) AS NumOccurrences
FROM users
GROUP BY sid
HAVING ( COUNT(sid) > 1 )
Go to Top of Page

Dhanalakshmi
Starting Member

12 Posts

Posted - 2011-08-18 : 06:15:54
quote:
Originally posted by yadhu_cse

use this to find duplicate record
SELECT sid,
COUNT(sid) AS NumOccurrences
FROM users
GROUP BY sid
HAVING ( COUNT(sid) > 1 )



Hi,
Thank you for your response.
I have tried it already ,it didn't work.

Thanks,
DIP.
Go to Top of Page

yadhu_cse
Constraint Violating Yak Guru

252 Posts

Posted - 2011-08-18 : 06:47:30
try that into cte and use update statement
Go to Top of Page

Ifor
Aged Yak Warrior

700 Posts

Posted - 2011-08-18 : 06:58:49
[code]
-- *** Test Data ***
CREATE TABLE #t
(
jobid int NOT NULL
,titleid int NOT NULL
,JobDate datetime NOT NULL
);
INSERT INTO #t
VALUES
(324773365, 1507, '20110525')
,(324773380, 1505, '20110701')
,(324773368, 389, '20110701')
,(324773366, 1507, '20110728')
,(324773362, 1505, '20110801')
,(324773367, 389, '20110801');
-- *** End Test Data ***

WITH Duplicates
AS
(
SELECT T2.jobid, MAX(T1.jobid) AS duplicateof
FROM #t T1
JOIN #t T2
ON T1.titleid = T2.titleid
-- these columns are not in the test data
--AND T1.stateID = T2.stateID
--AND T1.lenOfTitle = T2.lenOfTitle
AND T2.JobDate > T1.JobDate
AND T2.JobDate <= DATEADD(d, 60, T1.JobDate)
GROUP BY T2.jobid
)
SELECT T.jobid
,CASE WHEN D.jobid IS NULL THEN 0 ELSE 1 END AS isduplicate
,COALESCE(D.duplicateof, 0) AS duplicateof
FROM #t T
LEFT JOIN Duplicates D
ON T.jobid = D.jobid
ORDER BY jobid;
[/code]
Go to Top of Page

Dhanalakshmi
Starting Member

12 Posts

Posted - 2011-08-18 : 09:53:24
quote:
Originally posted by Ifor


-- *** Test Data ***
CREATE TABLE #t
(
jobid int NOT NULL
,titleid int NOT NULL
,JobDate datetime NOT NULL
);
INSERT INTO #t
VALUES
(324773365, 1507, '20110525')
,(324773380, 1505, '20110701')
,(324773368, 389, '20110701')
,(324773366, 1507, '20110728')
,(324773362, 1505, '20110801')
,(324773367, 389, '20110801');
-- *** End Test Data ***

WITH Duplicates
AS
(
SELECT T2.jobid, MAX(T1.jobid) AS duplicateof
FROM #t T1
JOIN #t T2
ON T1.titleid = T2.titleid
-- these columns are not in the test data
--AND T1.stateID = T2.stateID
--AND T1.lenOfTitle = T2.lenOfTitle
AND T2.JobDate > T1.JobDate
AND T2.JobDate <= DATEADD(d, 60, T1.JobDate)
GROUP BY T2.jobid
)
SELECT T.jobid
,CASE WHEN D.jobid IS NULL THEN 0 ELSE 1 END AS isduplicate
,COALESCE(D.duplicateof, 0) AS duplicateof
FROM #t T
LEFT JOIN Duplicates D
ON T.jobid = D.jobid
ORDER BY jobid;





Hi,
Thanks a lot for your response.

Thanks,
DIP.
Go to Top of Page

Dhanalakshmi
Starting Member

12 Posts

Posted - 2011-09-05 : 02:28:37
Hi,

This sql statement will not worked if I gave T1.JobDate >=T2.JobDate.
How can update this statement,if both the job dates same.

Can you please help me..?

Thanks,
DIP.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-09-05 : 02:48:03
what about this?

select t.jobid,
case when t1.maxid is null then 0 else 1 end as isduplicate,
coalesce(t1.maxid,0) as duplicateof1
from table t
left join (select stateid,titleid,lenoftitle,min(jobid) as minid,max(Jobid) as maxid
from table
group by stateid,titleid,lenoftitle) t1
on t1.stateid = t.stateid
and t1.titleid = t.titleid
and t1.lenoftitle = t.lenoftitle



------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-09-05 : 03:34:28
[code]DECLARE @Sample TABLE
(
JobID INT NOT NULL,
TitleID INT NOT NULL,
JobDate DATE NOT NULL
)

INSERT @Sample
(
JobID,
TitleID,
JobDate
)
VALUES (324773365, 1507, '2011-05-25'),
(324773380, 1505, '2011-07-01'),
(324773368, 389, '2011-07-01'),
(324773366, 1507, '2011-07-28'),
(324773362, 1505, '2011-08-01'),
(324773367, 389, '2011-08-01')

-- Solution by SwePeso
SELECT s.JobID,
CASE
WHEN f.Jobs IS NULL THEN 0
ELSE 1
END AS IsDuplicate,
ISNULL(STUFF(f.Jobs, 1, 2, ''), '0') AS DuplicateOf
FROM @Sample AS s
OUTER APPLY (
SELECT ', ' + CAST(x.JobID AS VARCHAR(12))
FROM @Sample AS x
WHERE x.TitleID = s.TitleID
AND x.JobDate BETWEEN DATEADD(DAY, -60, s.JobDate) AND DATEADD(DAY, 60, s.JobDate)
AND x.JobID > s.JobID
FOR XML PATH('')
) AS f(Jobs)
ORDER BY s.JobID[/code]


N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

Dhanalakshmi
Starting Member

12 Posts

Posted - 2011-09-20 : 03:06:02
quote:
Originally posted by Dhanalakshmi

quote:
Originally posted by Ifor


-- *** Test Data ***
CREATE TABLE #t
(
jobid int NOT NULL
,titleid int NOT NULL
,JobDate datetime NOT NULL
);
INSERT INTO #t
VALUES
(324773365, 1507, '20110525')
,(324773380, 1505, '20110701')
,(324773368, 389, '20110701')
,(324773366, 1507, '20110728')
,(324773362, 1505, '20110801')
,(324773367, 389, '20110801');
-- *** End Test Data ***

WITH Duplicates
AS
(
SELECT T2.jobid, MAX(T1.jobid) AS duplicateof
FROM #t T1
JOIN #t T2
ON T1.titleid = T2.titleid
-- these columns are not in the test data
--AND T1.stateID = T2.stateID
--AND T1.lenOfTitle = T2.lenOfTitle
AND T2.JobDate > T1.JobDate
AND T2.JobDate <= DATEADD(d, 60, T1.JobDate)
GROUP BY T2.jobid
)
SELECT T.jobid
,CASE WHEN D.jobid IS NULL THEN 0 ELSE 1 END AS isduplicate
,COALESCE(D.duplicateof, 0) AS duplicateof
FROM #t T
LEFT JOIN Duplicates D
ON T.jobid = D.jobid
ORDER BY jobid;





Hi,
Thanks a lot for your response.

Thanks,
DIP.

Go to Top of Page

Dhanalakshmi
Starting Member

12 Posts

Posted - 2011-09-20 : 03:10:23
quote:
Originally posted by Dhanalakshmi

quote:
Originally posted by Ifor


-- *** Test Data ***
CREATE TABLE #t
(
jobid int NOT NULL
,titleid int NOT NULL
,JobDate datetime NOT NULL
);
INSERT INTO #t
VALUES
(324773365, 1507, '20110525')
,(324773380, 1505, '20110701')
,(324773368, 389, '20110701')
,(324773366, 1507, '20110728')
,(324773362, 1505, '20110801')
,(324773367, 389, '20110801');
-- *** End Test Data ***

WITH Duplicates
AS
(
SELECT T2.jobid, MAX(T1.jobid) AS duplicateof
FROM #t T1
JOIN #t T2
ON T1.titleid = T2.titleid
-- these columns are not in the test data
--AND T1.stateID = T2.stateID
--AND T1.lenOfTitle = T2.lenOfTitle
AND T2.JobDate > T1.JobDate
AND T2.JobDate <= DATEADD(d, 60, T1.JobDate)
GROUP BY T2.jobid
)
SELECT T.jobid
,CASE WHEN D.jobid IS NULL THEN 0 ELSE 1 END AS isduplicate
,COALESCE(D.duplicateof, 0) AS duplicateof
FROM #t T
LEFT JOIN Duplicates D
ON T.jobid = D.jobid
ORDER BY jobid;





Hi,
Thanks a lot for your response.

Thanks,
DIP.



Hi,

This sql statement is found the duplicate record if the different jobid's have different JobDate.
How to find duplicate record if jobid is different but the jobdate are same?
In which place the changes should be made in the previous sql statement?


Thanks,
DIP.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-09-20 : 03:14:38
you need to group by jobdate and take min() or max() of jobid in that case and join to it

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page
   

- Advertisement -