Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 CTE Duplicate Issue

Author  Topic 

JAdauto
Posting Yak Master

160 Posts

Posted - 2014-10-22 : 16:31:04
I need to seek out duplicate records in many tables, log the counts, tablename, etc and then delete the duplicates.

This code snipit repeats for about 45 tables. I tried to set my variables above, but apparently my variables (@CompanyID, etc) or even my Temp Table (#VariableData) is not available within each of my CTE groups.

Also, I was reading that ONE SELECT, INSERT, DELETE, UPDATE statement could follow the CTE. Only in this example, I need to perform both the INSERT for the log and the actual DELETE to delete the dups.


Are there any work arounds for the two things I am trying to accomplish?


GO
with x as (SELECT *, rn = row_number()
OVER(PARTITION BY ScheduleDate, FKStoreID, FKEmployeeNumber, InMinute, OutMinute, FKJobCodeID, [Hours]
ORDER BY DateTimeStamp DESC)
FROM dpvHstSchedule
WHERE ScheduleDate >= DATEADD(dd, -2, CONVERT(VARCHAR(10), GETDATE(), 101)) )

INSERT INTO DuplicateFinderLog (FKStoreID, TableName, DateofBusiness, DateTimeStamp, DuplicateCount, CompanyID, SessionID, FindDateTimeStamp)
SELECT FKStoreID, 'dpvHstSchedule',DateofBusiness, DateTimeStamp, COUNT(*), CompanyID, SessionID, FindDate
FROM x, #VariableData
WHERE rn > 1
GROUP BY FKStoreid, DateofBusiness, DateTimeStamp, CompanyID, SessionID, FindDate

DELETE FROM x WHERE rn > 1

gbritton
Master Smack Fu Yak Hacker

2780 Posts

Posted - 2014-10-22 : 16:35:49
http://weblogs.sqlteam.com/joew/archive/2007/10/23/60383.aspx
Go to Top of Page

JAdauto
Posting Yak Master

160 Posts

Posted - 2014-10-22 : 16:40:04
I have a very specific list of tables that I am needing to go through, so I cannot use msForEachTable. I am inquiring more specifically how I can LOG what I need todo and if there is a way to access other variables outside of the WITH.

If I just had to do with WITH X as ( ) and then DELETE from X, then it wouldn't be a big deal.. but I also need to log the counts in X. Which I am reading that only ONE statement is allowed following the CTE.
Go to Top of Page

ScottPletcher
Aged Yak Warrior

550 Posts

Posted - 2014-10-22 : 17:03:17
You can use variables in the CTE; the GO is what's causing the variable references to fail. Instead of the GO, just add a semi-colon before the WITH:
;WITH cte_name AS ...
Please let me know the variables you need and/or include for the #variables table.

You can still use sp_MSforeachtable: you just need to code it to exit if the table name is not one you want to process. I'll code my example that way to demonstrate.

You can use the OUTPUT clause to capture the values from the INSERT statement into a table, then use that table to do the DELETEs.
Go to Top of Page

JAdauto
Posting Yak Master

160 Posts

Posted - 2014-10-23 : 09:26:17
Yes the variables are in my #Variables table. Thanks for the tip with the ;!!
Go to Top of Page

JAdauto
Posting Yak Master

160 Posts

Posted - 2014-11-05 : 22:40:37
Im back. :)

This duplicate resolution is so close, but now I am faced with performance issues. I really need this to run in 2 min or less, but Im at about 5.5min for 3 days of data. I run this similar CTE for 45 or so of our tables. Does anyone see anything that might help with performance?

with x as (SELECT *, rn = row_number()
OVER(PARTITION BY DateOfBusiness, FKStoreID, EntryID, FKEmployeeNumber, FKManagerNumber, CheckNumber, TableName, FKItemID, Price, Hour, Minute, FKReasonID
ORDER BY DateTimeStamp DESC)
FROM HstVoid
WHERE DateOfBusiness >= DATEADD(dd, -3, CONVERT(VARCHAR(10), GETDATE(), 101)) )
--SELECT * FROM x WHERE rn > 1;

DELETE FROM x
OUTPUT DELETED.FKSToreID, DELETED.DateOfBusiness, DELETED.DateTimeStamp INTO #DeletedData
WHERE rn > 1

INSERT INTO DuplicateFinderLog (FKStoreID, TableName, DateofBusiness, DateTimeStamp, DuplicateCount, CompanyID, SessionID, FindDateTimeStamp, AlertSent)
SELECT FKStoreID, 'HstVoid',DateofBusiness, DateTimeStamp, COUNT(*), @CompanyID, @SessionID, @FindTime, 0
FROM #DeletedData
GROUP BY FKStoreID, DateofBusiness, DateTimeStamp

TRUNCATE TABLE #DeletedData;
Go to Top of Page
   

- Advertisement -