Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 Need assistance in finding duplicate records

Author  Topic 

btamulis
Yak Posting Veteran

64 Posts

Posted - 2008-11-12 : 16:37:41
I have a table:

CREATE TABLE [dbo].[GL30000](
[HSTYEAR] [smallint] NOT NULL,
[JRNENTRY] [int] NOT NULL,
[RCTRXSEQ] [numeric](19, 5) NOT NULL,
[SOURCDOC] [char](11) NOT NULL,
[REFRENCE] [char](31) NOT NULL,
[DSCRIPTN] [char](31) NOT NULL,
[TRXDATE] [datetime] NOT NULL,
[ACTINDX] [int] NOT NULL,
[TRXSORCE] [char](13) NOT NULL,
[NOTEINDX] [numeric](19, 5) NOT NULL,
[CRDTAMNT] [numeric](19, 5) NOT NULL,
[DEBITAMT] [numeric](19, 5) NOT NULL,
[DEX_ROW_ID] [int] IDENTITY(1,1) NOT NULL,
CONSTRAINT [PKGL30000] PRIMARY KEY NONCLUSTERED
(
[DEX_ROW_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

With 10 records..........

insert into GL30000 Values
(2008,001,1,'BBF', 'BEGINNING BALANCE', 'TEST', '2008-12-31',35,'TEST001',1370,199.99,0)
insert into GL30000 Values
(2008,001,1,'BBF', 'BEGINNING BALANCE', 'TEST', '2008-12-31',36,'TEST001',1370,0,199.99)
insert into GL30000 Values
(2008,001,1,'BBF', 'BEGINNING BALANCE', 'TEST', '2008-12-31',35,'TEST001',1370,199.99,0)
insert into GL30000 Values
(2008,001,1,'BBF', 'BEGINNING BALANCE', 'TEST', '2008-12-31',36,'TEST001',1370,0,199.99)
insert into GL30000 Values
(2008,002,1,'BBF', 'BEGINNING BALANCE', 'TEST', '2008-12-31',41,'TEST001',1500,5.18,0)
insert into GL30000 Values
(2008,002,1,'BBF', 'BEGINNING BALANCE', 'TEST', '2008-12-31',36,'TEST001',1500,0,5.18)
insert into GL30000 Values
(2008,003,1,'BBF', 'BEGINNING BALANCE', 'TEST', '2008-12-31',114,'TEST001',875,300,0)
insert into GL30000 Values
(2008,003,1,'BBF', 'BEGINNING BALANCE', 'TEST', '2008-12-31',36,'TEST001',875,0,300)
insert into GL30000 Values
(2008,003,1,'BBF', 'BEGINNING BALANCE', 'TEST', '2008-12-31',98,'TEST001',875,60,0)
insert into GL30000 Values
(2008,003,1,'BBF', 'BEGINNING BALANCE', 'TEST', '2008-12-31',36,'TEST001',875,0,60)

I need to find duplicate records as defined -
same actindx, same HSTYEAR, same JRNENTRY, same debitamt, same crdtamnt

I want my query to return these values plus the dex_row_id (which is unique even for duplicate records.

This query returns everything i need except dex_row_id.

SELECT

COUNT(*),

hstyear, jrnentry, actindx, debitamt, crdtamnt, jrnentry

FROM gl30000

GROUP BY

hstyear, jrnentry, actindx, debitamt, crdtamnt, jrnentry

HAVING COUNT(*) > 1 --This filters on those ids that occurs more than once

ORDER BY COUNT(*) DESC --Order by those ids that have the most occurrences

My question is - can I change my query to return 2 records - each with the unique dex_row_id?

Any thoughts?



SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-11-12 : 16:39:10
Use the ROW_NUMBER() function.



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

btamulis
Yak Posting Veteran

64 Posts

Posted - 2008-11-12 : 17:06:11
Okay......I've never used the ROW_NUMBER () function before.

I looked at BOL and tried putting it into my query as follows:

SELECT

COUNT(*),

hstyear, jrnentry, actindx, debitamt, crdtamnt, jrnentry,
ROW_NUMBER() OVER(ORDER BY dex_row_id DESC) AS 'Row Number'

FROM gl30000

GROUP BY

hstyear, jrnentry, actindx, debitamt, crdtamnt, jrnentry, dex_row_id

HAVING COUNT(*) > 1

ORDER BY COUNT(*) DESC

Ths query returns no records.


In books on line - I intepret their syntax to act like a 'RANK' (another SQL syntax that I've never used) from BOL:

USE AdventureWorks
GO
SELECT c.FirstName, c.LastName, ROW_NUMBER() OVER(ORDER BY SalesYTD DESC) AS 'Row Number', s.SalesYTD, a.PostalCode
FROM Sales.SalesPerson s JOIN Person.Contact c on s.SalesPersonID = c.ContactID
JOIN Person.Address a ON a.AddressID = c.ContactID
WHERE TerritoryID IS NOT NULL AND SalesYTD <> 0

This sums the sales and returns them with numbered row??

What am i missing? Besides a brain?
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-11-12 : 23:30:26
for seleting unique ones use

select * from
(
select row_number() over (partition by actindx, HSTYEAR, JRNENTRY, debitamt, crdtamnt order by [DEX_ROW_ID]) AS seq,
* from GL30000
)t
where t.seq=1


and for deletion

delete t from
(
select row_number() over (partition by actindx, HSTYEAR, JRNENTRY, debitamt, crdtamnt order by [DEX_ROW_ID]) AS seq,
* from GL30000
)t
where t.seq>1
Go to Top of Page

btamulis
Yak Posting Veteran

64 Posts

Posted - 2008-11-13 : 06:42:36
visakh16 - thank you.

Your query returned 8 of the 10 rows (it left out the 2 duplicates) since I wanted just the 2 records - I changed the where t.seq=1 to be where t.seq>1 and I got my desired results (the 2 records that were duplicates by my definition.

I'm still not certain how sql is executing the query - but it works and I thank you tremendously.

Thanks to all that read and replied.....this forum is truly fantastic.

Go to Top of Page
   

- Advertisement -