Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Finding duplicates

Author  Topic 

Apples
Posting Yak Master

146 Posts

Posted - 2009-01-19 : 10:15:52
Here is my schema and some data:

------------------------------------------------------
tblOrders
------------------------------------------------------
OrderID | Email | OrderDate
------------------------------------------------------
1 | test1@site.com | 1/30/2008 9:43:34 AM
2 | test2@site.com | 2/12/2008 10:07:00 AM
3 | test2@site.com | 3/19/2008 3:38:00 PM
4 | test3@site.com | 4/2/2008 3:52:49 PM
5 | test3@site.com | 5/28/2008 5:10:14 PM
6 | test3@site.com | 6/5/2008 9:28:35 AM
------------------------------------------------------

I am trying to find duplicate orders; I'm finding the duplicates by checking the Email field of the table.

The only thing is, I DO NOT want to return the first occurrence from an Email address, only the orders after. So if I run my query, it should return:

------------------------------------------------------
OrderID | Email | OrderDate
------------------------------------------------------
3 | test2@site.com | 3/19/2008 3:38:00 PM
5 | test3@site.com | 5/28/2008 5:10:14 PM
6 | test3@site.com | 6/5/2008 9:28:35 AM
------------------------------------------------------

My query right now will return duplicates, but it also returns the first occurrence, so I need some help with it. Here is my stored procedure:

@StartDate datetime=null,
@EndDate datetime=null
AS
BEGIN

SELECT *
FROM tblOrders
WHERE
(@StartDate IS NULL OR OrderDate >= @StartDate) AND
(@EndDate IS NULL OR OrderDate <= @EndDate) AND
Email IN
(
SELECT Email
FROM tblOrders
GROUP BY Email
HAVING COUNT(*) > 1
)
ORDER BY OrderDate ASC

As you can see, a date range can be specified; this is working for me as well, I just need help with eliminating the first occurrence.

raky
Aged Yak Warrior

767 Posts

Posted - 2009-01-19 : 10:28:07
try this

@StartDate datetime=null,
@EndDate datetime=null
AS
BEGIN

SELECT *
FROM ( select orderid,email,orderdate,row_number() over( partition by Email order by orderdate) as sno
from tblOrders
WHERE
(@StartDate IS NULL OR OrderDate >= @StartDate) AND
(@EndDate IS NULL OR OrderDate <= @EndDate) ) c
where c.sno <> 1
Go to Top of Page

Apples
Posting Yak Master

146 Posts

Posted - 2009-01-19 : 10:39:26
quote:
Originally posted by raky

try this

@StartDate datetime=null,
@EndDate datetime=null
AS
BEGIN

SELECT *
FROM ( select orderid,email,orderdate,row_number() over( partition by Email order by orderdate) as sno
from tblOrders
WHERE
(@StartDate IS NULL OR OrderDate >= @StartDate) AND
(@EndDate IS NULL OR OrderDate <= @EndDate) ) c
where c.sno <> 1



Thanks for the reply.

Unfortunately, I can't use row_number(), I am using SQL Server 2000.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2009-01-19 : 10:46:48
here's a sql 2000 compatible solution
...
@StartDate datetime=null,
@EndDate datetime=null
AS
BEGIN

select t.orderid,t.email,t.orderdate
from tblOrders t
LEFT JOIN (SELECT email,MIN(orderdate) AS first
from tblOrders
GROUP BY email) tmp
ON tmp.first=t.orderdate
AND tmp.email=t.email
WHERE
(@StartDate IS NULL OR t.OrderDate >= @StartDate) AND
(@EndDate IS NULL OR t.OrderDate <= @EndDate)
AND tmp.email IS NULL

Go to Top of Page
   

- Advertisement -