Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 Removing Duplicate Records from Select Statement

Author  Topic 

bwol
Starting Member

19 Posts

Posted - 2009-07-06 : 20:01:14
Good Evening,

The stored procedure I have below combines data from three tables, using two inner join statements.

It works fine, except I want to make one change I really can't figure out.

If there are two rows in the reports table that have the same employerID, reportmonth and reportyear, I only want the row with the highest ReportID value to be included in the table created by this stored procedure.

I've played around with this for about an hour and can't figure it out. The best idea I had was to add the following code directly before the second left join (which didn't work):


AND r.ReportID = (SELECT MAX (ReportID) From reports Where reports.EmployerID=e.EmployerID AND reports.ReportMonth=@ReportMonth AND reports.ReportYear = @ReportYear AND reports.reporttype='Monthly Dues Report')


Here is the procedure. Please help. Thanks!




@ReportMonth varchar(20),
@ReportYear char(4)

AS

SELECT e.EmployerName, CONVERT(varchar(10),MAX(r.DateReceived),101) AS 'Date Report Received', r.DuesAmount as 'Reported Dues Amount',

CONVERT(varchar(10),MAX(d.DateReceived),101) AS 'Date Check Received', d.DuesAmount as 'Check Amount', d.Notes as Notes

FROM employers e

LEFT JOIN

(Select * FROM reports WHERE reports.reporttype='Monthly Dues Report') As r

ON r.EmployerID=e.EmployerID AND r.ReportMonth=@ReportMonth AND r.ReportYear = @ReportYear

LEFT JOIN

(Select * From DuesChecks) As d

ON d.EmployerID=e.EmployerID AND r.ReportMonth = @ReportMonth AND r.ReportYear = @ReportYear

GROUP BY e.EmployerName, r.DuesAmount, d.DuesAmount, d.Notes

bklr
Master Smack Fu Yak Hacker

1693 Posts

Posted - 2009-07-07 : 04:44:12
[code]
once try this
SELECT e.EmployerName,
CONVERT(varchar(10),MAX(r.DateReceived),101) AS 'Date Report Received', r.DuesAmount as 'Reported Dues Amount',
CONVERT(varchar(10),MAX(d.DateReceived),101) AS 'Date Check Received', d.DuesAmount as 'Check Amount',
d.Notes as Notes
FROM employers e
LEFT JOIN
reports As r ON r.reporttype='Monthly Dues Report' AND
r.EmployerID=e.EmployerID AND r.ReportMonth=@ReportMonth AND r.ReportYear = @ReportYear
INNER JOIN
(Select MAX(reportid),employerid,reportmonth,reportyear
FROM reports WHERE reporttype='Monthly Dues Report' GROUP BY employerid,reportmonth,reportyear) As r1
ON r1.EmployerID=r.EmployerID AND r1.ReportMonth=@ReportMonth AND r1.ReportYear = @ReportYear
LEFT JOIN
DuesChecks As d ON d.EmployerID=e.EmployerID AND r.ReportMonth = @ReportMonth AND r.ReportYear = @ReportYear
GROUP BY e.EmployerName, r.DuesAmount, d.DuesAmount, d.Notes
[/code]
Go to Top of Page

bwol
Starting Member

19 Posts

Posted - 2009-07-07 : 13:24:18
Thanks for your suggest bklr.

Unfortunately it did not solve my problem.

First off, it generated the following error: "No column name was specified for column 1 of 'r1'". However, I was able to fix that error.

The main problem is that with the change you proposed, employers that did not have a row in the reports area for the time period list were not in the table generated by this procedure, and duplicate entries for certain employers were still included.

I have included some information below that hopefully will help clarfiy what I am trying to accomplish.

Three Tables Accessed by Stored Procedure

[dbo].[DuesChecks]
[DuesCheckID] [int],
[EmployerID] [int],
[DuesMonth] [varchar],
[DuesYear] [char],
[Notes] [varchar],
[DateReceived],
[Status] [varchar],
[ReportID] [int],
[DuesAmount] [decimal]

[dbo].[Reports]
[ReportID] [int],
[EmployerID] [int],
[ReportType] [varchar],
[DateReceived] [datetime],
[ReportMonth] [varchar],
[ReportMedia] [varchar],
[ExcelFileName] [varchar],
[SenderID] [int],
[ExcelSheetName] [varchar],
[DuesAmount] [decimal],
[ReportYear] [char]


[dbo].[Employers]
[EmployerID] [int],
[EmployerName] [varchar],
[Address1] [varchar],
[Address2] [varchar],
[City] [varchar],
[State] [varchar],
[Zip] [varchar],
[MRQR_ContactID],
[DuesReportContactID],
[Notes]

Purpose of Stored Procedure

With this stored procedure, I am trying to create a five column table with the following columns:

Column 1: Each [EmployerName] listed in the employers table.

Columns 2&3: the [DateReceived]& [DuesAmounts] values from the row in the reports table where ReportMonth & ReportYear match the corresponding input variables, report type equals 'Monthly Dues Report' and employerID matches the employerID from the first column. If there are more than one entries in the reports table that match these criteria, use the ReportMonth & ReportYear from the row with the highest ReportID.

Columns 4,5&6: The [DateReceived], [DuesAmount] and [Notes] values from the row in the dueschecks table where ReportMonth & ReportYear match the corresponding input variables and employerID matches the employerID from the first column.

Outstanding Problems

The problem that I am having with this stored procedure, is that I only want one row in the outputted table for each employer in the employers table. Currently, there will be multiple rows for an employer if there are multiple rows in the reports table that match the criteia specified under "Columns 2&3" in the previous section of this post.

When I tried implementing bklr's suggestion, I get still get duplicate entries for some employers, and no entries for employers that don't have any rows in the reports table that match the required criteria (I want each employer to show up once, even if there is no matching row in the reports table).

Any suggestions anyone can provider are much appreciated.

Thanks!
Go to Top of Page

vijayisonly
Master Smack Fu Yak Hacker

1836 Posts

Posted - 2009-07-07 : 13:43:27
quote:
and no entries for employers that don't have any rows in the reports table that match the required criteria


Changing bklr's query a li'l bit...This should fix the above problem
SELECT   e.employername, 
Convert(VARCHAR(10),Max(r.datereceived),101) AS 'Date Report Received',
r.duesamount AS 'Reported Dues Amount',
Convert(VARCHAR(10),Max(d.datereceived),101) AS 'Date Check Received',
d.duesamount AS 'Check Amount',
d.notes AS notes
FROM employers e
LEFT JOIN reports AS r
ON r.reporttype = 'Monthly Dues Report'
AND r.employerid = e.employerid
AND r.reportmonth = @ReportMonth
AND r.reportyear = @ReportYear
LEFT JOIN (SELECT Max(reportid) as reportid,
employerid,
reportmonth,
reportyear
FROM reports
WHERE reporttype = 'Monthly Dues Report'
GROUP BY employerid,
reportmonth,
reportyear) AS r1
ON r1.employerid = r.employerid
AND r1.reportmonth = @ReportMonth
AND r1.reportyear = @ReportYear
LEFT JOIN dueschecks AS d
ON d.employerid = e.employerid
AND r.reportmonth = @ReportMonth
AND r.reportyear = @ReportYear
GROUP BY e.employername,
r.duesamount,
d.duesamount,
d.notes


quote:
When I tried implementing bklr's suggestion, I get still get duplicate entries for some employers


When you say duplicates, are all the values the same (the dates and amounts?), if not, which row do you want to choose.
Go to Top of Page

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2009-07-07 : 13:44:58
This might work for you, if not post some sample DDL, sample data and expected output in a sql ready format (per this link):
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx
SELECT 
e.EmployerName,
CONVERT(varchar(10),MAX(r.DateReceived),101) AS 'Date Report Received',
r.DuesAmount as 'Reported Dues Amount',
CONVERT(varchar(10),MAX(d.DateReceived),101) AS 'Date Check Received',
d.DuesAmount as 'Check Amount',
d.Notes as Notes
FROM
employers e
LEFT JOIN
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY EmployerID ORDER BY ReportID DESC) AS RowNum
FROM
reports
WHERE
reporttype='Monthly Dues Report'
AND ReportMonth=@ReportMonth
AND ReportYear = @ReportYear
) AS r
ON r.EmployerID=e.EmployerID
AND r.RowNum = 1
LEFT JOIN
DuesChecks As d
ON d.EmployerID = e.EmployerID
AND d.ReportMonth = @ReportMonth
AND d.ReportYear = @ReportYear
GROUP BY
e.EmployerName,
r.DuesAmount,
d.DuesAmount,
d.Notes
Go to Top of Page

bwol
Starting Member

19 Posts

Posted - 2009-07-08 : 18:56:00
Lamprey, the code you provided did the trick. Thanks so much for your help!
Go to Top of Page
   

- Advertisement -