Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 left join problem

Author  Topic 

sandhyakalsotra
Starting Member

22 Posts

Posted - 2010-09-11 : 01:20:18
I recently read one of the articles in your forum wherein you had explained how to use additional conditions in join statement only rather than in where clause. I used same but still I am encountering few problems. In my database I have 2 tables of "billprocess" and "revcoll" and to retrieve information of all the consumers billed for a particular month and the revenue received from them, i have to join both tables. I used the following query:
Select sum (unitsconsumed),count (consumercode), sum (amount)
From billprocess left join revcoll on billprocess.consumercode=revcoll.consumercode and revcoll.ryear=2010 and revcoll.rmonth=07 WHERE billprocess.blyear=2010 and billprocess.blmonth=07

The result is about 30 records more than the actual (which I queried from billprocess leaving aside revcoll)
any suggestions how to improve it?

sandhya

ConradK
Posting Yak Master

140 Posts

Posted - 2010-09-12 : 14:42:28
see also; Cartesian products
http://en.wikipedia.org/wiki/Cartesian_product

Joining creates a Cartesian product.

you're probably joining on the wrong things, or have unintended duplicates in one(or both) table(s).
Go to Top of Page

rohitvishwakarma
Posting Yak Master

232 Posts

Posted - 2010-09-13 : 04:49:12
Hi Sandhya

Can you provide us with the DDL?
Go to Top of Page

DuncanP
Starting Member

12 Posts

Posted - 2010-09-13 : 05:55:18
What you say sounds perfectly reasonable, depending on the data in your tables. A left join means that you will get all rows from the left hand table, but it does not mean that those rows will only appear once. If they match with more than one row in the right hand table they will appear more than once, just as in a normal inner join.

Say for example you have a Customers table and an Orders table - an inner join will show one record for each combination of customer and order - so if a customer has placed 2 orders they will appear twice - once for each order. A left join will show the same data as the inner join, but will also show customers that have not yet placed any orders.

In your case, it must be that some consumercodes appear more than once in the revcoll table, thus giving you more than one appearance of the corresponding row in the billprocess table. That should be easy enough to verify.

Duncan
Go to Top of Page

sandhyakalsotra
Starting Member

22 Posts

Posted - 2010-09-14 : 12:50:43
@ DuncanP...Yes, that is the case probably. As some consumers may have deposited 2 bills in same month either by mistake or to clear pending dues. And as a result there are few extra records.

Is there a way out to detect those ? I need the revenue total even if someone has deposited it twice. But problem shall occur with extra number of consumers and unitsconsumed due to doubling of those records.
I used DISTINCT along with consumercode. It at least removed the redundancy in total number of consumers, but the unitsconsumed stayed as earlier.

please let me know how to set it right?

sandhya
Go to Top of Page

rohitvishwakarma
Posting Yak Master

232 Posts

Posted - 2010-09-15 : 03:36:57
quote:
Originally posted by sandhyakalsotra

@ DuncanP...Yes, that is the case probably. As some consumers may have deposited 2 bills in same month either by mistake or to clear pending dues. And as a result there are few extra records.

Is there a way out to detect those ?

sandhya



You van use the following query to get the duplicates in a particular month


SELECT consumercode,COUNT(consumercode) AS [Repeations]
FROM revcoll
GROUP BY consumercode, revcoll.ryear, revcoll.rmonth
HAVING COUNT(consumercode) > 1 AND revcoll.ryear=2010
AND revcoll.rmonth=07
Go to Top of Page

rohitvishwakarma
Posting Yak Master

232 Posts

Posted - 2010-09-15 : 04:25:38
quote:
Originally posted by sandhyakalsotra
I need the revenue total even if someone has deposited it twice. But problem shall occur with extra number of consumers and unitsconsumed due to doubling of those records.
I used DISTINCT along with consumercode. It at least removed the redundancy in total number of consumers, but the unitsconsumed stayed as earlier.

please let me know how to set it right?

sandhya



Hi sandhya,

I have created the tables with some sample data

TABLE billprocess
consumercode blyear blmonth unitsconsumed
1 2010 1 20
2 2010 2 25
1 2010 2 30
2 2010 3 30
3 2010 3 30
3 2010 1 30
3 2010 2 40
3 2010 4 50
2 2010 4 50

TABLE revcoll

consumercode ryear rmonth amount
1 2010 2 200
1 2010 2 300
2 2010 2 250
2 2010 3 300
3 2010 1 300
3 2010 2 400
3 2010 2 300

I have put some duplicate entries in revcoll table for the rmonth=2(consumercode 1 & 3) to replicate your senario.You can verify
it by running the following query

SELECT consumercode,COUNT(consumercode) AS [Repeations]
FROM revcoll
GROUP BY consumercode, revcoll.ryear, revcoll.rmonth
HAVING COUNT(consumercode) > 1 AND revcoll.ryear=2010
AND revcoll.rmonth=2

which will give the following output

consumercode Repeations
1 2
3 2


On running the following query on table billprocess

SELECT COUNT(consumercode) [TotCust],SUM(unitsconsumed) AS [TotUnits] FROM billprocess WHERE blmonth=2

the result is:

TotCust TotUnits
3 95

which is correct

and on table revcoll

SELECT SUM(amount) AS [Revenue] FROM revcoll WHERE rmonth=2

the result is:

Revenue
1450

which is also correct.

but on running your query

SELECT COUNT(distinct billprocess.consumercode) AS TOtalCust,SUM(unitsconsumed) AS [TotalUnits], SUM(amount) AS [Amount]
FROM billprocess
LEFT JOIN revcoll
ON billprocess.consumercode=revcoll.consumercode
AND revcoll.ryear=2010 AND revcoll.rmonth=2 WHERE billprocess.blyear=2010 AND billprocess.blmonth=2

the o/p is:

TOtalCust TotalUnits Amount
3 165 1450
the TotalUnits is 165 which is wrong and should have been 95

I assume this is your case and will post the solution soon. However, I need to have the list of columns in the tables billprocess & revcoll.



Thanks
Rohit
Go to Top of Page

rohitvishwakarma
Posting Yak Master

232 Posts

Posted - 2010-09-15 : 04:35:01
hi Sandhya,

Try the following query


SELECT COUNT(distinct billprocess.consumercode) AS [TotalCustomers],SUM(distinct unitsconsumed) AS [TotalUnits], SUM(amount) AS [TotalAmount]
FROM billprocess
LEFT JOIN revcoll
ON billprocess.consumercode=revcoll.consumercode
AND revcoll.ryear=2010 AND revcoll.rmonth=2 WHERE billprocess.blyear=2010 AND billprocess.blmonth=2

o/p:

TotalCustomers TotalUnits TotalAmount
3 95 1450

gives me the correct result on my sample data.

Thanks
Rohit

Go to Top of Page

rohitvishwakarma
Posting Yak Master

232 Posts

Posted - 2010-09-15 : 04:56:33
quote:
Originally posted by rohitvishwakarma

hi Sandhya,

Try the following query


SELECT COUNT(distinct billprocess.consumercode) AS [TotalCustomers],SUM(distinct unitsconsumed) AS [TotalUnits], SUM(amount) AS [TotalAmount]
FROM billprocess
LEFT JOIN revcoll
ON billprocess.consumercode=revcoll.consumercode
AND revcoll.ryear=2010 AND revcoll.rmonth=2 WHERE billprocess.blyear=2010 AND billprocess.blmonth=2

o/p:

TotalCustomers TotalUnits TotalAmount
3 95 1450

gives me the correct result on my sample data.

Thanks
Rohit





SELECT COUNT(distinct billprocess.consumercode) AS [TotalCustomers],SUM(distinct unitsconsumed) AS [TotalUnits], SUM(amount) AS [TotalAmount]
FROM billprocess
LEFT JOIN revcoll
ON billprocess.consumercode=revcoll.consumercode
AND revcoll.ryear=2010 AND revcoll.rmonth=2 WHERE billprocess.blyear=2010 AND billprocess.blmonth=2


I am sorry the above query is logically wrong even it is working in my case. Can't have Distinct with unitsconsumed(different consumers can have same amount of units). Let me find another solution to it.
Go to Top of Page

rohitvishwakarma
Posting Yak Master

232 Posts

Posted - 2010-09-15 : 05:20:23

This is the solution, without using any joins:

SELECT COUNT(consumercode) [TotalConsumers], ISNULL(SUM(unitsconsumed),0) AS [TotalUnits] ,
( SELECT ISNULL(SUM(amount),0) FROM revcoll WHERE rmonth = 7 AND ryear = 2010 ) AS [Revenue]
FROM billprocess WHERE blmonth = 7 AND blyear = 2010
Go to Top of Page

rohitvishwakarma
Posting Yak Master

232 Posts

Posted - 2010-09-15 : 08:05:49
Can I have the list of columns?
Go to Top of Page

rohitvishwakarma
Posting Yak Master

232 Posts

Posted - 2010-09-15 : 08:56:02
If you are having a column similar to currentMonthBill in revcoll table,following query will give you the result. To be specific you should have a column in revcoll table which distinguishes between the payments made by a consumer in same month.

SELECT COUNT(distinct billprocess.consumercode) AS [TotalCustomers],SUM(
CASE WHEN revcoll.currentMonthBill=1 OR revcoll.currentMonthBill is NULL
THEN unitsconsumed
ELSE 0
END
) AS [TotalUnits]
, SUM(amount) AS [TotalAmount]
FROM billprocess
LEFT JOIN revcoll
ON billprocess.consumercode=revcoll.consumercode
AND revcoll.ryear=2010 AND revcoll.rmonth=3 WHERE billprocess.blyear=2010 AND billprocess.blmonth=3
Go to Top of Page

sandhyakalsotra
Starting Member

22 Posts

Posted - 2010-09-15 : 08:57:11
@ rohitvishwakarma...I too tried the query using DISTINCT unitsconsumed but as you said, it's logically wrong. Many consumers are with same units consumed and it turns out false.
Here is the list of columns
billprocess: regionid,divcode,subdivcode,consumercode,cname,billaddress,blyear,blmonth,sancload,phase,ratecode, constypecode,prevunits,currentunits,unitsconsumed,energycharges,electduty,demandcharges,fixedcharges, totalcharges
Revcoll:

regionid,divcode,subdivcode,consumercode,ryear,rmonth,rcptdate,rcptno,amount,chequeno,entereddate, enteredby



sandhya
Go to Top of Page

sandhyakalsotra
Starting Member

22 Posts

Posted - 2010-09-15 : 09:02:37
No, there is no such column in revcoll that can tell the bill amount for a particular month. The revcoll only holds the record of receipts deposited by consumer datewise.

sandhya
Go to Top of Page

rohitvishwakarma
Posting Yak Master

232 Posts

Posted - 2010-09-15 : 09:08:22
quote:
Originally posted by sandhyakalsotra

@ rohitvishwakarma...I too tried the query using DISTINCT unitsconsumed but as you said, it's logically wrong. Many consumers are with same units consumed and it turns out false.
Here is the list of columns
billprocess: regionid,divcode,subdivcode,consumercode,cname,billaddress,blyear,blmonth,sancload,phase,ratecode, constypecode,prevunits,currentunits,unitsconsumed,energycharges,electduty,demandcharges,fixedcharges, totalcharges
Revcoll:

regionid,divcode,subdivcode,consumercode,ryear,rmonth,rcptdate,rcptno,amount,chequeno,entereddate, enteredby



sandhya



Posted - 09/15/2010 : 05:20:23
--------------------------------------------------------------------------------


Without using any joins:

SELECT COUNT(consumercode) [TotalConsumers], ISNULL(SUM(unitsconsumed),0) AS [TotalUnits] ,
( SELECT ISNULL(SUM(amount),0) FROM revcoll WHERE rmonth = 7 AND ryear = 2010 ) AS [Revenue]
FROM billprocess WHERE blmonth = 7 AND blyear = 2010

Is this query not working? See with Left Join, you need to have a column in revcoll table to distinguish between the unit consumed by a consumer as it will be repeated for the number of times payment made by a consumer in a month. After going through the list of columns which you have provided, I can't find any.
The query above don't have such dependency and if you are not rigid about Left Join then I think better to use the query. And were you able to find the duplicates in revcoll with query I posted?
Go to Top of Page

sandhyakalsotra
Starting Member

22 Posts

Posted - 2010-09-15 : 11:08:41
Thanx this query is working.
Thanx a lot.

sandhya
Go to Top of Page

rohitvishwakarma
Posting Yak Master

232 Posts

Posted - 2010-09-15 : 11:21:25
quote:
Originally posted by sandhyakalsotra

Thanx this query is working.
Thanx a lot.

sandhya



Always Welcome

Rohit
Go to Top of Page
   

- Advertisement -