Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 How to handle a variable number of self-joins?

Author  Topic 

sqlpupil
Starting Member

5 Posts

Posted - 2007-11-09 : 10:30:31
Background: Customer sale records are kept in table (userHistory) that records: bookID, userID, and salesDate.

Objective: To find all the customers that have purchased the same books as any given customer, say customer1. For example, customer1 bought books bookA, bookB, and bookC; the goal is to find all of the other customers who also bought these books.

My First Attempt:

SELECT DISTINCT UH1.userID
FROM userHistory as UH1, userHistory as UH2, userHistory as UH3
WHERE UH1.userid = UH2.userID
AND UH2.userid = UH3.userID
AND UH1.bookID = ‘bookA’
AND UH2.bookID = ‘bookB’
AND UH3.bookID = ‘bookC’
AND UH1.userID <> ‘customer1’

The problem here is that it cannot handle any more/less than 3 books. The above should work well for 3 books, but what if there are 2 books or 12 books? Does anybody have any suggested queries to handle a variable amount of books (eg joins)?

Thank you kindly.
Adam

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-11-09 : 10:38:37
This should work
SELECT		c.UserID
FROM UserHistory AS o
LEFT JOIN UserHistory AS c ON c.BookID = o.BookID
AND c.UserID <> o.UserID
WHERE o.UserID = 'Customer1'
GROUP BY o.UserID,
c.UserID
HAVING COUNT(o.BookID) = COUNT(c.BookID)
In case your customers have bought same book twice or more, try this
SELECT		c.UserID
FROM UserHistory AS o
LEFT JOIN UserHistory AS c ON c.BookID = o.BookID
AND c.UserID <> o.UserID
WHERE o.UserID = 'Customer1'
GROUP BY o.UserID,
c.UserID
HAVING COUNT(DISTINCT o.BookID) = COUNT(DISTINCT c.BookID)

E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-11-09 : 10:41:02
The code above will get you the other customers who have bought at least all books that Customer1 has bought.
If you want a partial match, ie get all other customers who have at least bought the same x number of books as Customer1 has, try these
SELECT		c.UserID
FROM UserHistory AS o
LEFT JOIN UserHistory AS c ON c.BookID = o.BookID
AND c.UserID <> o.UserID
WHERE o.UserID = 'Customer1'
AND o.BookID IN ('BookA', 'BookB', 'BookC')
GROUP BY o.UserID,
c.UserID
HAVING COUNT(o.BookID) = COUNT(c.BookID)

SELECT c.UserID
FROM UserHistory AS o
LEFT JOIN UserHistory AS c ON c.BookID = o.BookID
AND c.UserID <> o.UserID
WHERE o.UserID = 'Customer1'
AND o.BookID IN ('BookA', 'BookB', 'BookC')
GROUP BY o.UserID,
c.UserID
HAVING COUNT(DISTINCT o.BookID) = COUNT(DISTINCT c.BookID)

E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-11-09 : 13:21:39
I guess this also was ASAP and I was too late...



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

sqlpupil
Starting Member

5 Posts

Posted - 2007-11-09 : 13:29:13
Peso -- thank you ever so much for your fast responses.
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-11-09 : 13:36:59
Thank you too.
What I want to know is if it was accurate.



E 12°55'05.25"
N 56°04'39.16"
Go to Top of Page

sqlpupil
Starting Member

5 Posts

Posted - 2007-11-09 : 18:40:42
I tried the first suggestion that you provided: it worked. Thanks again.
Go to Top of Page
   

- Advertisement -