Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Select with Group By

Author  Topic 

Piemur1
Starting Member

10 Posts

Posted - 2014-08-05 : 14:19:54
Hello, I have a little problem getting a query I'm trying to build working. I have a large database of user records and am trying to get a count of how many email addresses of potential duplicate records there are, broken down by the first letter of the email address. This is the query I am using to get the list of email addresses:

SELECT DISTINCT A.Email_Address
FROM User_x AS A
INNER JOIN User_x AS B
ON A.Email_Address = B.Email_Address AND A.Login_Name != B.Login_Name
WHERE A.Last_Name LIKE B.Last_Name
ORDER BY A.Email_Address

As you can see, I'm using an inner join to find any records that have the exact same email address, but is a different login name. Now, I want to build a query that will give me the counts of all duplicate email addresses broken down by first letter. So I built this query:

SELECT LEFT(A.Email_Address, 1), COUNT(LEFT(A.Email_Address, 1))
FROM User_x AS A
INNER JOIN User_x AS B
ON A.Email_Address = B.Email_Address AND A.Login_Name != B.Login_Name
WHERE A.Last_Name LIKE B.Last_Name
GROUP BY LEFT(A.Email_Address, 1)
ORDER BY LEFT(A.Email_Address, 1)

Except this query is flawed in that it is giving me counts of every record with those email addresses (as in every record that has a duplicate would give a count of at LEAST 2 per email address) and would give me an inaccurate number. I had to remove the DISTINCT clause, because it would either give me a syntax error or would give me counts of exactly 1 for every letter (dependant on where the DISTINCT clause was placed).

This query is something that I would LIKE to build, but am unable to.

SELECT LEFT(DISTINCT A.Email_Address, 1), COUNT(LEFT(DISTINCT A.Email_Address, 1))
FROM User_x AS A
INNER JOIN User_x AS B
ON A.Email_Address = B.Email_Address AND A.Login_Name != B.Login_Name
WHERE A.Last_Name LIKE B.Last_Name
GROUP BY LEFT(DISTINCT A.Email_Address, 1)
ORDER BY LEFT(DISTINCT A.Email_Address, 1)

Also please note this is a production database, so I will not be building any new tables.

Piemur1
Starting Member

10 Posts

Posted - 2014-08-05 : 14:24:58
I am trying to get a results list something like this:

A 6
B 7
C 11
D 657
E 4
F 2
G 3
H 1
I 2
J 1007
K 26
L 21
M 16
N 4
P 2
R 19
S 870
T 342
W 6
Y 2

(I had to cheat and use excel to actually get these results)
Go to Top of Page

ScottPletcher
Aged Yak Warrior

550 Posts

Posted - 2014-08-05 : 14:27:19
I believe you can replace your first query with the query below. Then use the second query to get the totals you want, adjusting it if/as needed.


SELECT Email_Address, Last_Name, COUNT(DISTINCT Login_Name) AS Total_Logins
FROM User_x
GROUP BY Email_Address, Last_Name
HAVING COUNT(DISTINCT Login_Name) > 1
ORDER BY Email_Address


SELECT LEFT(Email_Address, 1) AS Email_First_Letter, SUM(Total_Logins)AS Total_Logins
FROM (
SELECT Email_Address, Last_Name, COUNT(DISTINCT Login_Name) AS Total_Logins
FROM User_x
GROUP BY Email_Address, Last_Name
HAVING COUNT(DISTINCT Login_Name) > 1
) AS subquery1
GROUP BY LEFT(Email_Address, 1)
ORDER BY Email_First_Letter

Go to Top of Page

Piemur1
Starting Member

10 Posts

Posted - 2014-08-05 : 14:38:21
That works a bit better, but is still giving me counts of all the users with duplicate emails. I want to get just the total number of emails by itself, so I had to adjust your query a bit and am getting some record counts where the last name are not the same. There are some records that have the same email address but for different people (such as org boxes) and I do not want to include those in the count.

Thank you for your assistance, and I had not really considered pulling the query in a nest like that.
Go to Top of Page
   

- Advertisement -