Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 More UNION fun

Author  Topic 

gravyface
Starting Member

16 Posts

Posted - 2009-01-23 : 16:20:14
I've got a Contacts table with lang1, lang2, lang3, which are all varchars. For each distinct language in either lang1, lang2, or lang3, I want to count how many clients speak English, German, etc.

Here's what I've got so far:

SELECT
Languages, Totals = COUNT(*)
FROM
(
/* LANG 1 totals */
SELECT Lang1 as Languages
FROM Contacts
INNER JOIN Companies
ON Companies.id = Contacts.company_id
WHERE Contacts.type = 'CLIENT' AND Companies.created BETWEEN '01/01/2008' and '12/31/2008'
AND (Lang1 <> '')
UNION ALL

/* LANG 2 totals */
SELECT Lang2 as Languages
FROM Contacts
INNER JOIN Companies
ON Companies.id = Contacts.company_id
WHERE Contacts.type = 'CLIENT' AND Companies.created BETWEEN '01/01/2008' and '12/31/2008'
AND (Lang2 <> '')
UNION ALL

/* LANG 3 totals */
SELECT Lang3 as Languages
FROM Contacts
INNER JOIN Companies
ON Companies.id = Contacts.company_id
WHERE Contacts.type = 'CLIENT' AND (Companies.created BETWEEN '01/01/2008' and '12/31/2008')
AND (Lang3 <> '')
UNION ALL)
)t
GROUP BY
Languages
ORDER BY Languages ASC

The sum of all language totals is 131

Choosing a random sample (I chose "English"), I contructed another query to see if the numbers were right:

SELECT COUNT(*)
FROM Contacts
INNER JOIN Companies
ON Companies.id = Contacts.company_id
WHERE Contacts.type = 'CLIENT' AND (Companies.created BETWEEN '01/01/2008' and '12/31/2008')
AND (Lang1 = 'English' OR Lang2 = 'English' OR Lang3 = 'English')

Same total for 'English' (103) as produced by my UNION-based query; the sum of the totals was 131 as well.

However, the following returns 121 for the sum of the totals:

SELECT COUNT(*)
FROM Contacts
INNER JOIN Companies
ON Companies.id = Contacts.company_id
WHERE Contacts.type = 'CLIENT' AND (Companies.created BETWEEN '01/01/2008' and '12/31/2008')

which doesn't equal the same sum of language totals as the first two queries, which seems to me like something is not right: I assumed that the sum of the totals would be the same as the number of clients that meet the non-language criteria in the WHERE clause.

raky
Aged Yak Warrior

767 Posts

Posted - 2009-01-24 : 00:05:47


Replace union all with Union as Union all produces duplicates
Go to Top of Page

gravyface
Starting Member

16 Posts

Posted - 2009-01-24 : 08:34:56
quote:
Originally posted by raky



Replace union all with Union as Union all produces duplicates



That returns 1 for each total. I think I have to have UNION ALL in there because any/all of lang1, lang2, or lang3 could be empty.
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2009-01-24 : 10:05:16
The first number is "wrong". If both Lang1 and Lang2 is equal to ENGLISH, you will get a higher sum in first query.

SELECT SUM(CASE WHEN Lang1 = 'English' THEN 1 ELSE 0 END) AS Lang1,
SUM(CASE WHEN Lang2 = 'English' THEN 1 ELSE 0 END) AS Lang2,
SUM(CASE WHEN Lang3 = 'English' THEN 1 ELSE 0 END) AS Lang3,
FROM Contacts
INNER JOIN Companies
ON Companies.id = Contacts.company_id
WHERE Contacts.type = 'CLIENT' AND (Companies.created BETWEEN '01/01/2008' and '12/31/2008')



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

gravyface
Starting Member

16 Posts

Posted - 2009-01-24 : 10:26:13
That's what I originally suspected too but the following series of queries didn't return any duplicates across columns:

SELECT COUNT(*)
FROM Contacts
INNER JOIN Companies
ON Companies.id = Contacts.company_id
WHERE Contacts.type = 'CLIENT'
AND (Companies.created BETWEEN '01/01/2008' and '12/31/2008')
AND (Contacts.Lang1 = 'English' AND Contacts.Lang2 = 'English')

...
AND (Contacts.Lang1 = 'English' AND Contacts.Lang3 = 'English')


...
AND (Contacts.Lang2 = 'English' AND Contacts.Lang3 = 'English')
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2009-01-24 : 11:07:20
That query returns records where Lang1, Lang2 and Lang3 all are set to English.



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2009-01-24 : 11:09:54
SELECT *,
CASE
WHEN Lang1 = 'English' AND Lang2 = 'English' AND Lang3 = 'English' THEN 'Lang1, Lang2 and Lang3'
WHEN Lang1 = 'English' AND Lang2 = 'English' THEN 'Lang1 and Lang2'
WHEN Lang1 = 'English' AND Lang3 = 'English' THEN 'Lang1 and Lang3'
WHEN Lang2 = 'English' AND Lang3 = 'English' THEN 'Lang2 and Lang3'
WHEN Lang1 = 'English' THEN 'Lang1'
WHEN Lang2 = 'English' THEN 'Lang2'
WHEN Lang3 = 'English' THEN 'Lang3'
END AS [Settings]
FROM Contacts
INNER JOIN Companies
ON Companies.id = Contacts.company_id
WHERE Contacts.type = 'CLIENT' AND (Companies.created BETWEEN '01/01/2008' and '12/31/2008')


E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

gravyface
Starting Member

16 Posts

Posted - 2009-01-24 : 12:56:00
quote:
Originally posted by Peso

That query returns records where Lang1, Lang2 and Lang3 all are set to English.



No, those are three separate queries (... is just so I didn't have to type it all out again). Each returned 0 rows.
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2009-01-25 : 03:21:10
How should we know?



E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

gravyface
Starting Member

16 Posts

Posted - 2009-01-25 : 10:19:48
quote:
Originally posted by Peso

How should we know?



Any ideas why the numbers are off?
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2009-01-25 : 11:25:28
Yes.
I have describes why.


E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page
   

- Advertisement -