 Need help on Group By


rajabadsha
Starting Member

4 Posts

Posted - 2007-12-07 : 11:15:39
I am having a dilemma.
I have the following table, Count_type, with these columns:

NUM QUAD_TYPE
2 SQUARE
2 SQUARE
2 RECTANGLE
2 TRAPEZOID
3 SQUARE
3 SQUARE
4 RECTANGLE
4 SQUARE

I want to write a query that outputs the count for each NUM/QUAD_TYPE combination. For example, the output for the above table would be as follows:

NUM QUAD_TYPE COUNT
2 SQUARE 2
2 RECTANGLE 1
2 TRAPEZOID 1
3 SQUARE 2
4 RECTANGLE 1
4 SQUARE 1

Any suggestions??
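
(A minimal setup sketch for anyone who wants to reproduce the sample data; the column types are assumptions, since the original table definition was not posted.)

CREATE TABLE Count_type (NUM INT, QUAD_TYPE VARCHAR(20))

INSERT INTO Count_type VALUES (2, 'SQUARE')
INSERT INTO Count_type VALUES (2, 'SQUARE')
INSERT INTO Count_type VALUES (2, 'RECTANGLE')
INSERT INTO Count_type VALUES (2, 'TRAPEZOID')
INSERT INTO Count_type VALUES (3, 'SQUARE')
INSERT INTO Count_type VALUES (3, 'SQUARE')
INSERT INTO Count_type VALUES (4, 'RECTANGLE')
INSERT INTO Count_type VALUES (4, 'SQUARE')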

nr
SQLTeam MVY

12543 Posts

Posted - 2007-12-07 : 11:18:34
-- correlated subquery version
select t1.*, (select count(*) from tbl t2 where t2.QUAD_TYPE = t1.QUAD_TYPE) as num
from tbl t1

-- derived table version
select t.*, t2.num
from tbl t
join (select QUAD_TYPE, num = count(*) from tbl group by QUAD_TYPE) t2
on t.QUAD_TYPE = t2.QUAD_TYPE


==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-12-07 : 11:22:23
SELECT Num, QuadType, COUNT(*) AS [Count]
FROM Count_Type
GROUP BY Num, QuadType



E 12°55'05.25"
N 56°04'39.16"

nr
SQLTeam MVY

12543 Posts

Posted - 2007-12-07 : 11:24:46
oops

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2007-12-07 : 11:32:03
and if you use SQL Server 2005

SELECT Num, QuadType, COUNT(*) over(partition by Num, QuadType) as [Count]
FROM Count_Type


Madhivanan

Failing to plan is Planning to fail

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-12-07 : 11:34:25
Madhi, throw in a DISTINCT as well to get the same result as the OP posted.
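
(Presumably something like this; the same query with DISTINCT added:)

SELECT DISTINCT Num, QuadType, COUNT(*) OVER (PARTITION BY Num, QuadType) AS [Count]
FROM Count_Type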



E 12°55'05.25"
N 56°04'39.16"

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2007-12-07 : 12:09:04
quote:
Originally posted by Peso

Madhi, throw in a DISTINCT also to get same result as OP posted.



E 12°55'05.25"
N 56°04'39.16"



Yes, I forgot about the duplicates. Thanks

Madhivanan

Failing to plan is Planning to fail

rajabadsha
Starting Member

4 Posts

Posted - 2007-12-07 : 12:58:46
Thanks guys. You have been a lot of help.

rajabadsha
Starting Member

4 Posts

Posted - 2007-12-07 : 13:28:14
The results are coming out correct, with one glitch:
some of the cells in the original table are empty, and COUNT is counting them as 1. Is there a way to exclude them?
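
(A possible sketch, not from the replies above: assuming the blank cells are in QUAD_TYPE and are stored as NULL or empty strings, filtering them out before the GROUP BY drops them from the output.)

SELECT NUM, QUAD_TYPE, COUNT(*) AS [Count]
FROM Count_type
WHERE QUAD_TYPE IS NOT NULL AND QUAD_TYPE <> ''
GROUP BY NUM, QUAD_TYPE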

raky
Aged Yak Warrior

767 Posts

Posted - 2008-02-07 : 05:00:22
Try this
select column_name, count(column_name) from table name group by column_name having column_name
I think you will get the right answer

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-02-07 : 05:12:53
No, we don't think so.
Besides the fact that the syntax is incorrect, where are you hiding the NUM column?




E 12°55'05.25"
N 56°04'39.16"

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2008-02-07 : 07:13:02
quote:
Originally posted by raky

Try this
select column_name, count(column_name) from table name group by column_name having column_name
I think you will get the right answer


No

Madhivanan

Failing to plan is Planning to fail

raky
Aged Yak Warrior

767 Posts

Posted - 2008-02-07 : 23:40:04
I think this is the solution to your problem.

CREATE TABLE #s( num INT, quad_type VARCHAR(10))
INSERT INTO #s VALUES(2,'square')
INSERT INTO #s VALUES(2,'square')
INSERT INTO #s VALUES(2,'rectangle')
INSERT INTO #s VALUES(2,'trapezoid')
INSERT INTO #s VALUES(3,'square')
INSERT INTO #s VALUES(3,'square')
INSERT INTO #s VALUES(4,'rectangle')
INSERT INTO #s VALUES(4,'square')

--select * from #s

SELECT
num,
quad_type,
COUNT(num) AS 'ncount'
FROM
#s
GROUP BY
num,quad_type
ORDER BY
num,ncount DESC,quad_type

madhivanan
Premature Yak Congratulator

22864 Posts

Posted - 2008-02-08 : 01:11:11
quote:
Originally posted by raky

I think this is the solution to your problem.

CREATE TABLE #s( num INT, quad_type VARCHAR(10))
INSERT INTO #s VALUES(2,'square')
INSERT INTO #s VALUES(2,'square')
INSERT INTO #s VALUES(2,'rectangle')
INSERT INTO #s VALUES(2,'trapezoid')
INSERT INTO #s VALUES(3,'square')
INSERT INTO #s VALUES(3,'square')
INSERT INTO #s VALUES(4,'rectangle')
INSERT INTO #s VALUES(4,'square')

--select * from #s

SELECT
num,
quad_type,
COUNT(num) AS 'ncount'
FROM
#s
GROUP BY
num,quad_type
ORDER BY
num,ncount DESC,quad_type



This is different from what you suggested earlier.

Madhivanan

Failing to plan is Planning to fail

Qualis
Posting Yak Master

145 Posts

Posted - 2008-02-08 : 14:54:33
OK, I was very curious about the COUNT(*) OVER performance, so I tested it. I queried the sysobjects and syscolumns tables in the master database to get a count of columns per table. Here are the results:

SELECT Distinct COUNT(*) Over (Partition by SO.name) as [Count],
SO.name As TableName
From sysobjects SO
Inner Join syscolumns SC On SO.id = SC.id
Where SO.xtype = 'U'


--===================================================
SQL Server parse and compile time:
CPU time = 47 ms, elapsed time = 204 ms.

(6 row(s) affected)

Table 'Worktable'. Scan count 3, logical reads 125, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'syscolrdb'. Scan count 1, logical reads 95, physical reads 0, read-ahead reads 107, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'syscolpars'. Scan count 1, logical reads 8, physical reads 1, read-ahead reads 32, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysobjrdb'. Scan count 1, logical reads 26, physical reads 1, read-ahead reads 58, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysschobjs'. Scan count 1, logical reads 13, physical reads 1, read-ahead reads 35, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 332 ms.

--===================================================
--===================================================
--===================================================

SELECT COUNT(*) cnt, SO.name
From sysobjects SO
Inner Join syscolumns SC On SO.id = SC.id
Where SO.xtype = 'U'
Group By SO.name


--===================================================
SQL Server parse and compile time:
CPU time = 63 ms, elapsed time = 204 ms.

(6 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'syscolrdb'. Scan count 1, logical reads 95, physical reads 0, read-ahead reads 107, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'syscolpars'. Scan count 1, logical reads 8, physical reads 1, read-ahead reads 32, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysobjrdb'. Scan count 1, logical reads 26, physical reads 1, read-ahead reads 58, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'sysschobjs'. Scan count 1, logical reads 13, physical reads 1, read-ahead reads 35, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 334 ms.


Notice that 'Worktable' is scanned 3 times in the COUNT(*) OVER example and 0 times in the GROUP BY example. Other than that, the performance was almost identical. These examples don't involve much data, so I decided to test a larger table. Using a table with 1,894,322 rows, I ran a similar grouping test. Here are the results:

SELECT Distinct COUNT(*) Over (Partition by AppDomainName) as [Count],
AppDomainName
From #Log

--===================================================
SQL Server parse and compile time:
CPU time = 16 ms, elapsed time = 41 ms.

(19 row(s) affected)

Table '#Log'. Scan count 3, logical reads 6792, physical reads 0, read-ahead reads 1, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 6, logical reads 2074881, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


SQL Server Execution Times:
CPU time = 17093 ms, elapsed time = 49063 ms.


--===================================================
--===================================================
--===================================================


Select AppDomainName, Count(*) cnt
From #Log
Group By AppDomainName

--===================================================
SQL Server parse and compile time:
CPU time = 1860 ms, elapsed time = 2253 ms.

(19 row(s) affected)

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#Log'. Scan count 3, logical reads 6792, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


SQL Server Execution Times:
CPU time = 1532 ms, elapsed time = 802 ms.


Note the "logical reads" On 'Worktable' for each solution. For my tests, I used:
SET STATISTICS IO ON
SET STATISTICS TIME ON

and ran:
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE

between each execution.

Clearly Count(*) Over performs far worse than regular old grouping when working with large datasets.
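
(For reference, a sketch of the harness described above; the #Log table and its contents are assumed and not shown in this thread.)

SET STATISTICS IO ON
SET STATISTICS TIME ON

DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE

-- COUNT(*) OVER version
SELECT DISTINCT COUNT(*) OVER (PARTITION BY AppDomainName) AS [Count], AppDomainName
FROM #Log

DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE

-- GROUP BY version
SELECT AppDomainName, COUNT(*) AS cnt
FROM #Log
GROUP BY AppDomainName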