Interview question in ISOFT,Chennai

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2005 Forums

Transact-SQL (2005)

Interview question in ISOFT,Chennai

Author

Topic

ganeshkumar08
Posting Yak Master

187 Posts

Posted - 2008-10-19 : 03:12:06

Difference Between

Select Count(*) From EMP

Select Count(Column) From EMP

In the above 2 queries which one shows result quickly, Why??.

Thanks
Ganesh

Solutions are easy. Understanding the problem, now, that's the hard part

afrika
Master Smack Fu Yak Hacker

2706 Posts

Posted - 2008-10-19 : 03:18:40

count(*) counts all columns
count(column) counts only that column's existence

Guess both work at the same speed. Because its just implementing the count function

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-10-19 : 03:23:40

i've heard second one is quicker and better in terms of performance. as count(*) uses more resources it will perform slower compared to first.
Also please note that the results of both queries will be different if column you use inside the count is NULLable. count(column) returns only non NULL instances of column while count(*) returns total record count.

ganeshkumar08
Posting Yak Master

187 Posts

Posted - 2008-10-19 : 03:46:49

Thank you very much, its very informative

Solutions are easy. Understanding the problem, now, that's the hard part

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-10-19 : 04:08:48

quote:
Originally posted by ganeshkumar08

Thank you very much, its very informative

Solutions are easy. Understanding the problem, now, that's the hard part

welcome

afrika
Master Smack Fu Yak Hacker

2706 Posts

Posted - 2008-10-19 : 04:22:56

quote:
Originally posted by visakh16

i've heard second one is quicker and better in terms of performance. as count(*) uses more resources it will perform slower compared to first.
Also please note that the results of both queries will be different if column you use inside the count is NULLable. count(column) returns only non NULL instances of column while count(*) returns total record count.

Well I think we are both wrong

, I ran a few tests below. On my largest table. Has a few millions of entries.



declare @s datetime , @m datetime
set @s = GETDATE()
select COUNT(*) from [dbo].[sms_history]
set @m = GETDATE()

print @s
print @m
print DATEDIFF(nanosecond,@s,@m)


set @s = GETDATE()
select COUNT(message) from [dbo].[sms_history]
set @m = GETDATE()

print @s
print @m
print DATEDIFF(nanosecond,@s,@m)

And I got this back

(1 row(s) affected)
Oct 19 2008 9:19AM
Oct 19 2008 9:19AM
63000000

(1 row(s) affected)
Oct 19 2008 9:19AM
Oct 19 2008 9:19AM
93000000

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-10-19 : 04:43:07

i didnt understand how you got COUNT(message) more than COUNT(*). are you sure you didnt use any joins. Were you using same query you posted?

afrika
Master Smack Fu Yak Hacker

2706 Posts

Posted - 2008-10-19 : 04:56:12

Try for yourself, I tried over and over again. Same results. I tried exactly what i posted



declare @s datetime , @m datetime
set @s = GETDATE()
select COUNT(*) from [dbo].[sms_history]
set @m = GETDATE()

print @s
print @m
print DATEDIFF(nanosecond,@s,@m)


set @s = GETDATE()
select COUNT(message) from [dbo].[sms_history]
set @m = GETDATE()

print @s
print @m
print DATEDIFF(nanosecond,@s,@m)





set @s = GETDATE()
select COUNT(message) from [dbo].[sms_history]
set @m = GETDATE()

print @s
print @m
print DATEDIFF(nanosecond,@s,@m)


set @s = GETDATE()
select COUNT(*) from [dbo].[sms_history]
set @m = GETDATE()

print @s
print @m
print DATEDIFF(nanosecond,@s,@m)








(1 row(s) affected)
Oct 19 2008  9:54AM
Oct 19 2008  9:54AM
63000000

(1 row(s) affected)
Oct 19 2008  9:54AM
Oct 19 2008  9:54AM
93000000

(1 row(s) affected)
Oct 19 2008  9:54AM
Oct 19 2008  9:54AM
93000000

(1 row(s) affected)
Oct 19 2008  9:54AM
Oct 19 2008  9:54AM
46000000

afrika
Master Smack Fu Yak Hacker

2706 Posts

Posted - 2008-10-19 : 04:57:34

COUNT(*) is defintely faster.

Surprises me

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-10-19 : 04:57:53

i'm not near a SQL box so cant test it now. That seems a bit strange to me. Anyways will test it tomorrow and let you know the results.

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2008-10-19 : 05:04:45

If you post the execution plans it will immediately become clear why COUNT(*) is faster in this case.

afrika
Master Smack Fu Yak Hacker

2706 Posts

Posted - 2008-10-19 : 05:23:57

quote:
Originally posted by Arnold Fribble

If you post the execution plans it will immediately become clear why COUNT(*) is faster in this case.

Funny, I was looking at that. But it seems the same to me.

run the query on your table and see. But both utilize the same amount of resources. I guess, count(*) is faster, better.

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2008-10-19 : 10:13:14

When SQL runs a Count(*), it can scan the smallest index on the table, as all it has to do is count the number of rows. There are no conditions it needs to take into account.
Count(column1) is equivalent to Count(*) where column1 is not null, and hence SQL has to evaluate the values in that column as it's doing the count. If there's no index on that column, it will have to do a table scan.

--
Gail Shaw
SQL Server MVP

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-10-19 : 13:18:28

quote:
Originally posted by GilaMonster

When SQL runs a Count(*), it can scan the smallest index on the table, as all it has to do is count the number of rows. There are no conditions it needs to take into account.
Count(column1) is equivalent to Count(*) where column1 is not null, and hence SQL has to evaluate the values in that column as it's doing the count. If there's no index on that column, it will have to do a table scan.

--
Gail Shaw
SQL Server MVP

and any reasoning for count(column) to be greated than count(*)? its something that i never expected so would like to know if there are any conditions under which it might happen?

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2008-10-19 : 14:37:50

I don't think that was the count, it was the number of nanoseconds.

But I, like Gail, was assuming that the COUNT(*) was using the narrowest index on the table where the COUNT(message) of a NULLable column was having to use a wider index.

However, even when they're using the same index it seems there's a there's a significant difference. Obviously it has to check for NULLs with the COUNT(message), but I was expecting the I/O of fetching the data pages to dominate the timings. Hmm... I suppose there is less memory I/O for COUNT(*) since it only has to look at the row count in the page headers (possibly subtract the ghost rows too), and not have to look at the NULL bitmap of each row separately.

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2008-10-19 : 15:27:03

quote:
Originally posted by visakh16
and any reasoning for count(column) to be greated than count(*)? its something that i never expected so would like to know if there are any conditions under which it might happen?

Shouldn't be any case where that can happen. I'd start looking for bugs or corruption if I ever saw that.

What I think africa posted was the result of "print DATEDIFF(nanosecond,@s,@m)", not the rowcount. That looks like the output of the messages pane.

Here's another example (adventureworks). I've added data to the table in question, so other people's results will be slightly different.


set statistics time on
go
set nocount on

select COUNT(*) as CountAll from  Sales.SalesOrderDetail

select COUNT(CarrierTrackingNumber) AS CountColumn from  Sales.SalesOrderDetail

Results:

CountAll
-----------
1941072

SQL Server Execution Times:
CPU time = 187 ms, elapsed time = 189 ms.

CountColumn
-----------
974704

Warning: Null value is eliminated by an aggregate or other SET operation.

SQL Server Execution Times:
CPU time = 422 ms, elapsed time = 449 ms.

--
Gail Shaw
SQL Server MVP

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-10-20 : 02:14:19

quote:
Originally posted by GilaMonster

quote:
Originally posted by visakh16
and any reasoning for count(column) to be greated than count(*)? its something that i never expected so would like to know if there are any conditions under which it might happen?

Shouldn't be any case where that can happen. I'd start looking for bugs or corruption if I ever saw that.

What I think africa posted was the result of "print DATEDIFF(nanosecond,@s,@m)", not the rowcount. That looks like the output of the messages pane.

Here's another example (adventureworks). I've added data to the table in question, so other people's results will be slightly different.

set statistics time on
go
set nocount on

select COUNT(*) as CountAll from Sales.SalesOrderDetail

select COUNT(CarrierTrackingNumber) AS CountColumn from Sales.SalesOrderDetail

Results:

CountAll
-----------
1941072

SQL Server Execution Times:
CPU time = 187 ms, elapsed time = 189 ms.

CountColumn
-----------
974704

Warning: Null value is eliminated by an aggregate or other SET operation.

SQL Server Execution Times:
CPU time = 422 ms, elapsed time = 449 ms.

--
Gail Shaw
SQL Server MVP

Ok. I see. Thanks Gail for explanation.

darkdusky
Aged Yak Warrior

591 Posts

Posted - 2008-10-20 : 05:27:30

I was also surprised when I tested this. But the fastest count is still count(column) when the column is a Primary Key - which isn't really surprising. The next fastest is the count(*), then count (column) when column is not a Primary Key.

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2008-10-20 : 07:07:17

quote:
Originally posted by darkdusky

I was also surprised when I tested this. But the fastest count is still count(column) when the column is a Primary Key - which isn't really surprising. The next fastest is the count(*), then count (column) when column is not a Primary Key.

set statistics time on
go
set nocount on

select COUNT(*) as CountAll from  Production.TransactionHistory

select COUNT(TransactionID) AS CountPK from  Production.TransactionHistory

CountAll
-----------
1933198

SQL Server Execution Times:
CPU time = 172 ms, elapsed time = 179 ms.

CountPK
-----------
1933198

SQL Server Execution Times:
CPU time = 172 ms, elapsed time = 174 ms.

The times bounced around. Sometimes the 1st was faster, sometimes the 2nd. Difference was never more than 20ms, which can be accounted for by other apps also using the processor.

--
Gail Shaw
SQL Server MVP

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2008-10-20 : 07:20:37

quote:
Originally posted by darkdusky

I was also surprised when I tested this. But the fastest count is still count(column) when the column is a Primary Key - which isn't really surprising. The next fastest is the count(*), then count (column) when column is not a Primary Key.

No. COUNT(column) for a NOT NULL column and COUNT(*) always generate the same execution plan, right down to both of them containing "Count(*)" in the aggregation operation.

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources