
 Calculate Row Count large table


Wang
Starting Member

48 Posts

Posted - 2007-03-28 : 08:36:53
Hi
I have a pretty large table. I discovered a while back that running a COUNT(*) against it had, let's say, a detrimental effect on system stability.

Anyway, I whipped up a small proc to go over each database and table every day, logging the row count (sysindexes.rows) and the GB used. This was great until a few days back, when it appears the count grew past 2,147,483,647.
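The gist of it was roughly this (a simplified sketch rather than the real proc - RowCountLog is just a stand-in name):

-- simplified daily logger: one row per user table per database
CREATE TABLE RowCountLog (
    log_date   datetime NOT NULL,
    db_name    sysname  NOT NULL,
    table_name sysname  NOT NULL,
    row_count  int      NOT NULL,   -- sysindexes.rows is an INT, hence the cap
    data_mb    int      NOT NULL    -- data pages only, converted to MB
)
GO
EXEC master..sp_MSforeachdb '
    INSERT RowCountLog (log_date, db_name, table_name, row_count, data_mb)
    SELECT GETDATE(), ''?'', o.name, i.rows, (i.dpages * 8) / 1024
    FROM [?]..sysindexes i
    JOIN [?]..sysobjects o ON o.id = i.id
    WHERE i.indid < 2 AND o.xtype = ''U'''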

sp_spaceused, EM and sysindexes.rows now all top out at 2,147,483,647. COUNT(*) is, as I mentioned, a bit difficult.

Prior to exceeding this INT limit we were happily adding 30 million rows a day to the table, and AFAIK still are...

Is there any way you know of to get the row count another way? Or perhaps an MS hotfix to use BIGINTs or something? I don't really fancy modding the system tables/procs myself, as there may be unforeseen consequences in doing so!

Cheers all

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-03-28 : 08:39:06
What about the COUNT_BIG function?


Peter Larsson
Helsingborg, Sweden

Wang
Starting Member

48 Posts

Posted - 2007-03-28 : 10:09:13
You mean something like select count_big(*) from mytable?

As I said, I experienced quite embarrassingly degraded performance the last time I tried a COUNT(*) on it... :(

Wang
Starting Member

48 Posts

Posted - 2007-03-28 : 10:18:32
Ok, sorted. Now using the rowcnt column instead of rows.
Seems ok.
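i.e. something along these lines:

-- sysindexes.rowcnt is a BIGINT, unlike sysindexes.rows which is an INT
SELECT o.name, i.rowcnt
FROM sysindexes i
JOIN sysobjects o ON o.id = i.id
WHERE i.indid < 2      -- heap (0) or clustered index (1) carries the table row count
  AND o.xtype = 'U'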

Ta.

harsh_athalye
Master Smack Fu Yak Hacker

5581 Posts

Posted - 2007-03-28 : 10:18:42
I think this is the time you should seriously think about partitioning this large table into multiple smaller ones.
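On SQL 2000 that generally means a partitioned view over member tables split on a range column - very roughly something like this (all names and ranges made up):

-- each member table holds one non-overlapping range, enforced by a CHECK constraint
CREATE TABLE Orders_2007_Q1 (
    order_id   bigint   NOT NULL,
    order_date datetime NOT NULL CHECK (order_date >= '20070101' AND order_date < '20070401'),
    CONSTRAINT PK_Orders_2007_Q1 PRIMARY KEY (order_id, order_date)
)
CREATE TABLE Orders_2007_Q2 (
    order_id   bigint   NOT NULL,
    order_date datetime NOT NULL CHECK (order_date >= '20070401' AND order_date < '20070701'),
    CONSTRAINT PK_Orders_2007_Q2 PRIMARY KEY (order_id, order_date)
)
GO
-- queries and inserts go through the view; the optimizer uses the CHECK
-- constraints to skip member tables it can rule out
CREATE VIEW Orders AS
SELECT order_id, order_date FROM Orders_2007_Q1
UNION ALL
SELECT order_id, order_date FROM Orders_2007_Q2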

Harsh Athalye
India.
"The IMPOSSIBLE is often UNTRIED"

AndrewMurphy
Master Smack Fu Yak Hacker

2916 Posts

Posted - 2007-03-28 : 13:06:54
What's the reason behind needing to know "exactly" how many rows are in the table?
Does it make a material difference to your business/chargeback/internal planning process to know that you have "exactly x" records (where x is a very large number), rather than "about x" records? That x could come from a reasonable baseline figure logged less often than at present, or from a "space calculation formula" that is accurate enough for the process and can be derived without affecting performance, i.e. by calculating disk space / average row size.

Rather than solving the problem head-on... maybe exact accuracy isn't important in the first place, and other methods would get you close enough to the answer while being less invasive!
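For example, something crude like this would get you in the right ballpark (the table name and the 150-byte average row size are guesses you'd replace with your own figures):

-- crude row estimate from data pages: roughly 8096 usable bytes per 8 KB page,
-- divided by an assumed average row size of 150 bytes
SELECT CAST(i.dpages AS bigint) * 8096 / 150 AS approx_rows
FROM sysindexes i
WHERE i.id = OBJECT_ID('MyBigTable')   -- made-up table name
  AND i.indid < 2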

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-03-28 : 13:58:32
Just think about the size of that table. The INT IDENTITY column alone is more than 8 GB in size!
And that's before you add the indexes...
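(A quick back-of-the-envelope for scale: 2,147,483,647 rows x 4 bytes per INT comes to roughly 8.6 GB for that one column, ignoring any overhead.)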



Peter Larsson
Helsingborg, Sweden

Wang
Starting Member

48 Posts

Posted - 2007-03-29 : 07:24:06
Yeah, it's pushing 400 GB.
Funnily enough, speaking of partitioning, today I am planning to start rolling out a federation to handle it better.
As for needing 'accuracy', I don't. I am quite happy with a daily count plus or minus a million or so. That lets me monitor the growth pattern effectively and determine whether it is simple linear growth or exponential. The difference being a headache or heartache...

But seriously, if the exponential growth phase is over and linear growth at 200 GB a month is in place, then the plan for the next few months is dramatically different from the plan to handle continued exponential growth. Doubling every month from a gig a month is comfy. Doubling for 6 months from a starting throughput of 200 GB a month is painful.
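(Rough numbers: six months of doubling from 200 GB in the first month is 200 + 400 + 800 + 1,600 + 3,200 + 6,400 GB - call it 12.6 TB over the half year - versus about 1.2 TB if it stays flat at 200 GB a month.)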

So yes, it's pretty important that I know how much we put in every day...

AndrewMurphy
Master Smack Fu Yak Hacker

2916 Posts

Posted - 2007-03-29 : 07:37:50
"So yes, its pretty important that I know how much we put in every day..."...can you not get this info in GB from looking at the diskspace of the database on the lan....and ignore (or infer by working backwards) the number of records causing it?

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-03-29 : 07:44:20
Try this! It returns BIGINT!
SELECT SUM(s.row_count) AS RowsInTable
FROM sys.dm_db_partition_stats AS s
WHERE s.object_id = OBJECT_ID('Table1')
  AND s.index_id < 2
Oops. Sorry. It is for SQL 2005.


Peter Larsson
Helsingborg, Sweden

Wang
Starting Member

48 Posts

Posted - 2007-03-29 : 10:25:35
"Can you not get this info in GB by looking at the disk space the database occupies on the LAN... and ignore (or infer by working backwards) the number of records causing it?"

Not really - the fragmentation could be quite extensive, and the usage of the varchars may change dramatically over time. As it stands I use both the row count and the size, all currently derived from sysindexes. As I say, the problem is resolved by using sysindexes.rowcnt rather than sysindexes.rows.

Ta to all.

jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2007-03-29 : 10:35:27
Just an fyi: if you ever need to do any aggregate functions on this table, you might want to have a look here:
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=75175

I got burned taking an AVG of a tinyint column on a table like this. Not quite as big as this one, but big enough to cause trouble.
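The gist of the workaround, if memory serves (the column and table names here are made up):

-- AVG/SUM over a tinyint or int column accumulates internally as INT, so on a
-- table with billions of rows the running total overflows; casting up first avoids it
SELECT AVG(CAST(some_tinyint_col AS bigint)) AS avg_value
FROM MyBigTable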



www.elsasoft.org

Wang
Starting Member

48 Posts

Posted - 2007-03-29 : 10:52:22
Hehe yes, I had a similar(ish) experience with checksums... trying to get checksums of objects from the text in syscomments so I could easily establish whether bobs.proc was the same as jills.proc.
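Something in this vein, roughly reconstructed from memory (so treat it as a sketch):

-- checksum each object's definition text; syscomments stores the text in
-- up to 4000-character chunks, hence the aggregate over the chunks
SELECT OBJECT_NAME(c.id) AS object_name,
       CHECKSUM_AGG(BINARY_CHECKSUM(CONVERT(varchar(8000), c.text))) AS def_checksum
FROM syscomments c
GROUP BY OBJECT_NAME(c.id)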

The other interesting thing was the accuracy of deep decimals. Say you want to measure something quite accurately - to 17 decimal places, say - and then multiply or divide it by another 17dp value. Suddenly you lose a bunch of decimal places without noticing. Again, this is explained in BOL - but only if you are looking for it, and can understand what it means!
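A contrived example of the sort of thing (the numbers are made up):

-- both operands are decimal(38,17); the product/quotient would need more than
-- 38 digits of precision, so the engine caps it at 38 and quietly trims the
-- scale - those are the decimal places that go missing
DECLARE @a decimal(38,17), @b decimal(38,17)
SELECT @a = 1.23456789012345678, @b = 9.87654321098765432
SELECT @a * @b AS product, @a / @b AS quotient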

As it happens I can't do any aggregate functions on it, as it tends to knock the system over if I do. :s

   
