Wang
Starting Member
48 Posts |
Posted - 2007-03-28 : 08:36:53
|
Hi. I have a pretty large table. I discovered a while back that running a COUNT(*) on it had, let's say, a detrimental effect on system stability.

Anyway, I whipped up a small proc to go over each database and table every day, logging the rowcount (sysindexes.rows) and GB used. This was great until a few days back, when it appears the table grew past 2,147,483,647 rows. sp_spaceused, Enterprise Manager and sysindexes.rows now all top out at 2,147,483,647, and COUNT(*) is, as I mentioned, a bit difficult. Before hitting this INT limit we were happily adding 30m rows a day to the table, and as far as I know we still are...

Is there any way you know of to get the rowcount another way? Or perhaps an MS hotfix to use BIGINTs or something? I don't really fancy modding the system tables/procs myself, as there may be unforeseen consequences in doing so!

Cheers all
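For context, here is a minimal sketch (not the original proc) of the sort of per-table check described above, reading sysindexes directly on SQL 2000; the sysindexes.rows column used here is the INT value that tops out at 2,147,483,647:

-- Rough sketch only: rowcount and reserved space per user table, from sysindexes metadata (SQL 2000).
-- indid 0/1 carry the table's row count; indid 0/1/255 carry its reserved pages (8 KB each).
-- sysindexes.rows is an INT, which is why it tops out at 2,147,483,647.
SELECT  OBJECT_NAME(i.id) AS TableName,
        SUM(CASE WHEN i.indid IN (0, 1) THEN i.rows ELSE 0 END) AS TotalRows,
        SUM(CASE WHEN i.indid IN (0, 1, 255) THEN i.reserved ELSE 0 END) * 8.0 / 1048576 AS ReservedGB
FROM    sysindexes i
WHERE   OBJECTPROPERTY(i.id, 'IsUserTable') = 1
GROUP BY i.id
ORDER BY ReservedGB DESC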
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-03-28 : 08:39:06
|
What about the COUNT_BIG function?

Peter Larsson
Helsingborg, Sweden
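For anyone who hasn't met it: COUNT_BIG behaves exactly like COUNT but returns a BIGINT, so it doesn't hit the 2,147,483,647 ceiling. It still has to scan an index to count, though, so it is no cheaper to run than COUNT(*). A trivial example (the table name is a placeholder):

-- COUNT_BIG returns BIGINT, so it can report more than 2,147,483,647 rows.
-- It scans just like COUNT(*), so expect the same performance impact.
SELECT COUNT_BIG(*) AS TotalRows
FROM dbo.MyTable    -- placeholder table name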
 |
|
Wang
Starting Member
48 Posts |
Posted - 2007-03-28 : 10:09:13
|
You mean something like SELECT COUNT_BIG(*) FROM mytable? As I say, I experienced quite embarrassingly degraded performance the last time I tried a COUNT(*) on it... :(
 |
|
Wang
Starting Member
48 Posts |
Posted - 2007-03-28 : 10:18:32
|
OK, sorted. Now using the rowcnt column instead of rows. Seems OK. Ta.
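For anyone hitting the same ceiling, a sketch of that fix; sysindexes.rowcnt is a BIGINT, unlike the INT rows column, and like rows it is a metadata value rather than a live count (the table name below is a placeholder):

-- sysindexes.rowcnt is BIGINT, so it keeps counting past 2,147,483,647.
-- indid 0 = heap, indid 1 = clustered index; a table has one or the other.
SELECT  i.rowcnt AS ApproxRows
FROM    sysindexes i
WHERE   i.id = OBJECT_ID('dbo.MyTable')   -- placeholder table name
        AND i.indid IN (0, 1)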
 |
|
harsh_athalye
Master Smack Fu Yak Hacker
5581 Posts |
Posted - 2007-03-28 : 10:18:42
|
I think this is the time you should seriously think about partitioning this large table into multiple smaller ones.

Harsh Athalye
India.
"The IMPOSSIBLE is often UNTRIED"
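On SQL 2000 that generally means a partitioned view over several member tables, with CHECK constraints on the partitioning column so the optimizer can skip irrelevant members. A very rough sketch of the idea; every name, column and date range here is made up:

-- Rough partitioned-view sketch (SQL 2000 style); all names and ranges are hypothetical.
-- Each member table holds one month, enforced by a CHECK constraint on the partitioning column.
CREATE TABLE dbo.BigTable_200702
(
    RowId    BIGINT       NOT NULL,
    LoadDate DATETIME     NOT NULL CHECK (LoadDate >= '20070201' AND LoadDate < '20070301'),
    Payload  VARCHAR(100) NULL,
    CONSTRAINT PK_BigTable_200702 PRIMARY KEY (RowId, LoadDate)
)

CREATE TABLE dbo.BigTable_200703
(
    RowId    BIGINT       NOT NULL,
    LoadDate DATETIME     NOT NULL CHECK (LoadDate >= '20070301' AND LoadDate < '20070401'),
    Payload  VARCHAR(100) NULL,
    CONSTRAINT PK_BigTable_200703 PRIMARY KEY (RowId, LoadDate)
)
GO

-- The view presents the members as one table; with trusted, non-overlapping CHECK constraints
-- the optimizer eliminates members it doesn't need, and (subject to the updatable
-- partitioned-view rules) inserts through the view route to the correct member table.
CREATE VIEW dbo.BigTable
AS
SELECT RowId, LoadDate, Payload FROM dbo.BigTable_200702
UNION ALL
SELECT RowId, LoadDate, Payload FROM dbo.BigTable_200703
GO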
 |
|
AndrewMurphy
Master Smack Fu Yak Hacker
2916 Posts |
Posted - 2007-03-28 : 13:06:54
|
What's the reason behind needing to know the count of the table "exactly"?

Does it make a material difference to your business/chargeback/internal planning process to know that you have "exactly x" records (where x is a very large number), rather than "about x" records? That x could be derived from a reasonable baseline figure logged less often than at present, or from a "space calculation formula" that would be reasonably accurate given the needs of the process and could be worked out without affecting performance, i.e. by dividing disk space by the average row size.

Rather than solving the problem head-on, maybe it's not important to be accurate in the first place, and other methods will be just as valuable at getting close to the answer while being less invasive!
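For what it's worth, that "disk space / average row size" estimate can be done from metadata alone on SQL 2000, without touching the table itself; the table name is a placeholder and the average row size is a figure you would have to supply yourself (e.g. measured once with DBCC SHOWCONTIG), so treat the result as a ballpark only:

-- Very rough row estimate: data pages x usable bytes per page / assumed average row size.
-- dpages comes from sysindexes metadata, so nothing is scanned; roughly 8060 bytes of
-- each 8 KB page are available for rows. Fragmentation and fill factor are ignored.
DECLARE @AvgRowBytes INT
SET @AvgRowBytes = 120   -- placeholder: assumed average row size in bytes

SELECT  CAST(i.dpages AS BIGINT) * 8060 / @AvgRowBytes AS EstimatedRows
FROM    sysindexes i
WHERE   i.id = OBJECT_ID('dbo.MyTable')   -- placeholder table name
        AND i.indid IN (0, 1)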
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-03-28 : 13:58:32
|
Just think about the size of that table. The INT IDENTITY column alone is more than 8 GB in size! And then add the indexes on top of that...

Peter Larsson
Helsingborg, Sweden
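Spelling out the arithmetic behind that figure (4 bytes per INT value across the full positive INT range):

-- 2,147,483,647 rows x 4 bytes per INT value = 8,589,934,588 bytes, i.e. roughly 8 GB
-- before row overhead, other columns or indexes are counted.
SELECT CAST(2147483647 AS BIGINT) * 4 AS IdentityColumnBytes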
 |
|
Wang
Starting Member
48 Posts |
Posted - 2007-03-29 : 07:24:06
|
Yeah, it's pushing 400 GB.

Funnily enough, on the partitioning point: today I am planning to start rolling out a federation to handle it better.

As for needing 'accuracy', I don't. I am quite happy with a daily count +/- a million or so. That lets me monitor the growth pattern effectively and determine whether it is simple linear growth or exponential, the difference being a headache or heartache... But seriously, if the exponential growth phase is over and linear growth at 200 GB a month is in place, then the plan for the next few months is dramatically different from the plan to handle continued exponential growth. Doubling every month from a gig a month is comfy; doubling for 6 months from a starting throughput of 200 GB a month is painful.

So yes, it's pretty important that I know how much we put in every day...
 |
|
AndrewMurphy
Master Smack Fu Yak Hacker
2916 Posts |
Posted - 2007-03-29 : 07:37:50
|
"So yes, its pretty important that I know how much we put in every day..."...can you not get this info in GB from looking at the diskspace of the database on the lan....and ignore (or infer by working backwards) the number of records causing it? |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2007-03-29 : 07:44:20
|
Try this! It returns BIGINT!

SELECT SUM(s.Row_Count) AS RowsInTable
FROM sys.dm_db_partition_stats AS s
WHERE s.OBJECT_ID = OBJECT_ID('Table1')
    AND s.Index_ID < 2

Oops. Sorry. It is for SQL 2005.

Peter Larsson
Helsingborg, Sweden
 |
|
Wang
Starting Member
48 Posts |
Posted - 2007-03-29 : 10:25:35
|
"can you not get this info in GB by looking at the disk space of the database on the LAN, and ignore (or infer by working backwards) the number of records causing it?"

Not really: the fragmentation could be quite extensive, and the usage of the varchars may change dramatically over time. As it stands I use both the rowcount and the size, all currently derived from sysindexes. As I say, the problem is resolved by using sysindexes.rowcnt rather than sysindexes.rows.

Ta to all.
 |
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2007-03-29 : 10:35:27
|
Just an FYI: if you ever need to run any aggregate functions on this table, you might want to have a look here:

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=75175

I got burned taking an AVG of a tinyint column on a table like this. Not quite as big as this one, but big enough to cause trouble.

www.elsasoft.org
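What typically bites there is that SUM and AVG over a tinyint, smallint or int column accumulate in an INT, so on a table with billions of rows the running total overflows; casting the column to BIGINT first is the usual workaround. The table and column names below are placeholders:

-- SUM/AVG of tinyint, smallint or int columns return (and accumulate as) INT,
-- which raises an arithmetic overflow once the total passes 2,147,483,647.
-- Casting the column to BIGINT makes the aggregate accumulate as BIGINT instead.
SELECT  AVG(CAST(SomeTinyIntCol AS BIGINT)) AS AvgValue,   -- placeholder column name
        SUM(CAST(SomeTinyIntCol AS BIGINT)) AS SumValue
FROM    dbo.MyTable                                        -- placeholder table name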
 |
|
Wang
Starting Member
48 Posts |
Posted - 2007-03-29 : 10:52:22
|
Hehe, yes. I had a similar(ish) experience with checksums: trying to get checksums of objects from the text in syscomments so I could easily establish whether bobs.proc was the same as jills.proc.

The other interesting thing was the accuracy of deep decimals. Say you want to measure something quite accurately, to 17 decimal places, and then multiply or divide it by another 17dp value: suddenly you lose a bunch of decimal places without noticing. Again, this is explained in BOL, but only if you are looking for it and can understand what it means!

As it happens I can't run any aggregate functions on it anyway, as doing so tends to knock the system over. :s
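On the decimal point, that loss comes from the documented precision/scale rules: multiplying two DECIMAL(38,17) values would need precision 77, so the result is capped at precision 38 and the scale is cut back (to 6 on the servers I've seen), silently dropping decimal places. A small illustration with made-up values:

-- Illustration of silent scale loss (the values are made up).
-- decimal(38,17) * decimal(38,17) exceeds the maximum precision of 38,
-- so SQL Server reduces the result's scale and decimal places disappear quietly.
DECLARE @a DECIMAL(38,17), @b DECIMAL(38,17)
SET @a = 1.12345678901234567
SET @b = 1.12345678901234567

SELECT  @a * @b                                 AS Product,     -- far fewer than 17 decimal places survive
        SQL_VARIANT_PROPERTY(@a * @b, 'Scale')  AS ResultScale  -- shows the reduced scale of the result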
 |
|
|