Author |
Topic |
shifty1981
Starting Member
20 Posts |
Posted - 2011-01-07 : 09:45:46
|
I have a sneaky suspicion that our disk stats are poor because of excessive paging due to a lack of enough memory. Below are some stats. Is there a way to prove that paging activity is the cause of excessive disk activity and if so, how do I determine how much more ram to buy. We're running SQL 2008 R2 x64 Enterprise on Windows Server 2008 R2 x64. We have 64 GB ram and are a high I/O OLTP database environment. The previous admin gave 32GB of the RAM to SQL and left 32GB to the OS and others presumably to help with the handling of all the customer files they send us (we backup and restore customer files all day long)The only reason I'm not convinced already is because the disk (RAID1 set) that has the pagefile and OS on it has decent stats. It's the RAID sets with the DB on them that perform poorly. Maybe there's another stat I should be looking at to help determine this?Average disk sec/TransP drive (RAID10 - 14 drives) 0.014G drive (RAID10 - 14 drives) 0.014K drive (where DB is, RAID10 - 14 drives) 0.041F drive (where tempdb, tlog are, RAID10 - 4 drives) 0.007C drive (OS, pagefile, RAID1) 0.005Average Disk Queue LengthP drive (RAID10 - 14 drives) 0.68G drive (RAID10 - 14 drives) 6.02K drive (where DB is, RAID10 - 14 drives) 71.22F drive (where tempdb, tlog are, RAID10 - 4 drives) 2.57C drive (OS, pagefile, RAID1) 0.04% DiskTime P drive 67.6G drive (RAID10 - 14 drives) 596.0K drive (where DB is, RAID10 - 14 drives) 7358.7F drive (where tempdb, tlog are, RAID10 - 4 drives) 261.0C drive (OS, pagefile, RAID1) 5.6Other statsCheckpoint pages/sec 1246.618 Page File Usage 0.739 Page life expectancy 429.733 Page reads/sec (SQL:BM) 1672.008 Page Splits/sec 65.997 Page writes/sec (SQL:BM) 1371.892 Pages/sec 534.441 Lazy writes/sec 27.735Target Server Memory (KB) 33554432Total Server Memory (KB) 33554432Total Available Memory (MBytes) 13166.208Virtual Memory 58881281741Full scans 77.165Batch Requests/sec 643.659Latch waits 1238.704 |
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
shifty1981
Starting Member
20 Posts |
Posted - 2011-01-07 : 11:37:20
|
quote: Steve Jones - SSC Editor (1/7/2011) I might check on a few things here.1. Check the buffer cache hit ratio, and see if it is high. Below 90%, I'd definitely be thinking memory.2. Check page life expectency. If it is less than minutes, pages are being tossed out of the buffer to make room for new pages.A few links- [url]http://msmvps.com/blogs/bradley/archive/2009/01/15/how-to-troubleshoot-sql-server-memory-related-issues-part-1.aspx[/url]- [url]http://sqlblog.com/blogs/lara_rubbelke/archive/2008/04/18/memory-pressure-on-64-bit-sql-server-2005.aspx[/url]- [url]http://www.sqlservercentral.com/articles/Performance+Tuning/analyzingmemoryrequirementsforsqlserver/2498/[/url]
I'm not sure i understand. Most of our servers like this average between 1000 and 1500 page life expectancy. Some as low as 150 and some in the 3000-4000 range. We are a very high IO environment with lots of small bits of data. I have to explain in technical detail to my boss why it's worth spending thousands on more RAM. The BCH ratio is pretty good: (98% on average)below are all stats we're collecting besides disk statsBatch Requests/sec 598.562Buffer cache hit ratio 98.967Checkpoint pages/sec 1627.684Create-Open File Failure Events 32.614Current Connections Counter 5.425DeletedVersions Row Count 19871.55FileInfo Row Count 76621872.08FOREIGN KEY Events 0.258Full scans 108.559Latch waits 1343.628Lazy writes/sec 14.55LOCK Errors 1.381Page File Usage 0.72Page life expectancy 404.392Page reads/sec (SQL:BM) 2074.087Page Splits/sec 84.822Page writes/sec (SQL:BM) 1572.879Pages/sec 512.189Target Server Memory (KB) 33554432Total Available Memory (MBytes) 13456.167Total Server Memory (KB) 33554432User Connections 638.817VersionFileInfo Row Count 2770363000VersionFileLocalName Row Count 319386266.6Virtual Memory 58524989850The previous admin took the 64GB ram and gave 32 to SQL and the rest is for apps and data, etc. I need to determine if the rest really needs 32 or if we can give it less. Any idea how to determine that? is availableMBytes showing what of the 32 is left since sql is taking all of it's 32? |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
shifty1981
Starting Member
20 Posts |
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
russell
Pyro-ma-ni-yak
5072 Posts |
Posted - 2011-01-07 : 15:07:17
|
And make sure lock pages in memory is granted to the sql service account. |
|
|
russell
Pyro-ma-ni-yak
5072 Posts |
Posted - 2011-01-07 : 15:11:51
|
Also, why in the world is firefox running on your sql server?!And that many instances of SSMS? Not a good idea to use a production server as a workstation.Regarding PLE -- you say as low as 150...that is an absolute certainty that you have memory pressure if it's an OLTP system.Like Tara, ours are consistently much much higher. Even 1500 is a bad sign. |
|
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2011-01-07 : 15:48:29
|
Have you considered that you might have poor performing disk arrays?Have you done any disk performance testing with SQLIO to benchmark the disk performance? I alway benchmark the server disk IO for each drive before putting the server into production. I never assume that good sounding specs for the disk array lead to good performance; I have seen too many exceptions to belive anything except actual performance testing.CODO ERGO SUM |
|
|
shifty1981
Starting Member
20 Posts |
Posted - 2011-01-20 : 22:16:23
|
Hi Everyone, sorry for the delayed response. Was a way for a while and then busy. We were able to give it 32GB memory more. Below are some of the stats it now has: This is just for last night during peak time (7pm to 5am): Average Latch Wait Time (ms) 51.227 Average Wait Time (ms) 4567.500 Batch Requests/sec 499.314 BLBD - DB total (KB) 1560882176.000 BLBD - Log File (KB) 32506872.000 BLBD - Log File Used (KB) 348793.575 BLBD - Log Growths 31.000 BLBD - Percent Log Used 0.533 BLBD - Transactions/sec 145.687 Buffer cache hit ratio 99.119 Checkpoint pages/sec 697.022 Full scans 40.478 Latch waits 1540.848 Lazy writes/sec 10.001 Lock Blocks 93386.325 Lock Blocks Allocated 18677647.358 Lock Memory (KB) 13295413.733 Lock Requests/sec 1703278.310 Lock Timeouts/sec 50.797 Lock Wait Time (ms) 15231.991 Lock Waits/sec 3.375 Number of Deadlocks/sec 0.000 Page File Size - F 103068819456.000 Page File Usage % 0.258 Page life expectancy 211.008 Page Reads/sec 528.671 Page reads/sec (SQL:BM) 6062.149 Page Splits/sec 56.353 Page Writes/sec 0.000 Page writes/sec (SQL:BM) 768.543 Pages Input/sec 535.825 Pages/sec 535.825 SQL Cache Memory (KB) 2078.800 System Up Time (sec) 792615.018 Target Server Memory (KB) 67108864.000 TEMPDB - Data total (KB) 20739136.000 TEMPDB - Log File (KB) 1108792.000 TEMPDB - Log File Used (KB) 308513.425 TEMPDB - Log Growths 71.000 TEMPDB - Percent Log Used 27.333 Total Available Memory (MB) 18428.508 Total Server Memory (KB) 67108864.000 Virtual Memory (bytes) 85202730222.933 |
|
|
sodeep
Master Smack Fu Yak Hacker
7174 Posts |
Posted - 2011-01-21 : 13:19:19
|
quote: Originally posted by shifty1981 Hi Everyone, sorry for the delayed response. Was a way for a while and then busy. We were able to give it 32GB memory more. Below are some of the stats it now has: This is just for last night during peak time (7pm to 5am): Average Latch Wait Time (ms) 51.227 Average Wait Time (ms) 4567.500 High Batch Requests/sec 499.314 BLBD - DB total (KB) 1560882176.000 BLBD - Log File (KB) 32506872.000 BLBD - Log File Used (KB) 348793.575 BLBD - Log Growths 31.000 BLBD - Percent Log Used 0.533 BLBD - Transactions/sec 145.687 Buffer cache hit ratio 99.119 Checkpoint pages/sec 697.022 High Full scans 40.478 Too High Latch waits 1540.848 Lazy writes/sec 10.001 Lock Blocks 93386.325 Lock Blocks Allocated 18677647.358 Lock Memory (KB) 13295413.733 Lock Requests/sec 1703278.310 Lock Timeouts/sec 50.797 Lock Wait Time (ms) 15231.991 Lock Waits/sec 3.375 Number of Deadlocks/sec 0.000 Page File Size - F 103068819456.000 Page File Usage % 0.258 Page life expectancy 211.008 too low Page Reads/sec 528.671 high Page reads/sec (SQL:BM) 6062.149 Page Splits/sec 56.353 Page Writes/sec 0.000 Page writes/sec (SQL:BM) 768.543 Pages Input/sec 535.825 Pages/sec 535.825 too high SQL Cache Memory (KB) 2078.800 System Up Time (sec) 792615.018 Target Server Memory (KB) 67108864.000 TEMPDB - Data total (KB) 20739136.000 TEMPDB - Log File (KB) 1108792.000 TEMPDB - Log File Used (KB) 308513.425 TEMPDB - Log Growths 71.000 TEMPDB - Percent Log Used 27.333 Total Available Memory (MB) 18428.508 Total Server Memory (KB) 67108864.000 Virtual Memory (bytes) 85202730222.933
Are you running Apps beside SQL Server???? Can you also put CPU Counters? Looks like you need to tune Queries |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2011-01-21 : 13:46:06
|
The page life expectancy number is extremely low as was previously mentioned. It needs to be in the thousands.Please post information about your IO, specifically Logical Disk\Avg Disk/sec Read and Logical Disk\Avg Disk/sec Write. Capture this data during a busy time. We are interested in the average over a period of time. Values should be 12ms or under in general, although occasional spikes are fine. It just can't be sustained over 12ms for any period of time.Tara KizerMicrosoft MVP for Windows Server System - SQL Serverhttp://weblogs.sqlteam.com/tarad/Subscribe to my blog |
|
|
shifty1981
Starting Member
20 Posts |
Posted - 2011-01-21 : 15:51:35
|
We do run our app that the customer client software talks to on the same server. I have suggested separating these but they don't have the budget for, nor have ever tested how that would work, code wise. Anyway. What perfmon counter specifically are you suggesting I monitor?quote: Originally posted by sodeep
quote: Originally posted by shifty1981 Hi Everyone, sorry for the delayed response. Was a way for a while and then busy. We were able to give it 32GB memory more. Below are some of the stats it now has: This is just for last night during peak time (7pm to 5am): Average Latch Wait Time (ms) 51.227 Average Wait Time (ms) 4567.500 High Batch Requests/sec 499.314 BLBD - DB total (KB) 1560882176.000 BLBD - Log File (KB) 32506872.000 BLBD - Log File Used (KB) 348793.575 BLBD - Log Growths 31.000 BLBD - Percent Log Used 0.533 BLBD - Transactions/sec 145.687 Buffer cache hit ratio 99.119 Checkpoint pages/sec 697.022 High Full scans 40.478 Too High Latch waits 1540.848 Lazy writes/sec 10.001 Lock Blocks 93386.325 Lock Blocks Allocated 18677647.358 Lock Memory (KB) 13295413.733 Lock Requests/sec 1703278.310 Lock Timeouts/sec 50.797 Lock Wait Time (ms) 15231.991 Lock Waits/sec 3.375 Number of Deadlocks/sec 0.000 Page File Size - F 103068819456.000 Page File Usage % 0.258 Page life expectancy 211.008 too low Page Reads/sec 528.671 high Page reads/sec (SQL:BM) 6062.149 Page Splits/sec 56.353 Page Writes/sec 0.000 Page writes/sec (SQL:BM) 768.543 Pages Input/sec 535.825 Pages/sec 535.825 too high SQL Cache Memory (KB) 2078.800 System Up Time (sec) 792615.018 Target Server Memory (KB) 67108864.000 TEMPDB - Data total (KB) 20739136.000 TEMPDB - Log File (KB) 1108792.000 TEMPDB - Log File Used (KB) 308513.425 TEMPDB - Log Growths 71.000 TEMPDB - Percent Log Used 27.333 Total Available Memory (MB) 18428.508 Total Server Memory (KB) 67108864.000 Virtual Memory (bytes) 85202730222.933
Are you running Apps beside SQL Server???? Can you also put CPU Counters? Looks like you need to tune Queries
|
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
shifty1981
Starting Member
20 Posts |
Posted - 2011-01-21 : 16:21:14
|
Our Avg Disk sec/Trans varies from server to server and raid set to raid set. for the pieces of our db that are high we're in the 50's to 600 depending on the server. This particular one it's around 100. We have 15K sas disks 6gbps. the server is maxes out for hard drives so the only other option we have for storage is to move the DB to DAS or SAN storage. As for page life expectancy, my understanding was that above 300 is the goal. this article discusses that. What are you referencing to think it needs to be in the thousands?http://blog.sqlauthority.com/2010/12/13/sql-server-what-is-page-life-expectancy-ple-counter/ |
|
|
shifty1981
Starting Member
20 Posts |
Posted - 2011-01-21 : 16:22:32
|
also, the server is not being used as a workstation. these servers are remote and we have various admins doing various tasks. so they RDP to the server and use management studio to do things. I started here recently so I'm guessing someone installed firefox to fix some issue that IE was the culprit of. No other server has firefox on it. |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
shifty1981
Starting Member
20 Posts |
Posted - 2011-01-22 : 08:32:25
|
On this server, ADs/R is about 45 milliseconds, ADs/W is about 8 ms. No offense but can you point me to articles explaining why such a high PLE is the goal? I have seen articles suggesting much lower numbers and so has the ceo, so if I'm going to end up proposing certain changes/upgrades I'll need to back it up somehow. Thanks!quote: Originally posted by tkizer I'm referring to the industry experts regarding the PLE needing to be in the thousands! You didn't look at the right PerfMon Logical Disk counters. We need to see "Avg. Disk sec/Read" and "Avg. Disk sec/Write". We only care about the counters really for the disks that have SQL on them.Tara KizerMicrosoft MVP for Windows Server System - SQL Serverhttp://weblogs.sqlteam.com/tarad/Subscribe to my blog
|
|
|
shifty1981
Starting Member
20 Posts |
Posted - 2011-01-22 : 08:37:38
|
I just checked and on all the servers with the same hardware config, the ADs/R is always high like the ADs/T and sometimes the ADs/W is high but 1/2 the time not. |
|
|
Sachin.Nand
2937 Posts |
Posted - 2011-01-23 : 08:00:10
|
Batch Requests/sec 499.314 does not seem to be that much of a busy server considering the fact that you mentioned it is a peak time.Also I noticed Page Writes/sec 0.000.It seems there are no updates happening in the DB.PBUH |
|
|
Next Page
|