Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 SQL Server Administration (2008)
 Memory lacking causing disk issues?

Author  Topic 

shifty1981
Starting Member

20 Posts

Posted - 2011-01-07 : 09:45:46
I have a sneaky suspicion that our disk stats are poor because of excessive paging due to a lack of enough memory. Below are some stats. Is there a way to prove that paging activity is the cause of excessive disk activity and if so, how do I determine how much more ram to buy. We're running SQL 2008 R2 x64 Enterprise on Windows Server 2008 R2 x64. We have 64 GB ram and are a high I/O OLTP database environment. The previous admin gave 32GB of the RAM to SQL and left 32GB to the OS and others presumably to help with the handling of all the customer files they send us (we backup and restore customer files all day long)

The only reason I'm not convinced already is because the disk (RAID1 set) that has the pagefile and OS on it has decent stats. It's the RAID sets with the DB on them that perform poorly. Maybe there's another stat I should be looking at to help determine this?


Average disk sec/Trans
P drive (RAID10 - 14 drives) 0.014
G drive (RAID10 - 14 drives) 0.014
K drive (where DB is, RAID10 - 14 drives) 0.041
F drive (where tempdb, tlog are, RAID10 - 4 drives) 0.007
C drive (OS, pagefile, RAID1) 0.005


Average Disk Queue Length
P drive (RAID10 - 14 drives) 0.68
G drive (RAID10 - 14 drives) 6.02
K drive (where DB is, RAID10 - 14 drives) 71.22
F drive (where tempdb, tlog are, RAID10 - 4 drives) 2.57
C drive (OS, pagefile, RAID1) 0.04

% DiskTime
P drive 67.6
G drive (RAID10 - 14 drives) 596.0
K drive (where DB is, RAID10 - 14 drives) 7358.7
F drive (where tempdb, tlog are, RAID10 - 4 drives) 261.0
C drive (OS, pagefile, RAID1) 5.6

Other stats
Checkpoint pages/sec 1246.618
Page File Usage 0.739
Page life expectancy 429.733
Page reads/sec (SQL:BM) 1672.008
Page Splits/sec 65.997
Page writes/sec (SQL:BM) 1371.892
Pages/sec 534.441
Lazy writes/sec 27.735
Target Server Memory (KB) 33554432
Total Server Memory (KB) 33554432
Total Available Memory (MBytes) 13166.208
Virtual Memory 58881281741
Full scans 77.165
Batch Requests/sec 643.659
Latch waits 1238.704

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2011-01-07 : 11:07:04
Your page life expectancy is extremely low for a system with that much memory. I'd highly recommend allowing SQL Server to use more of that 64GB. I think I'd bump it up to 58GB.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2011-01-07 : 11:07:36
On one of my systems that has 48GB of memory, I've seen the page life expectancy at 40,000.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

shifty1981
Starting Member

20 Posts

Posted - 2011-01-07 : 11:37:20
quote:
Steve Jones - SSC Editor (1/7/2011)
I might check on a few things here.

1. Check the buffer cache hit ratio, and see if it is high. Below 90%, I'd definitely be thinking memory.
2. Check page life expectency. If it is less than minutes, pages are being tossed out of the buffer to make room for new pages.

A few links
- [url]http://msmvps.com/blogs/bradley/archive/2009/01/15/how-to-troubleshoot-sql-server-memory-related-issues-part-1.aspx[/url]
- [url]http://sqlblog.com/blogs/lara_rubbelke/archive/2008/04/18/memory-pressure-on-64-bit-sql-server-2005.aspx[/url]
- [url]http://www.sqlservercentral.com/articles/Performance+Tuning/analyzingmemoryrequirementsforsqlserver/2498/[/url]


I'm not sure i understand. Most of our servers like this average between 1000 and 1500 page life expectancy. Some as low as 150 and some in the 3000-4000 range. We are a very high IO environment with lots of small bits of data. I have to explain in technical detail to my boss why it's worth spending thousands on more RAM.

The BCH ratio is pretty good: (98% on average)

below are all stats we're collecting besides disk stats

Batch Requests/sec 598.562
Buffer cache hit ratio 98.967
Checkpoint pages/sec 1627.684
Create-Open File Failure Events 32.614
Current Connections Counter 5.425
DeletedVersions Row Count 19871.55
FileInfo Row Count 76621872.08
FOREIGN KEY Events 0.258
Full scans 108.559
Latch waits 1343.628
Lazy writes/sec 14.55
LOCK Errors 1.381
Page File Usage 0.72
Page life expectancy 404.392
Page reads/sec (SQL:BM) 2074.087
Page Splits/sec 84.822
Page writes/sec (SQL:BM) 1572.879
Pages/sec 512.189
Target Server Memory (KB) 33554432
Total Available Memory (MBytes) 13456.167
Total Server Memory (KB) 33554432
User Connections 638.817
VersionFileInfo Row Count 2770363000
VersionFileLocalName Row Count 319386266.6
Virtual Memory 58524989850


The previous admin took the 64GB ram and gave 32 to SQL and the rest is for apps and data, etc. I need to determine if the rest really needs 32 or if we can give it less. Any idea how to determine that? is availableMBytes showing what of the 32 is left since sql is taking all of it's 32?
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2011-01-07 : 12:07:06
What else do you have running on the database server?

SQL will almost always consume all of its allowed memory. SQL is a memory hog.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

shifty1981
Starting Member

20 Posts

Posted - 2011-01-07 : 14:50:31
quote:
Originally posted by tkizer

What else do you have running on the database server?

SQL will almost always consume all of its allowed memory. SQL is a memory hog.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog



we have tools and apps that do need memory. see attached screenshot of a "downtime" list of processes.
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2011-01-07 : 14:58:42
According to your perf counters, you've got 13GB of memory free. So why not give SQL some/most of that? Like an additional 10GB.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2011-01-07 : 15:07:17
And make sure lock pages in memory is granted to the sql service account.
Go to Top of Page

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2011-01-07 : 15:11:51
Also, why in the world is firefox running on your sql server?!

And that many instances of SSMS? Not a good idea to use a production server as a workstation.

Regarding PLE -- you say as low as 150...that is an absolute certainty that you have memory pressure if it's an OLTP system.

Like Tara, ours are consistently much much higher. Even 1500 is a bad sign.
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2011-01-07 : 15:48:29
Have you considered that you might have poor performing disk arrays?

Have you done any disk performance testing with SQLIO to benchmark the disk performance? I alway benchmark the server disk IO for each drive before putting the server into production. I never assume that good sounding specs for the disk array lead to good performance; I have seen too many exceptions to belive anything except actual performance testing.










CODO ERGO SUM
Go to Top of Page

shifty1981
Starting Member

20 Posts

Posted - 2011-01-20 : 22:16:23
Hi Everyone, sorry for the delayed response. Was a way for a while and then busy. We were able to give it 32GB memory more. Below are some of the stats it now has:

This is just for last night during peak time (7pm to 5am):

Average Latch Wait Time (ms) 51.227
Average Wait Time (ms) 4567.500
Batch Requests/sec 499.314
BLBD - DB total (KB) 1560882176.000
BLBD - Log File (KB) 32506872.000
BLBD - Log File Used (KB) 348793.575
BLBD - Log Growths 31.000
BLBD - Percent Log Used 0.533
BLBD - Transactions/sec 145.687
Buffer cache hit ratio 99.119
Checkpoint pages/sec 697.022
Full scans 40.478
Latch waits 1540.848
Lazy writes/sec 10.001
Lock Blocks 93386.325
Lock Blocks Allocated 18677647.358
Lock Memory (KB) 13295413.733
Lock Requests/sec 1703278.310
Lock Timeouts/sec 50.797
Lock Wait Time (ms) 15231.991
Lock Waits/sec 3.375
Number of Deadlocks/sec 0.000
Page File Size - F 103068819456.000
Page File Usage % 0.258
Page life expectancy 211.008
Page Reads/sec 528.671
Page reads/sec (SQL:BM) 6062.149
Page Splits/sec 56.353
Page Writes/sec 0.000
Page writes/sec (SQL:BM) 768.543
Pages Input/sec 535.825
Pages/sec 535.825
SQL Cache Memory (KB) 2078.800
System Up Time (sec) 792615.018
Target Server Memory (KB) 67108864.000
TEMPDB - Data total (KB) 20739136.000
TEMPDB - Log File (KB) 1108792.000
TEMPDB - Log File Used (KB) 308513.425
TEMPDB - Log Growths 71.000
TEMPDB - Percent Log Used 27.333
Total Available Memory (MB) 18428.508
Total Server Memory (KB) 67108864.000
Virtual Memory (bytes) 85202730222.933
Go to Top of Page

sodeep
Master Smack Fu Yak Hacker

7174 Posts

Posted - 2011-01-21 : 13:19:19
quote:
Originally posted by shifty1981

Hi Everyone, sorry for the delayed response. Was a way for a while and then busy. We were able to give it 32GB memory more. Below are some of the stats it now has:

This is just for last night during peak time (7pm to 5am):

Average Latch Wait Time (ms) 51.227
Average Wait Time (ms) 4567.500 High Batch Requests/sec 499.314
BLBD - DB total (KB) 1560882176.000
BLBD - Log File (KB) 32506872.000
BLBD - Log File Used (KB) 348793.575
BLBD - Log Growths 31.000
BLBD - Percent Log Used 0.533
BLBD - Transactions/sec 145.687
Buffer cache hit ratio 99.119
Checkpoint pages/sec 697.022 High
Full scans 40.478 Too High
Latch waits 1540.848
Lazy writes/sec 10.001
Lock Blocks 93386.325
Lock Blocks Allocated 18677647.358
Lock Memory (KB) 13295413.733
Lock Requests/sec 1703278.310
Lock Timeouts/sec 50.797
Lock Wait Time (ms) 15231.991
Lock Waits/sec 3.375
Number of Deadlocks/sec 0.000
Page File Size - F 103068819456.000
Page File Usage % 0.258
Page life expectancy 211.008 too low
Page Reads/sec 528.671 high
Page reads/sec (SQL:BM) 6062.149
Page Splits/sec 56.353
Page Writes/sec 0.000
Page writes/sec (SQL:BM) 768.543
Pages Input/sec 535.825
Pages/sec 535.825 too high
SQL Cache Memory (KB) 2078.800
System Up Time (sec) 792615.018
Target Server Memory (KB) 67108864.000
TEMPDB - Data total (KB) 20739136.000
TEMPDB - Log File (KB) 1108792.000
TEMPDB - Log File Used (KB) 308513.425
TEMPDB - Log Growths 71.000
TEMPDB - Percent Log Used 27.333
Total Available Memory (MB) 18428.508
Total Server Memory (KB) 67108864.000
Virtual Memory (bytes) 85202730222.933



Are you running Apps beside SQL Server???? Can you also put CPU Counters? Looks like you need to tune Queries
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2011-01-21 : 13:46:06
The page life expectancy number is extremely low as was previously mentioned. It needs to be in the thousands.

Please post information about your IO, specifically Logical Disk\Avg Disk/sec Read and Logical Disk\Avg Disk/sec Write. Capture this data during a busy time. We are interested in the average over a period of time. Values should be 12ms or under in general, although occasional spikes are fine. It just can't be sustained over 12ms for any period of time.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

shifty1981
Starting Member

20 Posts

Posted - 2011-01-21 : 15:51:35
We do run our app that the customer client software talks to on the same server. I have suggested separating these but they don't have the budget for, nor have ever tested how that would work, code wise.

Anyway. What perfmon counter specifically are you suggesting I monitor?



quote:
Originally posted by sodeep

quote:
Originally posted by shifty1981

Hi Everyone, sorry for the delayed response. Was a way for a while and then busy. We were able to give it 32GB memory more. Below are some of the stats it now has:

This is just for last night during peak time (7pm to 5am):

Average Latch Wait Time (ms) 51.227
Average Wait Time (ms) 4567.500 High Batch Requests/sec 499.314
BLBD - DB total (KB) 1560882176.000
BLBD - Log File (KB) 32506872.000
BLBD - Log File Used (KB) 348793.575
BLBD - Log Growths 31.000
BLBD - Percent Log Used 0.533
BLBD - Transactions/sec 145.687
Buffer cache hit ratio 99.119
Checkpoint pages/sec 697.022 High
Full scans 40.478 Too High
Latch waits 1540.848
Lazy writes/sec 10.001
Lock Blocks 93386.325
Lock Blocks Allocated 18677647.358
Lock Memory (KB) 13295413.733
Lock Requests/sec 1703278.310
Lock Timeouts/sec 50.797
Lock Wait Time (ms) 15231.991
Lock Waits/sec 3.375
Number of Deadlocks/sec 0.000
Page File Size - F 103068819456.000
Page File Usage % 0.258
Page life expectancy 211.008 too low
Page Reads/sec 528.671 high
Page reads/sec (SQL:BM) 6062.149
Page Splits/sec 56.353
Page Writes/sec 0.000
Page writes/sec (SQL:BM) 768.543
Pages Input/sec 535.825
Pages/sec 535.825 too high
SQL Cache Memory (KB) 2078.800
System Up Time (sec) 792615.018
Target Server Memory (KB) 67108864.000
TEMPDB - Data total (KB) 20739136.000
TEMPDB - Log File (KB) 1108792.000
TEMPDB - Log File Used (KB) 308513.425
TEMPDB - Log Growths 71.000
TEMPDB - Percent Log Used 27.333
Total Available Memory (MB) 18428.508
Total Server Memory (KB) 67108864.000
Virtual Memory (bytes) 85202730222.933



Are you running Apps beside SQL Server???? Can you also put CPU Counters? Looks like you need to tune Queries

Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2011-01-21 : 16:14:42
My post indicates what IO counters to focus on. Regarding CPU, it's the total processor CPU one that is automatically highlighted when you first open up the counters.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

shifty1981
Starting Member

20 Posts

Posted - 2011-01-21 : 16:21:14
Our Avg Disk sec/Trans varies from server to server and raid set to raid set. for the pieces of our db that are high we're in the 50's to 600 depending on the server. This particular one it's around 100. We have 15K sas disks 6gbps. the server is maxes out for hard drives so the only other option we have for storage is to move the DB to DAS or SAN storage.

As for page life expectancy, my understanding was that above 300 is the goal. this article discusses that. What are you referencing to think it needs to be in the thousands?

http://blog.sqlauthority.com/2010/12/13/sql-server-what-is-page-life-expectancy-ple-counter/

Go to Top of Page

shifty1981
Starting Member

20 Posts

Posted - 2011-01-21 : 16:22:32
also, the server is not being used as a workstation. these servers are remote and we have various admins doing various tasks. so they RDP to the server and use management studio to do things. I started here recently so I'm guessing someone installed firefox to fix some issue that IE was the culprit of. No other server has firefox on it.
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2011-01-21 : 16:24:01
I'm referring to the industry experts regarding the PLE needing to be in the thousands!

You didn't look at the right PerfMon Logical Disk counters. We need to see "Avg. Disk sec/Read" and "Avg. Disk sec/Write". We only care about the counters really for the disks that have SQL on them.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

shifty1981
Starting Member

20 Posts

Posted - 2011-01-22 : 08:32:25
On this server, ADs/R is about 45 milliseconds, ADs/W is about 8 ms.

No offense but can you point me to articles explaining why such a high PLE is the goal? I have seen articles suggesting much lower numbers and so has the ceo, so if I'm going to end up proposing certain changes/upgrades I'll need to back it up somehow.

Thanks!

quote:
Originally posted by tkizer

I'm referring to the industry experts regarding the PLE needing to be in the thousands!

You didn't look at the right PerfMon Logical Disk counters. We need to see "Avg. Disk sec/Read" and "Avg. Disk sec/Write". We only care about the counters really for the disks that have SQL on them.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog

Go to Top of Page

shifty1981
Starting Member

20 Posts

Posted - 2011-01-22 : 08:37:38
I just checked and on all the servers with the same hardware config, the ADs/R is always high like the ADs/T and sometimes the ADs/W is high but 1/2 the time not.
Go to Top of Page

Sachin.Nand

2937 Posts

Posted - 2011-01-23 : 08:00:10
Batch Requests/sec 499.314 does not seem to be that much of a busy server considering the fact that you mentioned it is a peak time.

Also I noticed Page Writes/sec 0.000.It seems there are no updates happening in the DB.

PBUH

Go to Top of Page
    Next Page

- Advertisement -