An Overview of Troubleshooting Memory Issues – Part Two

In our last post, we looked at some common memory issues and how to troubleshoot them.  Today we’re going to go over excessive paging and memory bottlenecks.

We’ve talked about issues with the page file in several posts. Something to bear in mind is that although you want enough RAM to prevent excessive paging, the aim should not be to prevent paging activity completely. Some page fault behavior is inevitable, for example when a process is first initialized. Modified virtual pages in memory have to be written to disk eventually, so Memory \ Page Writes/sec will always show some activity. However, when there is not enough RAM installed, there are two issues in particular that you may see: too many page faults, and disk contention.

Let’s start with page faults. A page fault occurs when a process requests a page in memory and the system cannot find the page at the requested location. Page faults are divided into two types, soft and hard. If the requested page is actually elsewhere in memory, then the fault is a soft page fault. However, if the page has to be retrieved from the disk, then a hard page fault occurs. Most systems can handle soft page faults with no issues. However, if there are lots of hard page faults, you may experience delays. The additional disk I/O that results from constantly paging to disk can interfere with applications that are trying to access data stored on the same disk as the page file. Although a high rate of page faults is a fairly straightforward issue, diagnosing it requires some extensive data gathering and analysis in Performance Monitor. The counters below are the important ones when troubleshooting a suspected page fault issue:

Counter: Memory \ Pages/sec
Description: Pages/sec is the rate at which pages are read from or written to disk to resolve hard page faults. This counter is a primary indicator of the kinds of faults that cause system-wide delays. It is the sum of Memory \ Pages Input/sec and Memory \ Pages Output/sec. It is counted in numbers of pages, so it can be compared to other counts of pages, such as Memory \ Page Faults/sec, without conversion. It includes pages retrieved to satisfy faults in the file system cache (usually requested by applications) and in non-cached mapped memory files.
Values to Consider: If Pages/sec multiplied by 4,096 (the 4 KB page size, in bytes) is consistently greater than 70% of the total Logical Disk \ Disk Bytes/sec for the disk(s) where the page file is located, then you should investigate. Translation: if paging to disk is consistently more than 70% of your total disk activity, then there may be an issue.

Counter: Memory \ Page Reads/sec
Description: Page Reads/sec is the rate at which the disk was read to resolve hard page faults. It shows the number of read operations, without regard to the number of pages retrieved in each operation. Hard page faults occur when a process references a page in virtual memory that is not in its working set or elsewhere in physical memory, and must be retrieved from disk. This counter is a primary indicator of the kinds of faults that cause system-wide delays. It includes read operations to satisfy faults in the file system cache (usually requested by applications) and in non-cached mapped memory files. Compare the value of Memory \ Page Reads/sec to the value of Memory \ Pages Input/sec to determine the average number of pages read during each operation.
Values to Consider: Look for sustained values. If the value is consistently greater than 50% of the total number of Logical Disk operations to the disk where the page file resides, then there is an inordinate amount of paging taking place to resolve hard faults.

Counter: Memory \ Available Bytes
Description: Available Bytes is the amount of physical memory, in bytes, immediately available for allocation to a process or for system use. It is equal to the sum of memory assigned to the standby (cached), free and zero page lists.
Values to Consider: If this value falls below 5% of installed RAM on a consistent basis, then you should investigate. If the value drops below 1% of installed RAM on a consistent basis, there is a definite problem!
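To make the arithmetic behind the first two guidelines concrete, here is a minimal Python sketch. It assumes you have already collected the counter values yourself (for example from a Performance Monitor log); the function names are hypothetical, and using Logical Disk \ Disk Transfers/sec as the "total disk operations" figure is our assumption rather than something the guidelines spell out.

```python
# Minimal sketch of the ratio checks described above.
# All function names are illustrative; the inputs are counter values
# you have sampled yourself.

PAGE_SIZE_BYTES = 4096  # assumes the standard 4 KB page size


def paging_share_of_disk_bytes(pages_per_sec, disk_bytes_per_sec):
    """Fraction of disk byte throughput that is paging traffic."""
    if disk_bytes_per_sec <= 0:
        return 0.0
    return (pages_per_sec * PAGE_SIZE_BYTES) / disk_bytes_per_sec


def page_read_share_of_disk_ops(page_reads_per_sec, disk_transfers_per_sec):
    """Fraction of disk operations that are hard-fault page reads."""
    if disk_transfers_per_sec <= 0:
        return 0.0
    return page_reads_per_sec / disk_transfers_per_sec


def average_pages_per_read(pages_input_per_sec, page_reads_per_sec):
    """Average number of pages retrieved by each page read operation."""
    if page_reads_per_sec <= 0:
        return 0.0
    return pages_input_per_sec / page_reads_per_sec


if __name__ == "__main__":
    # Made-up sample values, purely for illustration.
    byte_share = paging_share_of_disk_bytes(200, 1_000_000)
    op_share = page_read_share_of_disk_ops(60, 100)
    print("Paging share of disk bytes:", byte_share, "(investigate above 0.70)")
    print("Page read share of disk ops:", op_share, "(investigate above 0.50)")
    print("Pages per read:", average_pages_per_read(240, 60))
```

A single sample over the threshold is not conclusive on its own; the guidelines apply to values that stay high on a consistent basis.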

Remember that because the operating system has to write changed pages to disk, some page write operations will always occur. Page Reads/sec, however, which indicates the number of hard faults, is extremely sensitive to situations with insufficient RAM. As the value of Available Bytes decreases, the number of hard page faults will normally increase. The total number of Pages/sec that the system can sustain is a function of the disk bandwidth. This does, however, mean that there is no simple number that tells you whether or not the disks are saturated. Instead, you have to identify how much of the overall disk traffic is being caused by paging activity.
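Because the guidelines talk about values that are high "on a consistent basis", it helps to look at a series of samples rather than a single reading. The sketch below is one possible way to do that in Python. The function names are hypothetical, and treating "consistent" as "at least 80% of samples over the threshold" is purely our assumption; pick a cutoff that suits your own sampling interval.

```python
# Minimal sketch: decide whether paging dominates disk traffic on a
# sustained basis. The 80% cutoff for "consistent" is an assumption.

PAGE_SIZE_BYTES = 4096  # assumes the standard 4 KB page size


def paging_shares(samples):
    """Turn (pages_per_sec, disk_bytes_per_sec) pairs into paging shares."""
    shares = []
    for pages_per_sec, disk_bytes_per_sec in samples:
        if disk_bytes_per_sec <= 0:
            shares.append(0.0)  # idle disk: no traffic to attribute
        else:
            shares.append(pages_per_sec * PAGE_SIZE_BYTES / disk_bytes_per_sec)
    return shares


def fraction_over(shares, threshold):
    """Fraction of samples whose share exceeds the threshold."""
    if not shares:
        return 0.0
    return sum(1 for s in shares if s > threshold) / len(shares)


def is_sustained(shares, threshold=0.70, min_fraction=0.80):
    """True when enough samples exceed the threshold to call it consistent."""
    return fraction_over(shares, threshold) >= min_fraction


if __name__ == "__main__":
    # Made-up samples: (Pages/sec, Disk Bytes/sec)
    log = [(200, 1_000_000), (220, 1_100_000), (10, 900_000), (250, 1_200_000)]
    shares = paging_shares(log)
    print("Sustained paging pressure:", is_sustained(shares))
```

A short spike, such as a large application starting up, will show up as one or two high samples and should not trip the check.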

Another indicator of a memory bottleneck is that the pool of Available Bytes is depleted.  Page trimming by the Virtual Memory Manager is triggered when there is a shortage of available bytes.  What page trimming does is attempt to replenish the pool of available bytes by identifying virtual memory pages that have not been referenced recently.  When page trimming is effective, older pages trimmed from the process working sets are not needed again soon.  Trimmed pages are marked in transition and remain in RAM for a period of time to reduce the amount of paging to disk that occurs.  However, if there is a chronic shortage of available bytes, then page trimming is less effective and the result is that there is more paging to disk.  Since there is little room in RAM for the pages that are marked in transition, if a recently trimmed page is referenced again it has to be accessed from disk as opposed to RAM.  The more severe the bottleneck, the more often the page file is updated – which interferes with application-directed I/O operations on the same disk.

Before we wrap up, let’s quickly discuss the guideline listed above for the Memory \ Available Bytes counter. Normally, if Available Bytes is consistently greater than 5% of the installed RAM, then you should be in decent shape. However, there are some applications that manage their own working sets, such as IIS 6, Exchange Server and SQL Server. These applications interact with the Virtual Memory Manager to increase their working sets if there is memory available, and should trim their working sets when signaled by the operating system. The applications rely on RAM-resident cache buffers to reduce the I/O to disk, so RAM will always look full on servers that run them, and a low Available Bytes value on its own does not necessarily indicate a problem.
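As a quick illustration of the 5% and 1% guideline, here is a small Python sketch that classifies an Available Bytes reading against installed RAM. The function name and the returned labels are hypothetical. It also does not know whether the server runs an application that manages its own working set, so treat a low result on such a server as a prompt to look at the paging counters rather than as a verdict.

```python
# Minimal sketch of the Available Bytes guideline (5% / 1% of installed RAM).
# Labels and function name are illustrative only.


def classify_available_bytes(available_bytes, installed_bytes):
    """Classify a Memory \\ Available Bytes reading against installed RAM."""
    if installed_bytes <= 0:
        raise ValueError("installed_bytes must be positive")
    ratio = available_bytes / installed_bytes
    if ratio < 0.01:
        return "definite problem"
    if ratio < 0.05:
        return "investigate"
    return "ok"


if __name__ == "__main__":
    MIB = 1024 * 1024
    installed = 8192 * MIB  # made-up server with 8 GB of RAM
    for available_mib in (1024, 300, 50):
        status = classify_available_bytes(available_mib * MIB, installed)
        print(available_mib, "MB available:", status)
```

As with the paging counters, apply this to readings taken over time, since the guideline is about values that stay low on a consistent basis.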

And on that note, we will wrap up our two-part Overview of Troubleshooting Memory Issues.  Until next time …

CC Hameed
