I've been following the discussions about these options for a long time. When it seems that everybody agrees on something, somebody else comes with a different understanding and the discussion starts all over again... The questions at the beginning of each thread about this subject are often the same:
- When should I use /3GB option, and when should I not use it?
- Why should I have /PAE option in place?
- What about /PAE and /3GB? Should they be used together? Or, can't they be used together?
- Are there any performance or reliability implications when using these options?
- Where does AWE come from?
You don't have to be an expert to understand these options and when they apply. Let's apply the KIS (Keep It Simple) technique here and see how they work in a very basic way. But first, let's cover some basic concepts that are essential to understanding them:
A very simple vision of Processes, Applications and Virtual Memory
Imagine a classroom with any number of students. This classroom has a minimum set of resources: chairs, desks and notebooks for the students, a whiteboard, etc.
In this example, a Process is the classroom: a space shared by the students and their resources. The space available within the classroom is limited, and so is the Virtual Address Space of a Process, or its Virtual Memory. We measure the size of a real classroom in square feet; we measure a process's virtual address space in GBytes. The difference is that classrooms can be any size, while a process's virtual address space is always 4GBytes on a 32-bit architecture - 2^32 = 4G.
The Applications are the students in the classroom. Just as multiple students can share the same classroom, multiple Applications can be hosted by the same process, or in other words, share the same virtual address space. For instance, when a process is created to host a Win32 app, it also hosts at least the application responsible for providing the interface to the executive (ntdll.dll) and one of the applications that provide the Win32 APIs (kernel32.dll). So, when we start up even a very simple application, its process hosts at least two additional applications.
The Virtual Address Space of each process is divided into two regions: the User Mode region, where the applications mentioned above run, and the Kernel Mode region, an address range accessed exclusively by the OS kernel. By default the split is 50/50: 2GBytes are available to all User Mode apps sharing the address space (that is, hosted by the same process) and 2GBytes are reserved for the kernel. Back to our example, suppose half of our classroom is closed off. The students cannot access that half.
Now, straight to the point:
The /3GB boot.ini option, also called 4-Gigabyte Tuning (4GT), changes the way the virtual address space of each process is divided between the kernel and the User Mode applications. As said before, by default, without the /3GB option, the split is 50/50. With /3GB, the split becomes 25/75: the Kernel's virtual address space shrinks to 1GB and the User Mode apps gain the additional 1GB, for a total of 3GB. Let's first look at what happens when we're not using the /3GB option:
The 4 billion 32-bit addresses in hex format go from 0x00000000 to 0xFFFFFFFF. Without the /3GB option, the line that separates the Kernel and User Mode addresses sits in the middle: the User Mode addresses run from 0x00000000 to 0x7FFFFFFF and the Kernel Mode addresses start at 0x80000000. Now, let's look at what happens when we do have the /3GB option in place:
The line has moved, and the address ranges for both User and Kernel Mode have changed. Now the addresses used by User Mode apps run from 0x00000000 to 0xBFFFFFFF and the ones used by the Kernel run from 0xC0000000 to 0xFFFFFFFF.
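The two layouts can be reproduced with a little arithmetic. The sketch below is only an illustration of the ranges quoted above; the function name address_split is made up for this example and it deliberately ignores any finer details of how Windows reserves addresses near the boundary.

```python
GIB = 1 << 30  # one gigabyte = 2**30 bytes

def address_split(user_gib):
    """Return ((user_lo, user_hi), (kernel_lo, kernel_hi)) for a 32-bit
    virtual address space, given the User Mode share in GB.
    address_split is a hypothetical helper, not a Windows API."""
    total = 1 << 32                # 2**32 addresses = 4 GB
    boundary = user_gib * GIB      # first Kernel Mode address
    return (0x00000000, boundary - 1), (boundary, total - 1)

for label, user_gib in (("default", 2), ("/3GB", 3)):
    (u_lo, u_hi), (k_lo, k_hi) = address_split(user_gib)
    print(f"{label:8} user 0x{u_lo:08X}-0x{u_hi:08X}  "
          f"kernel 0x{k_lo:08X}-0x{k_hi:08X}")
```

With a 2GB user share the boundary lands at 0x80000000; with a 3GB share it lands at 0xC0000000, matching the ranges described above.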
However, the fact that the address space accessible to User Mode applications is now larger might not represent any advantage to the applications. The applications need to be aware of this larger address space and be designed to make use of it. They need to be linked with the /LARGEADDRESSAWARE option, which sets the IMAGE_FILE_LARGE_ADDRESS_AWARE flag in the image header; otherwise they won't use the additional 1GB of virtual memory. The dumpbin.exe tool from Visual Studio, run with the /HEADERS flag (dumpbin.exe /HEADERS), shows whether an application has this flag set.
On the other hand, the 1GB the Kernel had to give up might be missed, because the reduction affects key memory areas such as the file system cache (it also directly reduces the paged and non-paged pool sizes and drastically cuts the number of system PTEs - but this is outside the scope of this article), and this may impact the reliability and performance of the entire system.
The conclusion is that the only general recommendation for the /3GB option is: do not use it unless it's required by the vendor of the application the server is dedicated to. There are, for instance, specific recommendations for Domain Controllers, SQL Server, Exchange Server, etc., but no general rule for using it. Also, make sure the roles of your server do not conflict regarding the /3GB recommendation. For instance, if you have SQL Server and a File Server running on the same box, you will need to deal with a conflict: /3GB is recommended for SQL Server, but not for a File Server.
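If dumpbin.exe is not at hand, the same bit can be read directly from the executable. The following is a minimal sketch, assuming a well-formed PE file: it looks up the PE signature through the e_lfanew field at offset 0x3C and tests the IMAGE_FILE_LARGE_ADDRESS_AWARE bit (0x0020) in the Characteristics field of the COFF file header. The helper names is_large_address_aware and read_flag are made up for this example.

```python
import struct

IMAGE_FILE_LARGE_ADDRESS_AWARE = 0x0020  # bit in the COFF Characteristics field

def is_large_address_aware(image: bytes) -> bool:
    """Return True if the PE image has the large-address-aware bit set."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ/PE image")
    # e_lfanew (offset 0x3C) holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The 20-byte COFF header follows the 4-byte signature;
    # Characteristics is its last 2-byte field, 18 bytes in.
    (characteristics,) = struct.unpack_from("<H", image, pe_offset + 4 + 18)
    return bool(characteristics & IMAGE_FILE_LARGE_ADDRESS_AWARE)

def read_flag(path):
    """Hypothetical convenience wrapper: check a file on disk."""
    with open(path, "rb") as f:
        return is_large_address_aware(f.read())
```

Calling read_flag on the path of an .exe returns True when the image was linked as large address aware and False otherwise.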
Now that we know where the 4GBytes limit comes from (32-bit procs -> 2^32 = 4G), we also understand that the OS kernel itself cannot see beyond it either. Something needed to change in the processors to make addressing beyond 4GBytes possible. The Physical Address Extension (PAE) is an extension provided by Intel (reported as bit 6 of the processor's feature flags) that enables 32-bit processors to support more than 4GBytes of RAM. Put very simply, it provides a 36-bit physical addressing mode that lets the processor address up to 64GBytes. So, with 36 bits available for physical addressing, the processor can use 2^36, or 64 billion, addresses instead of the original 4 billion.
The OS kernel also needs to be aware of this, since it must change its address translation scheme (the PAE kernel introduces a third level - PDPE - in the address translation) to properly handle the memory beyond 4GB. The /PAE option in boot.ini makes the PAE kernel (ntkrnlpa.exe for single-proc machines or ntkrpamp.exe for multiproc machines - SMP) load instead of the regular one (ntoskrnl.exe).
Notice that PAE has no effect on a process's Virtual Address Space: it is still 4GB regardless of the amount of physical memory in the system. Going back to our classroom example, we could say that there are several classrooms (or processes) in our building (or running in the OS) and that PAE provides more space to the building.
We know that PAE lets the processor use 36-bit physical addresses and that the OS, if the correct kernel gets loaded (through the /PAE option), can see beyond 4GB of physical memory. However, we have not yet mentioned how this affects application behavior. By default, the applications are still in that same classroom with 2GB (or 3GB if 4GT is enabled), even with PAE enabled. Address Windowing Extensions (AWE) is a set of APIs that an application needs to use in order to reach physical memory beyond 4GB with the same 32-bit pointers. In other words, if the application was not developed to take advantage of AWE, it will still be limited to 2GB (or 3GB if 4GT is enabled), even on systems with far more than 4GB of memory. SQL Server is an example of an application that uses AWE.
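The jump from 4GB to 64GB is just four extra address bits. A quick check of the arithmetic (addressable_gb is a name made up for this sketch):

```python
def addressable_gb(bits):
    """Number of gigabytes reachable with the given number of address bits."""
    return 2 ** bits // 2 ** 30

for bits in (32, 36):
    print(f"{bits}-bit: {2 ** bits:,} addresses = {addressable_gb(bits)} GB")
```

Each additional address bit doubles the range, so 36 bits give 2^4 = 16 times the 4GB reachable with 32 bits.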
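To make the extra translation level more concrete, here is a small sketch of how a 32-bit virtual address is sliced into table indexes with and without PAE. It assumes the standard x86 layout for 4KB pages (10/10/12 bits without PAE, 2/9/9/12 bits with PAE); the function names are made up for this example and nothing here touches real page tables.

```python
def split_non_pae(va):
    """Two levels: 10-bit page directory index, 10-bit page table index,
    12-bit offset within the 4KB page."""
    return (va >> 22) & 0x3FF, (va >> 12) & 0x3FF, va & 0xFFF

def split_pae(va):
    """Three levels: 2-bit page-directory-pointer index, 9-bit page directory
    index, 9-bit page table index, 12-bit offset within the 4KB page."""
    return (va >> 30) & 0x3, (va >> 21) & 0x1FF, (va >> 12) & 0x1FF, va & 0xFFF

va = 0xC0001234
print([hex(part) for part in split_non_pae(va)])
print([hex(part) for part in split_pae(va)])
```

The virtual address is still 32 bits wide in both cases. What changes under PAE is that the table entries grow to 64 bits, so each 4KB table holds 512 entries instead of 1024, and the entries have room for physical addresses wider than 32 bits.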
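The "windowing" idea can be pictured with a toy model. The sketch below is not the Windows API (real applications call functions such as AllocateUserPhysicalPages and MapUserPhysicalPages); it is only a hypothetical simulation in which a small, fixed set of window slots is remapped onto a larger pool of pages, which is the essence of how an application reaches more memory than its address space can hold at once.

```python
PAGE = 4096

class ToyAweWindow:
    """Toy model only: a few 'virtual' slots remapped over many 'physical'
    pages. All names here are invented for illustration."""

    def __init__(self, physical_pages, window_slots):
        self.physical = [bytearray(PAGE) for _ in range(physical_pages)]
        self.window = [None] * window_slots  # slot -> physical page number

    def map(self, slot, page_number):
        """Point a window slot at a physical page."""
        self.window[slot] = page_number

    def unmap(self, slot):
        self.window[slot] = None

    def _page(self, slot):
        number = self.window[slot]
        if number is None:
            raise RuntimeError("slot is not mapped")
        return self.physical[number]

    def write(self, slot, offset, data):
        self._page(slot)[offset:offset + len(data)] = data

    def read(self, slot, offset, length):
        return bytes(self._page(slot)[offset:offset + length])

awe = ToyAweWindow(physical_pages=8, window_slots=2)
awe.map(0, 7)
awe.write(0, 0, b"hello")   # lands in physical page 7
awe.map(0, 3)
awe.write(0, 0, b"world")   # same slot, now backed by page 3
awe.map(0, 7)
print(awe.read(0, 0, 5))    # b'hello'
```

The application only ever uses slot 0, yet it stores data in two different pages by remapping the slot. The data in page 7 is still there when the slot is pointed back at it.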
Putting it all together:
The /3GB option is to be used only when the application vendor requires or recommends it. The applications must be built with the IMAGE_FILE_LARGE_ADDRESS_AWARE flag in order to take advantage of it, and it has no necessary relationship with the amount of physical memory the system has available. However, if the system has more than 4GB of physical RAM and is not 64-bit, you will need the /PAE option to make the OS aware of, and able to use, the memory beyond the 4GB limit.
If you use /PAE and /3GB together, the system will limit physical memory usage to 16GB regardless of how much memory is installed beyond that (the kernel does not have enough resources to both address that much memory and shrink its virtual address space to 1GB, but this is a subject for another article :-)).
If you have more than 4GB of physical RAM, be aware that just enabling PAE will not make the applications actually use any of the additional memory unless they are properly designed for that (they must use the AWE APIs for this).