Written by Benny Lakunishok, Senior Microsoft Premier Field Engineer.
As many know by now, in Exchange 2010, Microsoft managed to get disk I/O numbers so low that it’s able to run perfectly well on DAS (direct-attached storage) and even JBODs (i.e. DAS with no RAID configured). So how come we should still run Jetstress, the tool that helps verify the performance and stability of a disk subsystem prior to putting an Exchange server into production?
And an even weirder question: how come we see many SAN deployments fail the Jetstress tests when Exchange 2010 is able to run on what is perceived as a “lower” performing solution such as DAS?
Microsoft Exchange 2010’s New Disk I/O Profile
The change that Microsoft made in Exchange 2010 was indeed about getting to much lower I/O numbers in order to make sure that the product is able to run on cheaper storage solutions that enable businesses to have very big mailboxes (5-10 GB) without breaking the budget.
What people don’t know (or most people at least) is that Microsoft didn’t just change the quantity of the disk I/O profile (i.e. less I/O); it achieved this by also changing the “quality”, so that more I/Os are sequential and each I/O is potentially much larger in size. These quantity and quality changes (which are tightly related to each other) are what enable us to have so many big mailboxes on cost-saving storage solutions like JBOD.
The Impact of the New I/O Profile on Storage
So all of this is good and well but how come my SAN can’t keep up with something that JBOD can do?
First of all: it can, but it will likely need some adjustments to accommodate the changed I/O profile and by that I mean the “quality” change of big sequential I/Os and not the “quantity” change.
As I like to explain things through examples and stories, I will tell you one now about a real Exchange 2010 project a customer of mine had. This customer was planning their new Exchange 2010 infrastructure with these requirements:
- 5,000 mailboxes
- 5GB each (they wanted to eliminate both PSTs and a 3rd party archive solution that drove costs up)
- Sending and receiving 200 messages per day
In order to accommodate these requirements, they needed roughly 20 DBs, each 1.5TB in size, and each DB needed about 50 IOPS (disk I/O operations per second) of transactional I/O (that is, I/O driven by user activity). In total we are talking about 1,000 IOPS and 30TB per server in the DAG. Note that other inputs and considerations in your particular design may result in different numbers for you.
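The per-server totals are simple multiplication, and it can help to keep them in a small script so they are easy to redo when the design inputs change. The following is only a minimal sketch: the function and constant names are made up for illustration, and the values are the ones quoted for this customer, not general sizing guidance.

```python
# Figures quoted in the article for this customer's design (not general guidance).
DATABASES_PER_SERVER = 20
DB_SIZE_TB = 1.5
TRANSACTIONAL_IOPS_PER_DB = 50


def per_server_totals(databases, db_size_tb, iops_per_db):
    """Return (total transactional IOPS, total capacity in TB) for one DAG server."""
    return databases * iops_per_db, databases * db_size_tb


total_iops, total_tb = per_server_totals(
    DATABASES_PER_SERVER, DB_SIZE_TB, TRANSACTIONAL_IOPS_PER_DB
)
print(f"{total_iops} transactional IOPS and {total_tb} TB per DAG server")
```

Note that this covers only the transactional (user-driven) I/O and the database capacity; it says nothing yet about background activity on the disks.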
What this customer did:
- They ordered 2 disk shelves per DAG server, each disk shelf had 14 disks and each disk was a 2TB 7,200 RPM SATA drive. This was a SAN. They figured that it should be more than enough for these mailboxes in terms of I/O and capacity.
Unfortunately, this customer made a mistake and their configuration failed the Jetstress test.
It was at this point that they first came to me, asking for help figuring out what was wrong with the Jetstress tool. (Interestingly, they thought the tool was broken, and not the storage configuration.)
So after breathing in and out a few times (I do this at least once a week and it can get a bit tiring) I explained to them many things about Exchange 2010, and focused on where they went astray in their project. The parts that are most relevant to this article are:
- Firstly, they forgot about BDM (Background Database Maintenance), which runs 24/7 and has a throughput of roughly 5 MB/sec for each DB that is configured.
- Secondly, they didn’t configure a crucial storage setting (especially for Exchange 2010): the stripe size. This is what makes sure that the new I/O pattern of larger, sequential I/Os is handled correctly by the storage. In this case, they had 20 DBs, and since BDM produces about 5 MB/sec per DB, the total is about 100 MB/sec per DAG server. This throughput is delivered in I/O chunks of 256KB, which in our case meant about 400 IOPS just for BDM, before any actual user I/O. Their stripe size of 4KB meant that each of these I/Os was broken down and serviced by 64 smaller I/Os. So in order to achieve these 400 non-transactional, large, sequential I/Os (which JBOD handles very well, I may add), they would actually need to plan for 25,600 IOPS just for BDM! Certainly not something the customer planned for when they thought about an Exchange 2010 that can run on JBOD.
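The BDM arithmetic from the second point can be sketched in a few lines. This is a rough model under a simplifying assumption taken from the story itself: every 256KB BDM I/O is split into (I/O size / stripe size) smaller back-end I/Os. Real arrays may cache, coalesce, or read ahead, so treat the result as an order-of-magnitude estimate. The function and constant names are hypothetical.

```python
import math

# Figures quoted in the article.
DATABASES_PER_SERVER = 20
BDM_MB_PER_SEC_PER_DB = 5
BDM_IO_SIZE_KB = 256


def bdm_backend_iops(databases, mb_per_sec_per_db, io_size_kb, stripe_size_kb):
    """Estimate (logical BDM IOPS, split factor, back-end IOPS) for one server.

    Assumption: each logical I/O is serviced by ceil(io_size / stripe_size)
    smaller I/Os on the storage back end.
    """
    throughput_kb_per_sec = databases * mb_per_sec_per_db * 1024
    logical_iops = throughput_kb_per_sec / io_size_kb
    split_factor = max(1, math.ceil(io_size_kb / stripe_size_kb))
    return logical_iops, split_factor, logical_iops * split_factor


# Compare the customer's 4KB stripe size with the recommended 256KB.
for stripe_kb in (4, 256):
    logical, split, backend = bdm_backend_iops(
        DATABASES_PER_SERVER, BDM_MB_PER_SEC_PER_DB, BDM_IO_SIZE_KB, stripe_kb
    )
    print(f"stripe {stripe_kb}KB: {logical:.0f} BDM I/Os x {split} = {backend:.0f} back-end IOPS")
```

With a 4KB stripe the sketch gives the 25,600 IOPS from the story; with a 256KB stripe each BDM I/O maps to a single back-end I/O, so the BDM load stays at about 400 IOPS on top of the roughly 1,000 transactional IOPS.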
Had they changed the stripe size to the recommended value of 256KB, they would probably have passed the Jetstress test. But the specific vendor they selected didn’t offer the option to change the stripe size.
After some discussions and time, this customer tried running Jetstress with a similar set of disks (28 2TB 7,200 RPM SATA disks) connected directly to the server in a DAS configuration (instead of the SAN they had), and after configuring the stripe size, the Jetstress test passed.
Bottom-Line: Don’t Ignore Disk Performance Considerations
Just because the amount of disk I/O required in Exchange 2010 is lower than in previous versions doesn’t mean that disk performance considerations and configuration can be taken for granted. The I/O profile in Exchange 2010 is significantly different than in all previous versions of Exchange, with the purpose of being able to run successfully on very low cost storage solutions, such as DAS with RAID and even without RAID (= JBOD).
If you’re not already, you should become familiar with the storage best practices for Exchange 2010 (search for the keywords: “Best practice” to see what is really recommended) and always make sure you check the performance of your new Exchange infrastructure with Jetstress before going into production.