Since Exchange 2010 launched, I have had many discussions with customers about archiving technology that removes data from Exchange and replaces it with a message stub. This type of technology is fairly common throughout the messaging world and was pretty much mandatory back with Exchange 2003, when storage was complex and expensive.
The Microsoft recommendation for Exchange Server 2010 is to use large mailboxes (10GB+) and to store this data on low-cost storage. The I/O improvements in store.exe for Exchange Server 2010 allow the use of quite low-end disk spindles, and even storage solutions without RAID, provided enough database availability group copies are deployed. This is obviously a fairly significant shift from the days of Exchange Server 2003, where we did not even have a robust HA technology, never mind data availability built into Exchange!
My experience is that most customers buy into the large mailbox and cheap storage message pretty quickly and design their Exchange solutions accordingly. Things get less straightforward, however, for customers with a significant and potentially quite recent investment in archiving technology. These customers want to move to Exchange Server 2010 and bring their archiving technology with them…
The problem here is that Exchange Server 2010 received the largest change to the ESE store schema since Exchange 5.5. These changes encourage fewer, larger I/Os rather than many smaller I/Os, which makes deployment on SATA spindles viable. One of the many ways this was achieved was to increase the database page size to 32KB, up from 8KB in Exchange 2007 and 4KB in Exchange 2003. This change in database page size also changed the behaviour of archive stubbing technology: chiefly, the archiving process fails to recover the expected amount of white space after removing data from the mailbox databases. James Carroll has written a great blog post explaining this issue here:
- Exchange 2010 Database Page Fragmentation Caused By Archive Solutions
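To get a rough feel for why a larger page size can change how much white space stubbing recovers, here is a minimal sketch in Python. It is a toy model only and does not reflect how ESE actually lays out data: it assumes items sit back to back in fixed-size pages, that stubbing leaves a 1KB stub at the start of each item, and that space only counts as recovered when a whole page becomes empty. The function name `reclaimable_pages` and all of the sizes are my own illustrative assumptions; the linked post covers the real mechanism.

```python
def reclaimable_pages(item_sizes_kb, page_kb, stub_kb=1):
    """Toy model (NOT the real ESE layout): count whole pages emptied by stubbing.

    Items are assumed to be stored contiguously, one after another. Stubbing
    replaces each item with a stub of stub_kb at the item's original start.
    Only pages that fall entirely inside a freed range count as reclaimable.
    Returns (whole_pages_freed, total_kb_removed).
    """
    pages_freed = 0
    kb_removed = 0
    offset = 0
    for size in item_sizes_kb:
        freed_start = offset + min(stub_kb, size)  # the stub stays behind
        freed_end = offset + size
        offset = freed_end
        kb_removed += freed_end - freed_start
        first_whole_page = -(-freed_start // page_kb)  # ceiling division
        last_whole_page = freed_end // page_kb         # floor division
        pages_freed += max(0, last_whole_page - first_whole_page)
    return pages_freed, kb_removed


if __name__ == "__main__":
    messages = [20] * 10  # ten hypothetical 20KB messages
    for version, page_kb in [("Exchange 2003", 4),
                             ("Exchange 2007", 8),
                             ("Exchange 2010", 32)]:
        pages, removed = reclaimable_pages(messages, page_kb)
        print(f"{version}: removed {removed}KB, "
              f"whole pages freed = {pages} ({pages * page_kb}KB)")
```

In this toy model the same stubbing run removes the same amount of data at every page size, but the number of completely empty pages it leaves behind drops sharply as the page size grows, because small freed ranges no longer span a full page.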
So… where does this leave us? The archiving software vendors are looking into this issue and are trying to resolve it. I would assume that they will simply set a larger minimum message size threshold before stubbing a message. However, it does raise the question of whether this practice of removing data from Exchange is actually adding any significant benefit, or just adding complexity and cost.
Way back when I was responsible for running a messaging service, I liked archiving technology because it reduced my storage costs and improved my search capability. Sure, it was seriously expensive software and was never exactly reliable, but it served a purpose, and I was happy to pay both the monetary and operational costs since I calculated it was still better than trying to scale my messaging system to cope with the storage demands I had at the time. Given the changes in Exchange Server 2007, and even more so in Exchange Server 2010, I don't think the same approach remains valid. My storage is much cheaper and I no longer require additional indexing for client search. I also have legal hold and online archive to keep my internal auditors happy. Where there are significant compliance requirements, my view is that a third-party journal solution would still be required to capture envelope data in Exchange Server 2010. However, since journaling does not make changes to the Exchange databases, it does not suffer from the same database page fragmentation that message stubbing does. Even more importantly, the journal data is kept separately, so my compliance data is safely secured and my Exchange data is kept in Exchange, where end users can access it from whichever devices they like.
Should you use archiving technology with Exchange Server 2010? Well, as with lots of things in the technology game, it's your choice. I would, however, urge caution before just carrying over your archiving technology from previous versions of Exchange to Exchange Server 2010.