Why messaging should be a tier-one service

Have you ever had your phone system go down in your workplace?  Did it affect your work?  Most of us would agree that having a phone available at all times is a requirement for almost all white-collar jobs.  That's why companies make sure that their phone systems are considered tier-one services.

How many times do you read or send an email each day?  I would bet that the majority of us use Outlook more often than we use the phone.  Yet in many companies, IT management has not seen the need to put resources into making sure that the messaging infrastructure is up 99.9% of the time.  I get the opportunity to see many different companies' environments and I am constantly surprised at how little hardware Exchange is running on.  As my friend Bruce McKinstry is fond of saying: “My XBOX is more powerful than your Exchange server!”  It is sad that a gaming console used by a maximum of 4 people is more powerful than the servers some customers run their businesses on, with 600+ mailboxes on them.  Sometimes I think that Microsoft should offer hardware boxes that have Exchange 2003 configured on them so that much of the guesswork is taken out of deployment.  We did it with XBOX...  Hmmm.
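
To put the 99.9% figure in perspective, here is a minimal sketch of the arithmetic.  It assumes a 365-day year and counts every minute of the year as time the service should be up; the function name is just for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the minutes of downtime per year allowed at a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# Compare a few common availability targets.
for target in (99.0, 99.9, 99.99):
    hours = downtime_budget_minutes(target) / 60
    print(f"{target}% uptime allows about {hours:.1f} hours of downtime per year")
```

At 99.9%, the budget works out to a little under nine hours of downtime for the whole year, planned maintenance included if you count it that way.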

Ok, what is required to make sure that your Exchange servers are up to the task of running your business?

  • Clustering:  You can't imagine how much better Microsoft has gotten in this area in Exchange 2003 compared to Exchange 2000.  I used to ask customers not to use clustering because their downtime might actually have been higher with Exchange 2000 if they did use it.  (I am a Support Consultant and one of the key tenets of Microsoft is to be honest...)  But this has changed since Exchange 2003.  Many of my customers are going with clusters now and I am fully behind that choice.
  • Monitoring:  Years ago, before I came to Microsoft, I was a consultant for one of the largest Exchange organizations in the world, and they were looking to review many of the monitoring systems out there.  We looked at HP, NetIQ, and others.  Eventually a decision was made and an excellent product was purchased.  Years later, I wasn't surprised to hear that the monitoring system hadn't been fully configured yet.  That is why I love MOM.  (Sounds like a tattoo...)  It is configured out of the box.  I don't care which monitoring solution you purchase, but just make sure that you use it, ok?
  • Backup and Restore:  A lot of companies have spent considerable resources on backup hardware, yet never run fire drills to test the backups.  I am never surprised when the restore fails.  I once had a customer that had a database so large it had to be spanned across 2 tapes.  That customer would remove one tape each night and replace it with a blank one.  Since they never tested the restores, they didn't find out until it was too late that they had only half of the backup since they should have been replacing BOTH tapes each day.  Fire drills also help make sure that your processes are in place when a true disaster actually happens.
  • Training:  You can't have too much of this.  I've just never heard any IT operations manager say: “You know, I wish my Exchange administrators knew less about the product.”  The more you learn, the more likely you will be able to tweak the system to perform better and the more likely you will be able to diagnose a possible issue before it occurs.
  • Testing Lab:  I once worked for a customer that had an exact replica of their production environment in their test lab.  I mean all the way down to the redundant ATM network.  OK, they were military and had a huge budget, but my point is that they never rolled out a change on their production network without testing its consequences thoroughly in the test lab.  In the years I was with them, they rarely had any issues on their production systems.  Your lab doesn't have to be that extreme, but at least test changes in a small test lab before rolling them out.  Please?
  • Security, UCE (SPAM or SPAMM), and anti-virus:  I put all of these together because they are among the major causes of systems being down.  Your servers should be secured according to the directions found at the following location: https://www.microsoft.com/technet/prodtechnol/exchange/2003/library/exmessec.mspx.  Actually, pretty much all that could be said here can be found at https://www.microsoft.com/exchange/techinfo/security/default.asp.  Go read it.
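
The two-tape story under Backup and Restore is the kind of mistake that even a trivial check would catch.  Here is a rough sketch, assuming you keep a list of the tapes a spanned backup uses and a list of the tapes that were actually swapped for blanks; the tape labels and the function name are made up for illustration and are not part of any backup product.

```python
def missing_tapes(backup_set, replaced_today):
    """Return the tapes in the backup set that were not swapped for blanks today."""
    return sorted(set(backup_set) - set(replaced_today))


# Hypothetical labels for a backup that spans two tapes.
backup_set = ["TAPE-1", "TAPE-2"]
replaced_today = ["TAPE-1"]  # the operator only swapped one tape

left_behind = missing_tapes(backup_set, replaced_today)
if left_behind:
    print("Backup set incomplete, still in the drive:", ", ".join(left_behind))
```

A check like this is no substitute for a real restore test.  It only confirms that every tape in the set was rotated, not that the data on the tapes can be restored.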

So the list above will help you get your systems in top-notch shape.  It is by no means a complete list and I am sure that I have forgotten something important, but perhaps I will mention more later.  Gotta get back to work...