Tag-inator 3: Rise of the Machines

Can a machine categorize or tag email any better than a person?

In our previous post, we explored how to make it easy for users to categorize an email inside of Outlook. But what if the user still isn't doing it, in spite of how easy we've made it? Or what if the user has good intentions, but makes an honest mistake? Is there any way to automate this?

Yes, we can automate this with Exchange 2007 Transport Rules. Do they do a "better" job than a human? Let's take a look.

First off, here's a nice summary of the "Transport Rules" feature on the MS Exchange Team blog. It's useful for many things besides message classification: https://msexchangeteam.com/archive/2006/12/12/431879.aspx

Next, let's take a sample email, and see how a transport rule would work.

FROM: Legal Department

TO: Executive Staff

SUBJECT: Courtroom strategy for Them v. Us

Please do not bring your BlackBerry into the courtroom on days when you are testifying. It makes us look bad, and if you keep looking at your BlackBerry while giving answers, the opposing counsel may want to see it, too.

Transport rules can use many fields as inputs (also called 'conditions' or 'predicates') when deciding whether or not to classify an email. For example, in the sample email above, the transport rules engine can consider the fact that the sender is a member of the Legal group and/or that the recipient is a member of the Executive staff. The appearance of the phrases "Them v. Us" and "courtroom strategy" in the subject or body of the message can also be added to the evaluation.
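To make the idea of conditions concrete, here's a minimal sketch in Python of how a rule might weigh the sample email. This is not how Exchange implements Transport Rules; the `Message` class, the `looks_privileged` function, and the choice to require a case phrase plus either group membership are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical stand-in for an email seen by the rules engine."""
    sender_groups: set
    recipient_groups: set
    subject: str
    body: str


# Phrases taken from the sample email; matched case-insensitively.
CASE_PHRASES = ("them v. us", "courtroom strategy")


def looks_privileged(msg):
    """Assumed rule: a case phrase appears AND Legal or Executives are involved."""
    text = f"{msg.subject}\n{msg.body}".lower()
    from_legal = "Legal Department" in msg.sender_groups
    to_execs = "Executive Staff" in msg.recipient_groups
    mentions_case = any(phrase in text for phrase in CASE_PHRASES)
    return mentions_case and (from_legal or to_execs)


sample = Message(
    sender_groups={"Legal Department"},
    recipient_groups={"Executive Staff"},
    subject="Courtroom strategy for Them v. Us",
    body="Please do not bring your BlackBerry into the courtroom.",
)
lunch = Message({"Sales"}, {"Sales"}, "Lunch", "Pizza on Friday?")
```

With these made-up inputs, `looks_privileged(sample)` flags the legal email and `looks_privileged(lunch)` does not. In Exchange itself you would pick the equivalent conditions from the list linked below rather than write code.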

Here's a complete list of conditions/predicates that Transport Rules can use to classify an email: https://technet.microsoft.com/en-us/library/aa995960.aspx

Is it a guarantee that the sample message above is attorney-client privileged? No, but it's likely. And knowing that is valuable, especially when it can be calculated for free.

Transport Rules bring three benefits to message classification:

1) They’re always working. They don't get tired, or forgetful, or confused. They don’t get in a hurry on Friday afternoon. They always fire, no matter what.

2) They reduce the search scope. You're still going to have lawyers review emails as part of the e-discovery process, but anything you can do to reduce the search scope will save you money. If an expert is double-checking the 5% of messages that the machine thinks might be privileged, you’ve just made your search problem 20 times smaller.

3) Classification happens immediately when the message is sent; there's no waiting around for a skilled person to review it. Suppose, in our example, two employees who are not members of the legal team are discussing "Them v. Us." Their conversation might be privileged without their knowing it. Or it might be forbidden by company policy. Or policy might say that no email discussing the case can be sent outside the company. Transport Rules can help with all of that.

Transport Rules can do more than just classify.

Expanding upon point 3 above: Transport Rules can also suppress (not deliver) certain kinds of email. There's a whole host of actions that can be taken based upon the message's contents, including:

  • Notifying managers,
  • Logging the message in a special archive,
  • Altering the contents by appending a disclaimer,
  • Etc.

Here's a complete list of actions that can be taken: https://technet.microsoft.com/en-us/library/aa998315.aspx
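As a rough mental model, a rule pairs a condition with one or more actions. The sketch below is plain Python, not Exchange configuration; the `apply_rule` function, the action names, and the policy it encodes (block outbound mail about the case, add a disclaimer to internal mail) are hypothetical choices for illustration only.

```python
def apply_rule(text, outbound):
    """Return the (hypothetical) actions a rule would take for one message."""
    actions = []
    # Condition: the message mentions the case by name.
    if "them v. us" in text.lower():
        actions.append("log to archive")
        actions.append("notify manager")
        if outbound:
            # Assumed policy: case discussion must not leave the company.
            actions.append("do not deliver")
        else:
            actions.append("append disclaimer")
    return actions
```

Calling `apply_rule("Update on Them v. Us", outbound=True)` returns the archive, notify, and do-not-deliver actions, while a message that never mentions the case gets an empty list.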

Regular Expressions in Transport Rules

People sometimes ask me, "What kinds of patterns can I use? Can I trap social security numbers (nnn-nn-nnnn) or credit card numbers (nnnn-nnnn-nnnn-nnnn) in emails?"

The answer is yes, but it gets pretty geeky pretty fast. Exchange uses a technology called Regular Expressions. These are pretty common in the world of programming (and Unix administration), but they are not for the faint of heart. Here's a quick primer on what they are: https://en.wikipedia.org/wiki/Regular_expressions

What is so great about this feature is that it has all the power and flexibility of serious computer programming, but the features are exposed right in the Exchange Management Console. All you have to do is type your expression into a dialog box; there's no scripting or programming required to use the feature.

Here are specifics for adding Regular Expressions to your Exchange Transport Rules: https://technet.microsoft.com/en-us/library/aa997187.aspx
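To give a feel for what such patterns look like, here's a small sketch using Python's `re` module for the two formats mentioned above (nnn-nn-nnnn and nnnn-nnnn-nnnn-nnnn). Note the assumptions: this is Python's regular expression syntax, which may differ from the pattern syntax Exchange accepts, and the names `SSN`, `CARD`, and `find_sensitive` are made up for this example. Check the link above before typing anything into a real rule.

```python
import re

# \d{3} means "exactly three digits"; \b keeps the match from
# starting or ending in the middle of a longer run of digits.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD = re.compile(r"\b\d{4}-\d{4}-\d{4}-\d{4}\b")


def find_sensitive(text):
    """Return every SSN-shaped and card-shaped string found in text."""
    return {"ssn": SSN.findall(text), "card": CARD.findall(text)}


hits = find_sensitive("My SSN is 123-45-6789 and my card is 1111-2222-3333-4444.")
```

Keep in mind that these patterns only match the shape of the number. Any nine digits grouped 3-2-4 will trip the first one, which is one reason a human reviewer still belongs in the loop.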

In conclusion, an existing out-of-the-box Exchange feature can automatically categorize email, with varying degrees of precision depending upon how complex a rule you want to write. If you're concerned about end-user compliance with tagging schemes, or just want an extra layer of security and common sense around your message handling, Transport Rules are for you.