How Microsoft Uses File Classification Infrastructure

Quite a while ago, I blogged about the File Classification Infrastructure in Windows Server 2008 R2.

In my opinion, this is an interesting tool, and it is built into your server platform.

We have now published a paper about how we use the File Classification Infrastructure to protect personally identifiable information (PII). It is an interesting read: Microsoft IT Uses File Classification Infrastructure to Help Secure Personally Identifiable Information

Here is the summary:

In today's high-tech world, collecting and storing data are business-critical processes that form an integral component of daily operations. However, the ever-increasing dependency on and use of electronic data also make data management more challenging—especially in light of government regulations for the appropriate use and storage of personally identifiable information (PII) and financial information. Improper storage of PII can also be a significant financial concern, as the cost of storage-related security breaches can be hundreds of dollars per record.

Microsoft Information Technology (IT) had been using an internally built solution to help secure personally identifiable information (PII), financial information, and other types of sensitive data by classifying internal file shares and Microsoft® SharePoint® sites. However, this solution was limited to defining information sensitivity at a file-share level. It also required each user to specify the sensitivity level of his or her file shares manually, which frequently led to mislabeled information.

This custom, internally developed solution also had a high total cost of ownership, requiring a significant amount of development and maintenance resources to fix identified issues and keep the system up to date, as each upgrade to the storage operating systems required upgrading the code.

Microsoft IT needed a solution that would bring consistency to the file classification process across all teams and could scan content automatically at the file level for keywords, terms, and patterns. It then had to apply the correct rights management protection based upon predefined security policies. Cost of ownership and performance were also important drivers for developing a new solution. Microsoft IT needed a system built from off-the-shelf, standardized Microsoft technology that could scale across terabytes of data. With such a large amount of information, the solution had to be efficient at scanning files while maintaining a high degree of accuracy when identifying sensitive PII.
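
To make the idea of scanning file content for keywords and patterns and then picking a protection policy a bit more concrete, here is a minimal sketch in Python. This is not how the File Classification Infrastructure is implemented or configured, and it is not taken from the paper. The rule names, the regular expressions, the sensitivity levels, and the policy labels are all hypothetical and only illustrate the general approach.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A hypothetical classification rule: a pattern plus a sensitivity level."""
    name: str
    pattern: re.Pattern
    sensitivity: int  # higher means more sensitive


# Illustrative rules only; real PII detection needs far more careful patterns.
RULES = [
    Rule("ssn_like", re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), 3),
    Rule("card_like", re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"), 3),
    Rule("keyword_confidential", re.compile(r"\bconfidential\b", re.IGNORECASE), 2),
]

# Hypothetical mapping from sensitivity level to a protection policy label.
POLICY_BY_SENSITIVITY = {
    0: "none",
    2: "restrict-to-team",
    3: "encrypt-and-restrict",
}


def classify(text):
    """Return (highest sensitivity level, sorted names of the rules that matched)."""
    matched = [rule for rule in RULES if rule.pattern.search(text)]
    level = max((rule.sensitivity for rule in matched), default=0)
    return level, sorted(rule.name for rule in matched)


def policy_for(text):
    """Pick the policy label for the highest sensitivity found in the text."""
    level, _ = classify(text)
    return POLICY_BY_SENSITIVITY[level]


if __name__ == "__main__":
    for sample in ("Lunch menu for Friday",
                   "This memo is Confidential",
                   "SSN: 123-45-6789"):
        print(sample, "->", policy_for(sample))
```

The point of the sketch is the separation of concerns: classification rules decide what a file contains, and a separate policy table decides what to do about it, so either can change without touching the other.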

Roger