Easy Configuration of the Azure Information Protection Scanner


NOTE: This content has been moved to https://aka.ms/AIPBlog and will be maintained/updated at that location moving forward. It is recommended that you use that location for AIP Scanner installs.

The Scenario:

The EU General Data Protection Regulation (GDPR) took effect on May 25, 2018 and marks a significant change to the regulatory landscape of data privacy.  The aim of the GDPR is to protect all EU citizens from privacy and data breaches in an increasingly data-driven world.  Organizations in breach of GDPR can be fined up to 4% of annual global turnover or €20 million (whichever is greater).  Needless to say, this has motivated organizations worldwide to better classify and protect sensitive personal data.  One way to accomplish this is to classify and protect sensitive content using Azure Information Protection.

Azure Information Protection allows data workers to classify and optionally protect documents as they are created.  There are also options for automatically classifying/protecting emails as they are sent through your Exchange server or Exchange Online, and SharePoint Online can be protected using Microsoft Cloud App Security AIP integration.  These options go a long way to protect newly created data and data migrated to the cloud, but what about the terabytes of data sitting on File Shares and On-Premises SharePoint 2013/2016 servers? That is where the AIP Scanner comes in.

The Solution:

The Azure Information Protection Scanner is the solution for classifying and protecting documents stored on File Shares and On-Premises SharePoint servers. The overview below is from the official documentation at https://docs.microsoft.com/en-us/information-protection/deploy-use/deploy-aip-scanner.  This blog post is meant to assist customers with deploying the AIP Scanner, but if there is ever a conflict, the official documentation is authoritative.

Azure Information Protection scanner overview

The AIP Scanner runs as a service on Windows Server and lets you discover, classify, and protect files on the following data stores:

  • Local folders on the Windows Server computer that runs the scanner.
  • UNC paths for network shares that use the Common Internet File System (CIFS) protocol.
  • Sites and libraries for SharePoint Server 2016 and SharePoint Server 2013.

The scanner can inspect any files that Windows can index, by using iFilters that are installed on the computer. Then, to determine if the files need labeling, the scanner uses the Office 365 built-in data loss prevention (DLP) sensitivity information types and pattern detection, or Office 365 regex patterns. Because the scanner uses the Azure Information Protection client, it can classify and protect the same file types.

You can run the scanner in discovery mode only, where you use the reports to check what would happen if the files were labeled. Or, you can run the scanner to automatically apply the labels.

Note that the scanner does not discover and label in real time. It systematically crawls through files on data stores that you specify, and you can configure this cycle to run once, or repeatedly.

Prerequisites:

To install the AIP Scanner in a production environment, the following items are needed:

  • A Windows Server 2012 R2 or 2016 Server to run the service
    • Minimum 4 CPUs and 4 GB RAM, physical or virtual
    • Internet connectivity necessary for Azure Information Protection
  • A SQL Server 2012+ local or remote instance (any edition from Express up is supported)
    • Sysadmin role needed to install scanner service (user running Install-AIPScanner, not the service account)
    • If using SQL Server Express, the SQL Instance name is ServerName\SQLExpress
  • Service account created in On Premises AD and synchronized with Azure AD (I will call this account AIPScanner in this document)
    • Service requires Log on locally right and Log on as a service right (the second will be given during scanner service install)
    • Service account requires Read permissions to each repository for discovery and Read/Write permissions for classification/protection
  • AzInfoProtection.exe available on the Microsoft Download Center (The scanner bits are included with the AIP Client)
  • Labels configured for Automatic Classification/Protection
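The SQL instance naming in the list above is a common stumbling block, so here is a minimal sketch of the rule: a default instance of full SQL Server is addressed by the server name alone, while SQL Server Express or any other named instance is ServerName\InstanceName. The function Get-ScannerSqlInstance and the server name SQL01 are hypothetical, made up for illustration; they are not part of the AIP client.

```powershell
# Hypothetical helper (not part of the AIP module): builds the value to
# enter when Install-AIPScanner prompts for SqlServerInstance.
function Get-ScannerSqlInstance {
    param(
        [Parameter(Mandatory)] [string] $ServerName,
        # Leave empty for a default instance of full SQL Server.
        [string] $InstanceName = ''
    )
    if ([string]::IsNullOrWhiteSpace($InstanceName)) {
        return $ServerName               # default instance: server name only
    }
    return "$ServerName\$InstanceName"   # SQL Server Express or a named instance
}

Get-ScannerSqlInstance -ServerName 'SQL01'                             # SQL01
Get-ScannerSqlInstance -ServerName 'SQL01' -InstanceName 'SQLExpress'  # SQL01\SQLExpress
```

Note that a default instance is never written as ServerName\MSSQLSERVER; the server name on its own is what the installer expects in that case.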

Installation:

Here is where the Easy part from the title gets started.  Installation of the AIP Scanner service is incredibly simple and straightforward.

  1. Log onto the server where you will install the AIP Scanner service using an account that is a local administrator of the server and has permission to write to the SQL Server master database.
  2. Run AzInfoProtection.exe on the server and step through the client install (this also drops the AIP Scanner bits)
  3. Next, right-click the Windows button in the lower left-hand corner and click on Command Prompt (Admin)
  4. Type PowerShell and hit Enter
  5. At the PowerShell prompt, type the following command and press Enter:
    Install-AIPScanner
  6. When prompted, provide the credentials for the scanner service account (YourDomain\AIPScanner) and password
  7. When prompted for SqlServerInstance, enter the name of your SQL Server and press Enter
    You should see a success message when the installation completes
  8. Right-click the Windows button in the lower left-hand corner and click on Run
  9. In the Run dialog, type services.msc and click OK
  10. In the Services console, double-click on the Azure Information Protection Scanner service
  11. On the Log On tab of the Azure Information Protection Scanner Service Properties, verify that Log on as: is set to the YourDomain\AIPScanner service account
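If you would rather stay in PowerShell for the last few steps, the check below is one way to read the same Log On setting. It is only a sketch: it assumes the service name is AIPScanner (the name used with Start-Service later in this post). Win32_Service and its StartName property are standard Windows, not AIP-specific.

```powershell
# Look up the scanner service; the name 'AIPScanner' is assumed from the
# Start-Service command used later in this post.
$service = Get-CimInstance -ClassName Win32_Service -Filter "Name='AIPScanner'"

if ($null -eq $service) {
    Write-Warning 'AIPScanner service not found; did Install-AIPScanner succeed?'
} else {
    # StartName holds the "Log on as" account shown on the Log On tab.
    $service | Select-Object Name, DisplayName, StartName, State
}
```

StartName should show the YourDomain\AIPScanner service account you supplied during installation.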

See, told you it was easy to install.  Luckily, configuring the service is only slightly more challenging. 🙂

Scanner Configuration:

OK, this next part is not super simple but it isn't terrible either as long as you don't miss anything.  Luckily, you can follow my steps to make it as easy as possible.

Authentication Token:

  1. On the server where you installed the scanner, create a new text document on the desktop and name it Set-AIPAuthentication.txt
    • In this document, paste the line of PowerShell code below and save
      Set-AIPAuthentication -webAppId <ID of the "Web app / API" application> -webAppKey <key value generated in the "Web app / API" application> -nativeAppId <ID of the "Native" application >
  2. Open Internet Explorer and browse to https://portal.azure.com
  3. At the Sign in to Microsoft Azure page, enter your tenant admin credentials
  4. In the Microsoft Azure portal, click on Azure Active Directory in the left-hand pane
  5. Under Manage, click on App registrations
  6. In the App registrations blade, click the + New application registration button
  7. In the Create blade, use the values in the table below to create the registration
    Name AIPOnBehalfOf
    Application type Web app / API
    Sign-on URL http://localhost

  8. Click the Create button to complete the app registration
  9. In the AIPOnBehalfOf blade, hover the mouse over the Application ID and click on the Click to copy icon when it appears
  10. Minimize (DO NOT CLOSE) Internet Explorer and other windows to show the desktop
  11. On the desktop, return to Set-AIPAuthentication.txt, replace <ID of the "Web app / API" application> with the copied Application ID value, and save

    WARNING: Ensure there is only a single space after the Application ID before -webAppKey
  12. Return to the browser and click on the Settings button
  13. In the Settings blade, under API ACCESS, click on Keys
  14. In the Keys blade, add a new key by typing AIPClient in the Key description field and your choice of duration (1 year, 2 years, or never expires)
  15. Select Save and copy the Value that is displayed
    WARNING: Do not dismiss this screen until you have saved the value as you cannot retrieve it later
  16. Go back to the txt document, replace <key value generated in the "Web app / API" application> with the copied key value, and save

    WARNING: Ensure there is only a single space after the Application Key before -nativeAppId
  17. In the Microsoft Azure portal, click on Azure Active Directory in the left-hand pane
  18. Under Manage, click on App registrations
  19. In the App registrations blade, click the + New application registration button
  20. In the Create blade, use the values in the table below to create the registration
    Name AIPClient
    Application type Native Application
    Sign-on URL http://localhost

  21. Click the Create button to complete the app registration
  22. In the AIPClient blade, hover the mouse over the Application ID and click on the Click to copy icon when it appears
  23. Replace <ID of the "Native" application > in the Set-AIPAuthentication.txt document with the copied Application ID value and save
  24. Return to the browser and in the AIPClient blade, click on Settings
  25. In the Settings blade, under API ACCESS, select Required permissions
  26. On the Required permissions blade, click Add, and then click Select an API


    NOTE: It may take a few moments for each of these blades to load
  27. In the search box, type AIPO and click on AIPOnBehalfOf, and then click the Select button
  28. On the Enable Access blade, check the box next to AIPOnBehalfOf, and then click the Select button

  29. Click Done
  30. Return to the PowerShell window and paste the completed command from Set-AIPAuthentication.txt and press Enter
  31. When prompted, enter the user AIPScanner@yourdomain.onmicrosoft.com and the password
    NOTE: Replace yourdomain with your tenant name
  32. When the prompt appears, click Accept
  33. Once complete, a confirmation message appears in the PowerShell window
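The warnings above about stray spaces can also be sidestepped by passing the three values through a hashtable instead of editing one long line. This is a minimal sketch, not the documented procedure: Test-AuthValue is a hypothetical helper written for this example, and the only parameter names used are the three shown in the Set-AIPAuthentication command above.

```powershell
# Hypothetical helper: a usable value has no whitespace and is not a
# leftover <placeholder> from the template.
function Test-AuthValue {
    param([string] $Value)
    return (($Value -match '^\S+$') -and ($Value -notmatch '^<.*>$'))
}

# Paste the three values copied from the Azure portal between the quotes.
$authParams = @{
    WebAppId    = '<ID of the "Web app / API" application>'
    WebAppKey   = '<key value generated in the "Web app / API" application>'
    NativeAppId = '<ID of the "Native" application>'
}

# Collect any entries that still hold a placeholder or contain whitespace.
$invalid = @($authParams.GetEnumerator() |
    Where-Object { -not (Test-AuthValue $_.Value) })

if ($invalid.Count -gt 0) {
    $names = ($invalid | ForEach-Object { $_.Key }) -join ', '
    Write-Warning "Fix these values first: $names"
} else {
    # Splatting passes each hashtable entry as a named parameter.
    Set-AIPAuthentication @authParams
}
```

Run as written, the placeholders fail the check and you only get a warning; once all three real values are pasted in, the command is invoked with each one as a separate, cleanly delimited argument.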

About Policies:

Now that the scanner is happy and fully authenticated, we should discuss what you want to do with the AIP Scanner.  We know that you want to use it to scan file shares and SharePoint sites, but some discussion needs to be had about how the scanner locates data and what the scanner will do once it finds it.  This may be a no brainer to some so feel free to skip this and move on to the next section if you like.

AIP Policies are made up of Labels and Sub-labels that allow you to classify and optionally protect data.  You can assign conditions to labels using the standard Office 365 DLP templates and have those conditions be recommended or automatic.  For the AIP Scanner to classify documents, you must set these conditions to be Automatic.  This allows the AIP Scanner to protect content without the need for user input.  This is a content-based approach: labels are assigned to content based on the conditions defined in each label.  If you want all of the documents in your repositories to be classified, you can use the default label setting in the portal, and the AIP Scanner will assign that label to any content that does not meet any other automatic criteria. This is done in the Global policy blade, under the Configure settings to display and apply on Information Protection end users section.

For more in-depth information about configuring policies, you can see the official documentation at https://docs.microsoft.com/en-us/information-protection/deploy-use/configure-policy-classification

Configuring Repositories:

Finally, it is time to put the AIP Scanner to work scanning repositories.  These can be on-premises SharePoint 2013 or 2016 document libraries or lists and any accessible CIFS-based share.  Keep in mind that in order to do discovery, classification, and protection, the scanner service pulls the documents to the server, so placing the scanner server on the same LAN as your repositories is recommended. You can deploy as many servers as you like in your domain, so putting one at each major site is probably a good idea.

  1. To add a file share repository, open a PowerShell window and run the command below
    Add-AIPScannerRepository -Path \\fileserver\documents
  2. To add a SharePoint 2013/2016 document library run the command below
    Add-AIPScannerRepository -Path http://sharepoint/documents
  3. To verify the repositories that are configured, run the command below
    Get-AIPScannerRepository

  4. Run the command below to run an initial discovery cycle
    Set-AIPScannerConfiguration -Schedule OneTime

    NOTE: Although the scanner will discover documents to protect, it will not protect them as the default configuration for the scanner is Discover only mode
  5. Start the AIP Scanner service using the command below
    Start-Service AIPScanner
  6. Right-click the Windows button in the lower left-hand corner and click on Event Viewer
  7. Expand Application and Services Logs and click on Azure Information Protection
  8. When the scanner completes the cycle, an event is logged here

    NOTE: You may also browse to %localappdata%\Microsoft\MSIP\Scanner\Reports and review the summary txt and detailed csv files available there
  9. At the PowerShell prompt type the command below to enforce protection and have the scanner run once
    Set-AIPScannerConfiguration -Enforce On -Schedule OneTime -Type Full

    NOTE: After testing, you would use the same command with the -Schedule Continuous parameter to have the AIP Scanner run continuously
    NOTE: The -Type Full parameter forces the scanner to review every document.
  10. Start the AIP Scanner service using the PowerShell command below
    Start-Service AIPScanner
  11. In the Event Log, you will now see a new event once the enforcement cycle completes
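The discovery half of these steps can be strung together as a short script. This is a sketch under a few assumptions: the two paths are the example paths from steps 1 and 2, the event log name is assumed to match the Azure Information Protection node shown in Event Viewer, and the reports folder is assumed to sit under the profile of whichever account runs the last command.

```powershell
# Example paths from the steps above; replace with your own repositories.
$repositories = @(
    '\\fileserver\documents',       # CIFS file share
    'http://sharepoint/documents'   # SharePoint 2013/2016 document library
)

foreach ($path in $repositories) {
    Add-AIPScannerRepository -Path $path
}

# Confirm what is registered, then start one discovery-only cycle.
Get-AIPScannerRepository
Set-AIPScannerConfiguration -Schedule OneTime
Start-Service AIPScanner

# The scan takes time; run the two checks below once it has had a chance
# to finish.

# Assumption: the log name matches the Event Viewer node name.
Get-WinEvent -LogName 'Azure Information Protection' -MaxEvents 5 |
    Format-List TimeCreated, Id, Message

# Assumption: %localappdata% resolves per user, so run this as the account
# whose profile holds the reports.
Get-ChildItem -Path "$env:LOCALAPPDATA\Microsoft\MSIP\Scanner\Reports" |
    Sort-Object LastWriteTime -Descending |
    Select-Object -Property Name, LastWriteTime -First 5
```

Once the discovery reports look right, the enforcement command from step 9 can be substituted for the Set-AIPScannerConfiguration line.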

And that's all there is to setting up the AIP Scanner! There are many more options to consider about how to classify files and what repositories you want to configure, but I would say that it is fairly simple to set up a basic scanner server that can be used to protect a large amount of data easily.  I highly recommend reading the official documentation on deploying the scanner as there are some less common caveats that I have left out and they cover performance tips and other nice additional information.

I hope this was helpful. Please rate the article if it was helpful and let me know in the comments below if I missed anything or if anything is not clear. Check out the rest of my content at https://aka.ms/Kevin

Kevin

Comments (2)

  1. Jaehun Kim says:

    After running the ‘Install-AIPScanner’ command, the only thing I see is the error message ‘Failed to install Azure Information Protection Scanner service
    The installation failed, and the rollback has been performed.’

    For a successful AIP Scanner installation, how should I enter the SQL Server instance name?
    (I have tried the instance name ‘MSSQLSERVER’, computername, and ‘computername\MSSQLSERVER’)

    1. Hi Jaehun,

      I have just updated the blog and migrated it to the new official AIP Blog at https://aka.ms/AIPBlog. Please follow the instructions there as they are more current and let me know if you still have issues. To answer your question about SQL: if you are using a default instance of full SQL Server, you provide only the Servername when it asks for SQL Server. If you used SQL Server Express, you would provide Servername\SQLExpress. If you have a named instance, you would provide Servername\NamedInstance. Hope that helps!

      Kevin
