TooManyBadItemsPermanentException error when migrating to Exchange Online?


Some of you may have noticed that more migrations might be failing due to encountering 'too many bad items'. Upon closer review, you may notice that the migration report contains entries referencing corrupted items and being unable to translate principals. I wanted to take a few minutes and provide more information to help understand what this means, why these are now occurring, and what can be done about them. Ready to geek out?

During a mailbox migration, there are several stages we go through. We start off with copying the folder hierarchy (including any views associated with those folders), then perform an initial copy of the data (what we call the Initial Sync). Once the initial data copy process is complete, we then copy rules and security descriptors. Reviewing a move report shows entries similar to these.

Stage: CreatingFolderHierarchy. Percent complete: 10
Initializing folder hierarchy from mailbox <guid>: X folders total
Folder hierarchy initialized for mailbox <guid>: X folders created
Stage: LoadingMessages
Copying messages is complete. Copying rules and security descriptors.

For our discussion today, we are interested in the stage of “Copying rules and security descriptors”. Security descriptors hold Access Control Lists (ACLs), which are in turn composed of Access Control Entries (ACEs, or the individual permission entries), and they are stored in SDDL format. In the context of a mailbox, we include both the Mailbox security descriptor (Mailbox permissions) as well as Folder security descriptors (permissions on individual folders). When we look at the Mailbox security descriptor, it should be noted that only Explicit mailbox permissions are copied. These would include permissions granted by using the Add-MailboxPermission cmdlet, or by using the Exchange Management Console (2010) or Exchange Admin Center (2013 and 2016) to add Full Access rights. Any Inherited permissions are not evaluated during the copy process. For example, granting the Receive-As permission on a database object in Active Directory results in an Inherited Allow for Full Access for all mailboxes on that database. When mailboxes on that database are migrated to Exchange Online, those Inherited permissions will not get copied.
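
If you want a rough preview of which mailbox permissions count as Explicit, a small filter like the one below can help. This is only a sketch: Select-ExplicitPermission is a name I made up for illustration, the sample entries are mock data, and relying on the IsInherited property is my assumption about the simplest way to approximate what the move will copy.

```powershell
# Hypothetical helper: passes through only explicit (non-inherited) entries.
# Works on any objects exposing an IsInherited property, such as the
# output of Get-MailboxPermission.
filter Select-ExplicitPermission {
    if (-not $_.IsInherited) { $_ }
}

# Mock entries standing in for Get-MailboxPermission output (illustrative only).
$sample = @(
    [pscustomobject]@{ User = 'CONTOSO\alice'; AccessRights = 'FullAccess'; IsInherited = $false }
    [pscustomobject]@{ User = 'CONTOSO\Exchange Servers'; AccessRights = 'FullAccess'; IsInherited = $true }
)

# Only the explicit entry survives the filter.
$explicit = @($sample | Select-ExplicitPermission)
$explicit | Format-Table User, AccessRights, IsInherited
```

In a live Exchange Management Shell session, the same filter could be fed directly, for example: Get-MailboxPermission -Identity user@contoso.com | Select-ExplicitPermission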

Now that we have briefly covered security descriptors, let’s look at the issue. About midway through 2016, a change was introduced to Exchange Online whereby if a security principal could not be successfully validated/mapped to an Exchange Online object, it would be marked as a bad item. Previously, the behavior was that invalid permissions would simply be ignored, and administrators were then left to wonder why some permissions no longer worked after the migration. With this new behavior, corrupt/invalid permissions are now logged so that administrators will know that there are problems with permissions. From my perspective as a Support Engineer, this is a change for the better because as Administrators, you are now able to see when there are issues with permissions. It is possible that this behavior will continue to evolve over time, but I would advise becoming familiar with this new behavior so that you understand what is happening.

Now how does this affect you? Since we are now incrementing the bad item count for each corrupt/invalid permission, this means that if we encounter more corrupt/invalid permissions than your current bad item limit is set to (default is 10 for a migration batch), the migration will fail. Depending on the state of permissions, you could potentially see a LOT of bad entries being logged. If you are looking at the migration report text file (downloadable from the Exchange Online Portal), you may see entries similar to the following:

11/12/2016 8:44:43 AM [EXO MRS Server] A corrupted item was encountered: Unable to translate principals for folder "Folder Name"/"FolderNTSD": Failed to find a principal from the source forest.
5/19/2016 6:33:50 PM [EXO MRS Server] A corrupted item was encountered: Unable to translate principals to the target mailbox: Failed to find a principal in the target forest that corresponds to the following source forest principal values: Alias: <alias>; DisplayName: <Display Name>; MailboxGuid: <mailbox guid>; SID: <SID of User>; ObjectGuid: <Object GUID>; LegDN: <legacyExchangeDN>; Proxies: [X500:<legacyExchangeDN format>; SMTP:user@contoso.com;];.
5/19/2016 6:33:50 PM [EXO MRS Server] A corrupted item was encountered: Unable to translate principals to the target mailbox: Failed to find a principal in the target forest that corresponds to the following source forest principal values: SID: <SID of User>; ObjectGuid: <Object GUID>;.

So, what is the logic used to validate permissions?

I’m glad you asked! Here is the process spelled out. There are four basic steps to this process, broken out as follows.

  1. Exchange Online - I need to resolve this SID which is present in the security descriptor (Folder or Mailbox)
  2. Exchange Online - Make a request to the On-Premises MRS Proxy, passing the SID to resolve
  3. On-Premises MRS Proxy - Look up the SID against Active Directory and return a set of attributes (including primary SID and legacyExchangeDN)
  4. Exchange Online - Take the legacyExchangeDN value provided, and attempt to match it up with a user account in the cloud which has that stamped as an X500 proxy address.

Normally, Directory Synchronization will take care of stamping the legacyExchangeDN from each side as an X500 proxy address, but this does mean that the On-Premises legacyExchangeDN must match a Mail-enabled recipient (i.e. Mailbox, MailUser, Mail-enabled Security Group) in the cloud by an X500 Proxy. If it does not, then resolving that permission entry will fail.
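
To make the last step of that process a bit more concrete, here is a minimal sketch of the matching idea. It is not the actual MRS code, which runs inside the service. Find-RecipientByLegacyDN is a hypothetical helper, the recipients are mock objects, and the EmailAddresses property simply mimics the list of proxy addresses you would see on a cloud recipient.

```powershell
# Hypothetical helper: returns recipients whose proxy addresses include the
# given legacyExchangeDN stamped as an X500 address.
function Find-RecipientByLegacyDN {
    param(
        [Parameter(Mandatory)] [string]   $LegacyExchangeDN,
        [Parameter(Mandatory)] [object[]] $Recipients
    )
    $wanted = "X500:$LegacyExchangeDN"
    # -contains compares strings case-insensitively by default.
    $Recipients | Where-Object { $_.EmailAddresses -contains $wanted }
}

# Mock cloud recipients (illustrative values only).
$cloud = @(
    [pscustomobject]@{
        Name           = 'Alice'
        EmailAddresses = @('SMTP:alice@contoso.com', 'X500:/o=Contoso/ou=Exchange/cn=Recipients/cn=alice')
    }
    [pscustomobject]@{
        Name           = 'Bob'
        EmailAddresses = @('SMTP:bob@contoso.com')
    }
)

# Alice carries the on-premises legacyExchangeDN as an X500 proxy, so she resolves.
$match = Find-RecipientByLegacyDN -LegacyExchangeDN '/o=Contoso/ou=Exchange/cn=Recipients/cn=alice' -Recipients $cloud

# Bob has no X500 proxy, so the lookup returns nothing and the permission
# entry would fail to resolve.
$noMatch = Find-RecipientByLegacyDN -LegacyExchangeDN '/o=Contoso/ou=Exchange/cn=Recipients/cn=bob' -Recipients $cloud
```

The takeaway is simply that a missing X500 proxy on the cloud object is enough to make an otherwise valid permission fail to map.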

I do want to differentiate between the different types of permissions errors you may see.

SourcePrincipalMappingException – these mean that when MRS Proxy tried to look up the SID against On-Premises Active Directory, it couldn’t be resolved. This is a common scenario when users leave the company and their accounts are deleted. You could also encounter these issues if the SID in question is part of the SIDHistory of an On-Premises account. When MRS Proxy attempts to look up the SID, we only search by ObjectSID or msExchMasterAccountSID. MRS Proxy does not evaluate against SIDHistory, so the SID failing to be resolved would be expected behavior. SIDHistory being populated won’t be a common scenario, but it is nonetheless something to be aware of.

Note: Exchange Online has a special built-in bad item limit of 1000 for these Source Principal Mapping errors, so these moves will not fail unless you encounter more than 1000 of these types of bad items.
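
About the only way I know of to spot these ahead of time is to look at the output of Get-MailboxPermission and Get-MailboxFolderPermission for entries where a raw SID shows up instead of a name, which indicates that the SID can’t be resolved to a valid account. The sketch below shows one way to filter for that. Select-UnresolvedSid is a made-up name, the sample entries are mock data, and the pattern assumes domain account SIDs beginning with S-1-5-21; the exact text in the User column may differ in your environment.

```powershell
# Hypothetical helper: passes through permission entries whose User value
# ends in a raw domain SID, which suggests the account no longer resolves.
filter Select-UnresolvedSid {
    if ("$($_.User)" -match 'S-1-5-21(-\d+)+$') { $_ }
}

# Mock permission entries (illustrative only).
$sample = @(
    [pscustomobject]@{ User = 'CONTOSO\alice'; AccessRights = 'FullAccess' }
    [pscustomobject]@{ User = 'S-1-5-21-1004336348-1177238915-682003330-1105'; AccessRights = 'FullAccess' }
    [pscustomobject]@{ User = 'NT User:S-1-5-21-1004336348-1177238915-682003330-1106'; AccessRights = 'Reviewer' }
)

# Two of the three entries are bare SIDs and are flagged.
$unresolved = @($sample | Select-UnresolvedSid)
$unresolved | Format-Table User, AccessRights
```

Against a real mailbox this might look like: Get-MailboxPermission -Identity user@contoso.com | Select-UnresolvedSid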

TargetPrincipalMappingException – these mean that we can’t map the permission to a user account in the Target forest (Exchange Online). A common scenario here would be if a user or group was given permissions on a mailbox, but that user or group is not in your dirsync scope. After trying to move that mailbox via MRS, that user or group is not going to be present in Exchange Online, so this error would be expected. Another scenario is if a security group (not mail-enabled!) was used to assign permissions. Non mail-enabled security groups are not synchronized to Exchange Online, so they won’t exist in the Target forest.

To resolve this issue, there are really two options.

  1. Increase the bad item limit to account for permissions errors. In complex legacy environments where multiple Exchange versions have been in place, and there has been a lot of user turnover, I’ve seen where permissions errors can number into the thousands. Be prepared that you may need to increase the bad item limit to a number higher than you expect. The good news here is that with improvement to Exchange over the years, the odds of encountering actual bad messages is relatively slim, so odds are good that the vast majority of bad items are bad permissions. The second bit of good news here is that we log the type of bad item that is encountered and make this information available in the move report. I’ll show you how to dig into a move report and look at the bad items later on in this blog post.
  2. Cancel the move, fix the bad permissions from the folder or mailbox by either removing them or fixing the issue causing the user/group to not be resolved in Exchange Online, and then submit the move again. But – you may ask – what if I want to fix the permissions on the current move and then resume it? Well, I’m not going to stop you from fixing bad permissions. But I will tell you that it won’t make any difference for the current move. We only evaluate permissions once, at the end of the initial data copy. If the move fails due to bad items (permissions), even if you fix the bad permissions we won’t re-evaluate the now fixed-up permissions and allow the move to complete successfully. You either have to up the bad item limit, or remove the move and fix the permissions and submit a new move.
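
For the first option, here is a hedged sketch of how the parameters for raising the limit on an existing move request could be put together. New-BadItemLimitParameters is a hypothetical helper and the identity is a placeholder. The one detail it encodes is that Set-MoveRequest expects the AcceptLargeDataLoss switch once the bad item limit is 51 or higher.

```powershell
# Hypothetical helper: builds a splattable parameter set for Set-MoveRequest.
function New-BadItemLimitParameters {
    param(
        [Parameter(Mandatory)] [string] $Identity,
        [Parameter(Mandatory)] [int]    $BadItemLimit
    )
    $params = @{
        Identity     = $Identity
        BadItemLimit = $BadItemLimit
    }
    # A limit of 51 or higher also requires -AcceptLargeDataLoss.
    if ($BadItemLimit -ge 51) {
        $params['AcceptLargeDataLoss'] = $true
    }
    $params
}

# Placeholder identity; substitute the real move request.
$small = New-BadItemLimitParameters -Identity 'user@contoso.com' -BadItemLimit 25
$large = New-BadItemLimitParameters -Identity 'user@contoso.com' -BadItemLimit 500
```

From PowerShell connected to Exchange Online, the result could then be applied with Set-MoveRequest @large, followed by Resume-MoveRequest -Identity user@contoso.com to restart the failed move.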

Now, I promised earlier that I would go through how to review the permissions errors. You can do this by using PowerShell and saving the move report into a variable where it is stored in memory. I typically have the move report exported out to an XML file because I don’t have direct access to customer tenant information. If you are reviewing failed moves within your own tenant, there is no need to do that if you don’t want to. I’ll provide the context to do both just in case you want to know both methods.

To save the move report to a variable, you would run the following from PowerShell connected to Exchange Online.

$movereport = Get-MoveRequestStatistics <move request identity> -IncludeReport

To save the move report to an XML file, then import the XML file into PowerShell, you would run the following from PowerShell connected to Exchange Online.

Get-MoveRequestStatistics <move request identity> -IncludeReport | Export-CliXml c:\temp\movereport.xml

Once the file is saved, then you import it into PowerShell. Note that this PowerShell instance does not have to be connected to Exchange Online. It can be just a regular PowerShell instance.

$movereport = Import-CliXml c:\temp\movereport.xml

If you have never dug into a move report, let me just say that there are all sorts of golden nuggets of information buried inside (which won’t show in the text file from the Portal, by the way!).

Now that you have the move report imported as a variable, you can access all the rich information within the report. We specified our variable earlier as $movereport, so we just need to call that variable, and access the information stored inside it.

$movereport.report.baditems – this gives you a list of all the bad items encountered. A cool tip is that you can use the Out-GridView PowerShell cmdlet to open another window with the list.

$movereport.report.baditems | Out-GridView

What is nice about the Grid View is that you can then filter the output. For example, to validate that all of your bad items are permissions errors, you can simply choose “Add criteria”, check the “Kind” box, and click “Add”.

Change “Contains” to “Does not contain”, and type Security. This will quickly show you if there are any other types of bad items.
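
If you would rather stay at the command line, roughly the same check can be done with Where-Object and Group-Object. In the sketch below, the $movereport object is a mock so that the snippet runs anywhere, and the Kind values are made up; only the “Security” substring comes from the filter described above. With a real report, skip the mock and use the $movereport variable you already loaded.

```powershell
# Mock move report with invented Kind values (illustrative only).
$movereport = [pscustomobject]@{
    Report = [pscustomobject]@{
        BadItems = @(
            [pscustomobject]@{ Kind = 'ExampleFolderSecurity'; FolderName = 'Inbox' }
            [pscustomobject]@{ Kind = 'ExampleFolderSecurity'; FolderName = 'Calendar' }
            [pscustomobject]@{ Kind = 'ExampleOtherKind'; FolderName = 'Sent Items' }
        )
    }
}

# Bad items whose Kind does not contain "Security", i.e. not permissions errors.
$nonPermission = @($movereport.Report.BadItems | Where-Object { $_.Kind -notlike '*Security*' })

# Count of bad items per Kind, for a quick overview.
$byKind = $movereport.Report.BadItems | Group-Object Kind | Select-Object Count, Name
$byKind | Format-Table
```

If $nonPermission comes back empty on a real report, every bad item is a permissions error.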

Now that we have identified the behavior change, and gone over how to address it, let’s end by talking about what approach should be taken for migrating mailboxes.

The recommended approach to this new change in behavior would be to continue to migrate using low bad item counts, and then manually remediate those that fail. We recommend this approach because a migration that fails indicates either a LOT of bad source permissions (more than 1000), or valid, working permissions On-Premises that are failing to be correctly mapped to objects in Exchange Online. Neither of these conditions should be common, so investigation would be warranted to ensure that you are in fact dealing with bad permissions.

Special thanks to Brad Hughes and the rest of the MRS team for their assistance and review of this content.

Ben Winzenz

Comments (13)
  1. Gavin Morrison says:

    Thanks for this Ben. Could I make a suggestion that permissions errors get their own counter so that we can more easily differentiate between items that fail to migrate and permissions?

    Alternatively, would it be possible to roll up permissions errors so that User X who left the company, but who still has permissions to 1,000 different folders within a mailbox, only counts as a single error?

    1. @Gavin – thanks for the suggestions. I’ll make sure that the migration team gets this feedback.

  2. Svetoslav A. says:

    Nice article. However, unfortunately it is not the case that only the objects synchronized from AADConnect appear in ExO. Permissions granted over objects like Domain Admins, Account Operators, etc. (if present on the source mailbox, having explicitly granted permissions), are also happily transferred to ExO, being mapped to the respective tenant Domain Admins, Account Operators, in quite a bizarre way (map from a group which is not synced from on-prem to a group that the O365 customer has no control over).

  3. Jeff Guillet says:

    Great article! I sure wish PowerShell would produce more meaningful error messages.

    Is there a way to know ahead of time when we will run into these errors? It’s a shame we have to find out after the mailbox has been moved.

    1. Thanks Ben, awesome information!

      It would be handy to run a pre-flight check. Migrating a 30Gb mailbox with a 50Gb archive, only to find out at the end that it failed, will be a pain – and I know it isn’t an unrealistic scenario.

      Something similar to IdFix, but in the context of this article, would be handy.

    2. @Jeff – unfortunately about the only way I know of to check beforehand would be to use Get-MailboxPermission and Get-MailboxFolderPermission and look for entries where a SID is present, which indicates that it can’t be resolved to a valid account. The TargetPrincipalMappingException isn’t normally expected, so this one you can’t really know ahead of time.

  4. Ben, I have just gone through a long and painful PST import process just recently, having to clean up after a botched migration.

    There was a clean PST export from the on-prem Ex2010 mailboxes, followed by an import to O365 of those PST files. Many of the imports failed with hundreds of “bad items”.

    To the best of my knowledge, bad items don’t make their way into the PST file during an export. So why that many bad items? Surprise #1.

    Moreover, unlike migration logs, the PST import logs failed to identify the invalid items, or why these items were deemed “bad”. Surprise #2.

    I opened a case with Microsoft support. They were unable to give me a satisfying answer. “Dunno. Can’t tell.” was the answer. Surprise #3.

    Are there any plans to document how to identify what failed in a PST import, or whether those failures can be ignored or should be investigated?

    Thanks.

  5. sanjays11 says:

    Ben, Not a powershell expert, how to view result for command $movereport.report.baditems | Out-GridView. Great Article

  6. satya11 says:

    Hi , I have exported xml file for public folder migration ,but not able to see any output

    PS C:\Windows\system32> $movereport = Import-CliXml E:\watgpf\fail\mailbox4.xml
    PS C:\Windows\system32> $movereport.report.baditems | Out-GridView

    1. sanjays11 says:

      It might be possible that there were no BadItems in the report, u can try checking $movereport.report.failures | Out-GridView

  7. Satyajit321 says:

    Great Article!
    This seems to explain the astonishing Bad item numbers on fairly working mailboxes.
    Real important point on SIDs\objects that are not synced\ or sync errors to cloud, adding up to these failures.

    Looking forward for some more articles, outlining\explaining these specific errors or the migration process.

  8. Ben Owens says:

    Thanks for explaining the change in behaviour. I have put together a script which can be run against a migration batch to report how many mailboxes include corruptions, how many of those are genuine corruptions, and how many are related to SourcePrincipalMappingException or TargetPrincipalMappingException errors. https://www.teamas.co.uk/2017/07/exchange-hybrid-mailbox-move-corruption.html

    The blog states there is a special built-in bad item limit of 1000 for these Source Principal Mapping errors. However, I have encountered a failed migration for a mailbox with 107 SourcePrincipalMappingException errors, and no other corruptions. The last entry in the ‘report.failure’ output stated “This mailbox exceeded the maximum number of corrupt or missing items that were specified for this request.”

    In another instance however, I had a mailbox with 48 SourcePrincipalMappingException errors but I could complete that mailbox move. For both migration moves the baditem limit was set to ‘0’.

    Has the inbuilt limit for Source Principal Mapping errors changed from 1000 to 100? How can I determine which limit I have hit and where that is set?
