Handling Migrations - StalledDueToTarget_MdbFull or TargetDatabaseFullPermanentException

When an administrator provisions a migration batch to Office 365 and adds users, objects known as migration users are created, which you can view using Get-MigrationUser. When the migration users and the corresponding move requests (Get-MoveRequest) are created, an Exchange Online mailbox database is assigned to each mailbox that will be migrated. Exchange Online uses an algorithm to review the available databases within the service and select the best target database. Among other data points, one of the criteria the algorithm considers is the amount of free space available relative to the size the database file is allowed to reach.

 

Some customers have recently reported that during the mailbox finalization process, the mailbox move may stall with the following error: StalledDueToTarget_MDBFull or TargetDatabaseFullPermanentException.

 

Why does this stall occur?

 

As mailbox data is migrated into the target database, it consumes space within the database file. Exchange Online servers cap the size of mailbox database files to ensure that sufficient free space exists in the database to process mail and client transactions, and to prevent a single database file from consuming more disk space than expected.

 

StalledDueToTarget_MDBFull is our way of notifying the administrator that we have reached the threshold for the minimum free space remaining in the database, or that the database file has reached its maximum allowable size. 

 

TargetDatabaseFullPermanentException is similar in nature.  Often the move reports will include space information regarding the state of the target database.  Here is an example:

 

Message : Target database GUID cannot be used:
Current database file size: 1464986501120
Current space available inside database: 1743781888
Allowed database growth percentage: 90
Maximum database file size limit: 1622722691784
Is database excluded from provisioning: 'False'.

 

The service will allow a database to grow to 90% space utilization. The remaining reserve ensures that users already on the database are safe and that further database operations do not impact the stability of the service. In this case, the current database file size divided by the maximum database file size limit is 1464986501120 / 1622722691784 ≈ 90.28%, which is greater than the 90% limit imposed.
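The arithmetic can be checked directly from the figures in the example report. The following is a minimal Python sketch, not part of the service or of any Exchange tooling: the helper names (`parse_report`, `utilization_percent`, `exceeds_growth_limit`) are hypothetical, and it assumes the comparison is simply the current file size divided by the maximum file size limit, as described above.

```python
import re

# Numeric lines copied from the example move report above.
REPORT = """\
Current database file size: 1464986501120
Current space available inside database: 1743781888
Allowed database growth percentage: 90
Maximum database file size limit: 1622722691784
"""


def parse_report(text):
    """Return {label: int} for every 'label: integer' line in the report."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?):\s*(\d+)\s*$", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def utilization_percent(fields):
    """Current file size as a percentage of the maximum file size limit."""
    return (100 * fields["Current database file size"]
            / fields["Maximum database file size limit"])


def exceeds_growth_limit(fields):
    """True when utilization is above the allowed growth percentage.

    Assumption: the limit is compared against file size / maximum size.
    """
    return utilization_percent(fields) > fields["Allowed database growth percentage"]


fields = parse_report(REPORT)
print(f"Utilization: {utilization_percent(fields):.2f}%")  # Utilization: 90.28%
print("Over limit:", exceeds_growth_limit(fields))         # Over limit: True
```

Pasting the numeric lines from your own move report into `REPORT` shows how far the target database is over the limit, which gives a rough sense of how much space has to be freed before the move can continue.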

 

How does the administrator handle this scenario?

 

There are three options available for an administrator to deal with this scenario.

 

The first option is to do nothing. Within Exchange Online, there are service processes that dynamically move mailboxes to redistribute load across multiple databases. Exchange Online is aware of pending migrations and, in the background, redistributes mailboxes so that free space in the target database rises back above the stall threshold. When the free space has increased to a level that safely allows migrations to complete, the stall condition is resolved and the migration completes. It could take several hours or days from when the stall is first encountered until enough free space is available for moves to continue.

 

The second option that may help is to shorten the time between the initial synchronization of the mailbox and the finalization of the mailbox move. In most customer engagements, there were several days to weeks between the initial synchronization of the mailbox and when the administrator issued the finalization of the mailbox move. The stall is then encountered during the incremental synchronization that attempts to finalize the move and copy the remaining data. By shortening the time between the initial synchronization and the finalization, you decrease the likelihood that the space has been consumed by other migration and client activity, thereby increasing the success rate of mailbox move finalizations.

 

The third option is to delete and recreate the move. In this case, all previously migrated data is deleted, and the migration starts over again. This can be time-consuming, and it does not guarantee that you will avoid the issue if there is again a long delay between initial synchronization and finalization.