Tuning FIM Service MA Export Processing

An introduction to FIM Service MA export configuration, system event requests, and FIMService partitioning.  This applies to both FIM 2010 R2 and MIM 2016.

Credits:

Thank you to David Steadman for his collaboration on this post.

Introduction

When working with the FIM Service management agent, an export operation can cause periodic performance issues. This is usually because the export overwhelms the FIMService instance, the SQL Server instance that hosts the FIMService database, or both.  The information below will help you understand and tune the asynchronous export on the FIM Service management agent so it runs more efficiently while still allowing other types of requests to complete in a timely manner.

Because the processing of FIM Service MA export requests can affect, and be affected by, other requests in the system, the information below goes a little beyond the FIM Service MA configuration itself.  We need to balance request processing in the FIM Service across all types of requests, with an emphasis on consistent performance across the board.  In effect, we want to reduce the impact of processing spikes so that performance stays more consistent throughout the day.

This is not a comprehensive look into the topic of balancing request processing in the FIMService, as there are simply too many factors to consider for a document of this scope.  Instead, this is an overview of the basics with some examples provided, to help in the testing and modeling of settings in a FIM solution.

Default Configuration for FIM MA Export

The default configuration for export on the FIM Service management agent is:

  • Asynchronous export
  • Aggregate Requests

Asynchronous export

At a high level, asynchronous export is executed in the following manner:

  • The FIM Service MA exports data to the SQL Broker Service queue in the FIMService database
  • Each FIMService instance continually polls the SQL Broker Service to pull any requests that are in the queue
  • The FIMService, when the request has completed, writes the result of the request back to a different SQL Broker queue
  • The FIM Service MA queries the SQL Broker results queue and posts the results in the run history for that Export run
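The flow above can be pictured with a toy Python model. This is only a sketch of the request queue / results queue hand-off, not how FIM or SQL Server implement it: the function names, the in-memory queues, and the max_simultaneous parameter are all hypothetical stand-ins.

```python
from collections import deque

# Hypothetical in-memory stand-ins for the two queues in the FIMService database.
request_queue = deque()
result_queue = deque()

def ma_export(requests):
    """FIM Service MA side: place exported requests on the request queue."""
    request_queue.extend(requests)

def service_poll(max_simultaneous):
    """FIMService side: pull up to max_simultaneous requests, process them,
    and write each result to the results queue. Returns how many were pulled."""
    count = min(max_simultaneous, len(request_queue))
    for _ in range(count):
        request = request_queue.popleft()
        result_queue.append((request, "Completed"))
    return count

def ma_collect_results():
    """FIM Service MA side: drain the results queue for the run history."""
    results = list(result_queue)
    result_queue.clear()
    return results

ma_export(["req1", "req2", "req3"])
polls = 0
while service_poll(max_simultaneous=2):
    polls += 1
results = ma_collect_results()
print(polls, results)
```

With three requests and two processed per poll, the service needs two polls before the MA can collect all three results.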

Aggregate (or Composite) requests

Aggregate requests appear in the FIM IdentityManagement Portal as msidmCompositeType (“composite” for this document) requests. By default, an export operation in the FIM Service MA processes requests in a “batch” format, grouping like changes into a single composite request.  Each composite request can contain up to 1000 like changes.  Changes of different operation types (Create, Update, and Delete) are segregated into different composite requests.  For example, a Create operation cannot exist in the same composite request as an Update operation.
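To make the grouping rule concrete, here is a minimal Python sketch. It is an illustration only and not FIM's actual implementation: the function name and the (operation, object_id) tuple shape are hypothetical, and it assumes batches are simply filled in order. Only the limit of 1000 and the rule that operation types do not mix come from the description above.

```python
from collections import defaultdict

def build_composite_requests(changes, threshold=1000):
    """Group (operation, object_id) changes by operation type, then split
    each group into batches holding at most `threshold` changes."""
    by_operation = defaultdict(list)
    for operation, object_id in changes:
        by_operation[operation].append(object_id)

    composites = []
    for operation, object_ids in by_operation.items():
        for start in range(0, len(object_ids), threshold):
            composites.append((operation, object_ids[start:start + threshold]))
    return composites

# 2500 Creates and 300 Updates pending export
changes = [("Create", i) for i in range(2500)] + [("Update", i) for i in range(300)]
composites = build_composite_requests(changes)
summary = [(operation, len(ids)) for operation, ids in composites]
print(summary)
```

In this example the 2500 Creates become three composite requests (1000, 1000, 500) and the 300 Updates become a fourth, separate one.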

Settings that control the FIM Service MA Export and Request Processing

Because the asynchronous export on the FIM Service MA is processed by two different components, it stands to reason that we would have settings for each of these components that control how the asynchronous export acts.  For this operation, we find settings in both the MIIServer.exe.config and Microsoft.ResourceManagement.Service.exe.config files.

MIIServer.exe.config

In this configuration file, we control:

  • Whether the export is run in Asynchronous or Synchronous mode
  • Whether the Asynchronous mode export will create Aggregate (Composite) requests
  • How many changes can be contained in a single Aggregate or Composite request
  • Whether acknowledgements of individual requests will be reported upon completion of the request, or on completion of the export run
  • How many requests can be placed into the SQL Broker service queue at one time (i.e. “batch-size”)
    • Once the FIMService completes the current batch of requests the FIM MA will export another batch

Please Note: The Aggregate request settings will only be evaluated if the FIM MA is configured for Asynchronous export.  Aggregate or composite requests are not supported in Synchronous export operations.

Settings:

The following settings will be contained in the resourceSynchronizationClient section of the MIIServer.exe.config.  Here’s an example of using these properties with the default settings for each value.

<resourceSynchronizationClient asynchronous="true"     aggregate="true"     aggregationThreshold="1000"     delayUpdateAcknowledgements="false"     exportRequestsInProcessMaximum="50"     exportActivityTimeoutInSeconds="600" />
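As a rough way to reason about these values together, the following Python sketch reads the section from a config string and estimates how many individual changes could sit in the queue at once. It is a sketch under assumptions: the helper names are hypothetical, the sample is an inline string rather than a real file, and it assumes each composite request counts as one request toward exportRequestsInProcessMaximum.

```python
import xml.etree.ElementTree as ET

# Inline sample using the default values shown above.
SAMPLE_CONFIG = """<configuration>
  <resourceSynchronizationClient asynchronous="true" aggregate="true"
      aggregationThreshold="1000" delayUpdateAcknowledgements="false"
      exportRequestsInProcessMaximum="50" exportActivityTimeoutInSeconds="600" />
</configuration>"""

def read_section(config_xml, section_name):
    """Return the attributes of the first element named section_name."""
    root = ET.fromstring(config_xml)
    element = next(root.iter(section_name), None)
    return dict(element.attrib) if element is not None else {}

def max_changes_in_queue(settings):
    """Upper bound on individual changes queued in one batch (assumption:
    one composite request counts as one queued request)."""
    batch_size = int(settings["exportRequestsInProcessMaximum"])
    aggregating = (settings.get("asynchronous") == "true"
                   and settings.get("aggregate") == "true")
    per_request = int(settings.get("aggregationThreshold", "1000")) if aggregating else 1
    return batch_size * per_request

settings = read_section(SAMPLE_CONFIG, "resourceSynchronizationClient")
print(max_changes_in_queue(settings))
```

Under that assumption, the defaults allow up to 50 x 1000 changes to be queued per batch, which helps explain why an export can produce a large processing spike.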

Microsoft.ResourceManagement.Service.exe.config

In this configuration file, we control:

  • If the local FIMService instance will process requests exported by the FIM Service MA in Asynchronous mode
    • Also known as “Synchronization Requests”
  • How many synchronization requests will be processed at the same time by this FIMService instance
  • SQL read and write timeout values for processing the synchronization requests

Settings:

The following settings will be contained in the resourceManagementService section of the Microsoft.ResourceManagement.Service.exe.config file.  Here’s an example of using these properties with the default settings for each value.

<resourceManagementService externalHostName="myFIMServer"     receiveSynchronizationRequestsEnabled="true"     maxSimultaneousSynchronizationRequests="6"     synchronizationDataReadTimeoutInSeconds="1200"     synchronizationDataWriteTimeoutInSeconds="1200" />

When we may need to adjust away from the default settings

Depending on the server topology, FIM solution, number of changes being processed, as well as number of errors experienced during export, we may want to adjust away from the default settings for the FIM Service MA export.

Potential Issues

Lots of Exceptions or Errors on Export

With an aggregate or composite request, if a single change out of the (up to) 1000 changes it contains fails, the whole composite request fails.  To recover from this situation, the FIMService then processes each change as an individual request, rather than retrying it as a composite request.  If this happens frequently, there are a couple of options to combat the situation:

  • Reduce the number of changes per aggregate request
  • Turn off Aggregate requests altogether

Which option to take will need to be determined through testing and observation with each solution implemented.
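A small worked example can help when modeling the first option. The Python sketch below counts how many requests the FIMService would process for a set of like changes when some of them are bad. It is a simplified model, not measured behavior: the function name is hypothetical, and it assumes changes are batched in order and that a failed composite is followed by exactly one individual request per change it contained.

```python
def requests_processed(total_changes, threshold, bad_indexes):
    """Count composite requests plus the individual fallback requests
    triggered by composites that contain at least one bad change."""
    bad = set(bad_indexes)
    total = 0
    for start in range(0, total_changes, threshold):
        size = min(threshold, total_changes - start)
        total += 1  # the composite request itself
        if any(start <= index < start + size for index in bad):
            total += size  # fallback: one request per change in the batch
    return total

# 5000 like changes, two of which will fail (positions 10 and 2500)
with_default = requests_processed(5000, 1000, [10, 2500])
with_reduced = requests_processed(5000, 200, [10, 2500])
print(with_default, with_reduced)
```

In this model the default threshold of 1000 results in 2005 requests (5 composites plus 2000 individual retries), while a threshold of 200 results in 425 (25 composites plus 400 retries). Smaller composites cost more requests when everything succeeds, but far fewer when errors are common.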

Performance decreases substantially during the FIM MA Export run

This gets a little more complex, as more factors are involved.  We need to look at:

  • How many FIMService instances are processing synchronization requests?
  • How many synchronization requests are being processed at the same time?
  • Do we have a lot of composite request failures that result in the reprocessing of changes individually?
  • Are requests taking a long time to make it through the “Validating” stage, as observed in the Search Requests page of the FIM Portal?

Things to consider doing:

  • Decrease the number of FIMService instances that are processing Synchronization requests
    • Many customers isolate the synchronization request processing away from the FIMService instances that are used for the FIM Portal
  • Decrease the number of synchronization requests that are processed at the same time by a FIMService instance
    • This will reduce the processing spikes the export operation has on the FIMService and SQL Server instances
  • Decrease the number of changes that are processed in each aggregate/composite request

Possible Resolutions

  • Configure all FIMService instances hosting the FIM Portal to not process synchronization requests

Example – Disabling synchronization requests in a FIMService instance

In the Microsoft.ResourceManagement.Service.exe.config file:

<resourceManagementService externalHostName="myFIMServer"     receiveSynchronizationRequestsEnabled="false" />

 

  • Configure one or more FIMService instances with reduced processing for synchronization requests

Example – Reducing the processing spikes created in the FIMService during Export

In the Microsoft.ResourceManagement.Service.exe.config file:

     <resourceManagementService externalHostName="myFIMServer"
          receiveSynchronizationRequestsEnabled="true"
          maxSimultaneousSynchronizationRequests="2"
          synchronizationDataReadTimeoutInSeconds="1200"
          synchronizationDataWriteTimeoutInSeconds="1200" />

In the MIIServer.exe.config file:

     <resourceSynchronizationClient asynchronous="true"
          aggregate="true"
          aggregationThreshold="200"
          delayUpdateAcknowledgements="false"
          exportRequestsInProcessMaximum="50"
          exportActivityTimeoutInSeconds="600" />

  • Another option is to disable aggregate requests altogether, if you find that aggregate requests fail too often due to data errors

Please Note: Even though we’ve disabled aggregate requests in the resourceSynchronizationClient below, this does not negate the settings for maxSimultaneousSynchronizationRequests, as this will apply to all asynchronous export requests.  The example below both disables aggregate requests and reduces the number of simultaneous synchronization requests processed by a single FIMService instance to 2.

Example – Disabling aggregate requests

In the Microsoft.ResourceManagement.Service.exe.config file:

<resourceManagementService externalHostName="myFIMServer"     receiveSynchronizationRequestsEnabled="true"     maxSimultaneousSynchronizationRequests="2"     synchronizationDataReadTimeoutInSeconds="1200"     synchronizationDataWriteTimeoutInSeconds="1200" />

 

In the MIIServer.exe.config file:

<resourceSynchronizationClient asynchronous="true"     aggregate="false"     delayUpdateAcknowledgements="false"     exportRequestsInProcessMaximum="50"     exportActivityTimeoutInSeconds="600" />

 

Other things that should be considered when looking at performance in the FIMService

When we look at the processing of requests in the FIMService, there are other things that can have an effect on performance.

TempDB Configuration

Request processing uses tempDB heavily.  This is particularly true during the Validating stage of request processing, where tempDB is used for:

  • Dynamic set transitions
  • Dynamic group transitions
  • Matching Management Policy Rules (MPR) against the request being processed

In many cases, customers have chosen to maximize their tempDB throughput by creating additional tempDB data files so the number of tempDB datafiles matches the number of processor cores available to the SQL Server instance.  This can help reduce a bottleneck in the tempDB for request processing.

SQL Server documentation recommends starting with no more than 8 tempDB data files, even when more processor cores are available, and adding more only if testing shows that a higher number is necessary. Otherwise, there could be a point of diminishing returns with more than 8 tempDB data files.

There are some FIMService solutions, however, that have benefitted from configuring more than just 8 tempDB datafiles. Additional testing should be done under stress to determine if more are advantageous for a particular FIMService solution.
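The guidance above can be written down as a simple rule of thumb. The Python sketch below is only a starting point for testing, not a sizing recommendation: the function name is hypothetical, and the cap of 8 is the value discussed above.

```python
def starting_tempdb_file_count(logical_cores, cap=8):
    """Suggest an initial number of tempDB data files: one per core,
    capped at `cap` until stress testing shows more are needed."""
    if logical_cores < 1:
        raise ValueError("logical_cores must be at least 1")
    return min(logical_cores, cap)

suggestions = {cores: starting_tempdb_file_count(cores) for cores in (4, 8, 16)}
print(suggestions)
```

For a SQL Server instance with 16 cores, this suggests starting at 8 data files and increasing only after stress testing shows continued contention in tempDB.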

Segregating request processing into different FIMService instances

FIM Service Partitioning

Documentation –

http://social.technet.microsoft.com/wiki/contents/articles/2363.understanding-fim-service-partitions.aspx

FIM Service partitioning allows us to segregate the request processing of different types of operations into different FIMService instances or groups of instances.  A partition structure might be:

  • FIMServicePartition1
    • FIM IdentityManagement Portal requests & approval processing
  • FIMServicePartition2
    • Self-service Password Reset request processing
  • FIMServiceAdminPartition
    • System Event Requests
    • FIM Service MA Export Requests
    • All Administrative Tasks to be done in the FIM Identity Management Portal

System Event Requests

One type of request we haven’t discussed yet is system event requests.  System event requests are those requests that are created by workflow instances as the result of another request being processed.  These have had several different names in the past:

  • Child Requests
  • Collateral Requests

Whatever the name, they share one commonality: they will always be created or originated by the FIM Service Account.

Executing a Run on Policy Update for a workflow could also create a substantial number of system event requests.

Settings

In the Microsoft.ResourceManagement.Service.exe.config file, the resourceManagementService section contains a property named processSystemPartition.  This is set to true by default.  Changing it to false keeps a FIMService instance from processing system event requests.  Below is an example of this setting with the default value.

Example:

<resourceManagementService externalHostName="myFIMServer"      processSystemPartition="true" />

In order to allow users to have the best possible experience in the FIM Portal, many customers choose to use FIMService Partitioning.  They also may use the receiveSynchronizationRequestsEnabled setting to stop the FIMService instances hosting the Portal from processing FIM MA Export requests.

Example Troubleshooting Scenario:

To illustrate the information above, let’s go through a scenario where we might use this information.

Problem:

At times, FIMService performance degrades noticeably, affecting both users in the FIM IdentityManagement Portal and the FIM Service management agent.

Assumptions:

  • There are two FIMService instances that are hosted on the FIM Portal machines.
  • All settings for the FIM Service MA Export configuration are default
  • FIM Service administrators frequently have to execute operations that cause Run on Policy Updates to be triggered, sometimes during the working day

First Looks:

  • Check to see if the performance issue only happens during the FIM MA Export processing
  • FIMService Throttling – Symptoms could include performance slow-down, requests staying in PostProcessing stage for longer than expected, etc.
    • Check the processor usage on the machine(s) hosting the FIMService
      • Are they processing at 75% consistently?
      • If so, this could indicate that the FIMService is being overloaded, and throttling of the FIMService has been engaged
    • Check the processor usage on the machine hosting the SQL Server instance that contains the FIMService database
      • If the processor usage is at 75%, this could also trigger FIMService throttling
  • In the Search Requests screen in the FIM IdentityManagement Portal, check for an unusual number of requests in the following states, as they could be warning signs for processing over-load:

 

Request State: Validating

Symptoms:

  • Requests stay in Validating stage for more than a second or two
    • Composite requests will take longer
  • Requests time out and fail
    • SQL timeout errors in event log
  • Requests get denied due to no MPRs being associated

Possible Area/Setting:

  • Check tempDB in SQL Server to make sure it’s optimized
  • Are composite requests being processed when the problem happens?
    • Reduce maxSimultaneousSynchronizationRequests value
    • Reduce aggregationThreshold value
    • Set aggregate="false"

Request State: Authorizing

NOTE: It could be valid for requests with Authorization workflows to stay in this state for an extended period of time.

Symptoms:

  • Requests with AuthZ workflows that should complete immediately are not completing

Possible Area/Setting:

  • This could be caused by the AuthZ workflow instance being persisted to the database
  • This could indicate that the FIMService instance(s) or SQL Server is overloaded

Request State: PostProcessing

NOTE: As the processing of workflows is an asynchronous processing model, it’s not uncommon for requests to stay in the PostProcessing stage for a period of time. Anything that could cause a workflow instance to be persisted to the database for future processing can cause this to happen.

Possible Area/Setting:

  • Again, we’ll want to look at the possibility that the FIMService is overloaded

Findings

In this scenario, let’s assume we found that performance suffered during two different times:

  • When the FIM Service MA Export operation was run
  • When the Run-On-Policy-Update operation was run

Possible Resolution:

Based on the needs of the business, it was determined that nothing should directly interfere with the performance of the FIM Identity Management Portal.  Therefore, the following recommendation was made:

  • Install an additional two FIMService instances
  • Configure FIM Service Partitioning – two FIMService instances per partition
    • FIMServicePartition1 hosts the FIMService instances for the User-Facing FIM Portal
    • FIMServicePartition2 hosts the FIMService instances for processing
      • FIM MA Export (Synchronization) Requests
      • System Event Requests
      • Administrative requests in the FIM Identity Management Portal
  • In FIMServicePartition1
    • Configure servicePartitionName setting in the resourceManagementService section of the Microsoft.ResourceManagement.Service.exe.config file
      • Set a partition name that will be used for all FIMService instances that will be hosting FIM Portal requests
    • Set receiveSynchronizationRequestsEnabled to false
    • Set processSystemPartition to false
  • In FIMServicePartition2
    • Configure servicePartitionName setting in the resourceManagementService section of the Microsoft.ResourceManagement.Service.exe.config file
      • Set a partition name that’s different from that used in FIMServicePartition1
      • Use this name for each FIMService instance in this partition
    • Set receiveSynchronizationRequestsEnabled to true
    • Set processSystemPartition to true

Why two FIMService instances in FIMServicePartition2?

Because we’ve turned off processing of System Event requests and Synchronization requests in all FIMService instances other than those in the FIMServicePartition2, it would be safer to have two FIMService instances for high-availability.

If we configured a single server to do all system event and synchronization request processing and that one instance was down, then we could not process FIM MA Exports and the creation of system event requests would fail.
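The high-availability reasoning can be checked mechanically when planning a topology. The Python sketch below models the four instances from this scenario and flags any request type that fewer than two instances would process. It is a planning aid under assumptions: the host names are made up, and the check itself is hypothetical; only the setting names come from the configuration discussed above.

```python
# Hypothetical plan: two portal-facing instances, two processing instances.
instances = [
    {"host": "portal1", "servicePartitionName": "FIMServicePartition1",
     "receiveSynchronizationRequestsEnabled": False, "processSystemPartition": False},
    {"host": "portal2", "servicePartitionName": "FIMServicePartition1",
     "receiveSynchronizationRequestsEnabled": False, "processSystemPartition": False},
    {"host": "admin1", "servicePartitionName": "FIMServicePartition2",
     "receiveSynchronizationRequestsEnabled": True, "processSystemPartition": True},
    {"host": "admin2", "servicePartitionName": "FIMServicePartition2",
     "receiveSynchronizationRequestsEnabled": True, "processSystemPartition": True},
]

def check_plan(plan, minimum=2):
    """Return a list of warnings for settings enabled on too few instances."""
    problems = []
    for setting in ("receiveSynchronizationRequestsEnabled", "processSystemPartition"):
        hosts = [instance["host"] for instance in plan if instance[setting]]
        if len(hosts) < minimum:
            problems.append(f"{setting}: enabled on {len(hosts)} instance(s), expected {minimum}")
    return problems

full_plan_problems = check_plan(instances)
degraded_problems = check_plan(instances[:3])  # plan without admin2
print(full_plan_problems, degraded_problems)
```

The full plan produces no warnings, while a plan with only one processing instance is flagged for both synchronization requests and system event requests.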

FIM Service Partitioning Reference

http://social.technet.microsoft.com/wiki/contents/articles/2363.understanding-fim-service-partitions.aspx