WSS/MOSS – Common Issue – SQL Deadlocks during STSADM import operations

In the last couple of weeks I have seen several cases where STSADM import operations failed with random exceptions. With other words: performing the same import into an empty site collection multiple times the import operation failed at different points during the import. Checking the ULS logs showed errors like the following:

10/20/2008 12:47:26.59 STSADM.EXE (0x78BC)                         0x4FF4 Windows SharePoint Services            Database                        6f8g      Unexpected       Unexpected query execution failure, error code 1205. Additional error information from SQL Server is included below. “Transaction (Process ID 123) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.” Query text (if available): “…”         

Such a behavior is usually an indication that asynchronous actions interact with the import operation and caused a deadlock in SQL server.

Another interesting tidbit is that this only affects STSADM import but not content deployment. 

Isolating these issues is not very easy as SQL asap kills one of the deadlocking queries and the child process (in our case STSADM.EXE and potentially a second process) continue to run till they finally fail due to the fact that the SQL query did not succeed.

In a test environment it is possible to isolate the issue by attaching a debugger to the SQL server and setting a breakpoint right before the deadlock victim is killed. That causes the deadlock to persist and allows to take memory dumps of STSADM and the other involved processes.

In the cases I have worked on the problem was always caused by a custom event receiver fireing when importing the items. That also explains why only STSADM -o import is affected but not Content Deployment: with STSADM -o import the After event handlers will fire while Content Deployment suppresses After events through the import settings SPImportSettings.SuppressAfterEvents.

Unfortunatelly STSADM does not provide an option to suppress the after events. So there are two possible way to resolve the issue:

  1. Disable the event receivers in the features on the target machine when performing the import
  2. Create a custom import application which uses the content deployment and migration API as discussed in Part 3 of my Deep Dive into the content deployment and migration API series and sets the SuppressAfterEvents property to true.

5 Comments


  1. We have a similar problem only no Import is taking place- just our EmailRecieved event handler. We have SP1 and Infrastructure update installed. We are trying to create and change item permissions (not inherited) when a mail message is recieved and sometimes get deadlocks on stored procedures implementing security operations. The list already has several thousand items and probably a very large amount of specific permissions (dbo.RoleAssignment has about 4 million records). Do you have any ideas perhaps?

    Reply

  2. Hi Shalom,

    sorry, this cannot be answered easily.

    It would be required to analyze the deadlock in more detail and verify the code of your event receiver to isolate the issue.

    If you need assistance for this I would recommend to raise a support call with Microsoft support.

    Cheers,

    Stefan

    Reply

  3. This happened to us. We managed to solve it by doing IISRESET’s and retrying the import. Eventually the import succeeded, so it probably was a timing issue.

    Reply

  4. Hi Stefan,

    great article!

    What debugger did you use to set the breakpoint and where did you attached it? – directly to the process of the sql instance?

    I am actually not to familiar with debuggers – but want to give it a try.

    As far as I know you usually set a breakpoint in your application at a pice of code or datastructure where you want to stop the execution – but how do you know where this will be? I mean wouldn’t you need to know when the deadlock occurs?

    Or do you set the breakpoint before the routine which will kill the deadlock starts its execution?

    And is this possible for a user without having in-depth information about the code and symbols of the process?

    Best regards,

    Flo

    Reply

  5. Hi Flo,

    I used WinDBG and set the breakpoint as follows:

    bp sqlservr!RaiseDeadlockError

    Cheers,

    Stefan

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.