Stefan Goßner

Senior Escalation Engineer for SharePoint Products and Technologies

Deep Dive into the SharePoint Content Deployment and Migration API – Part 5

[Part 1 – Part 2 – Part 3 – Part 4 – Part 5 – Part 6 – Part 7]

Avoiding common problems

This may become one of the most frequently updated chapters of this article series, as "common problems" can change over time. In addition, I can only talk about the common problems I have seen so far. But let's get started.

Problem 1: Mixing deployments with and without retaining object identity

First of all: if possible, you should avoid this! Importing with different settings for this property into the same database can lead to serious problems during future deployments, and such databases will become hard to maintain.

Be aware that this also means that you should not use STSADM -o import against a database that should be used as the destination of a content deployment job!

STSADM -o import will not retain the object identity while content deployment does.

So why is there a problem when mixing imports with different RetainObjectIdentity settings?

The reason is that with RetainObjectIdentity enabled, the imported object has to be created with the same name and the same GUID at the same location in the destination database as in the source database. If the item already exists, it will be updated. If not, it will be created.

Problems occur if there is an item with the same name but a different GUID in the destination database. This can happen if someone has authored on the destination server and created items with the same name, or if someone imported content from the source server WITHOUT the RetainObjectIdentity setting set to true.

If items with the same name but a different GUID are allowed for the affected item type, you will end up with two items with the same name on the destination server. This is the case, for example, for ordinary list items.

If items with the same name but a different GUID are not allowed for the affected item type, the import will run into an exception similar to the one below and stop:

Failed to create the ‘Pages’ library. OriginalException: There can only be one instance of this list type in a web.

=> In order to avoid this problem you have to guarantee that content added to the destination database will not have any name/guid conflicts with the source database – even if new content is added to the source database in the future!
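The two outcomes described above can be illustrated with a small toy model. This is a sketch in Python and not SharePoint code: the Container class, the import_retaining_identity function and the unique_names flag are hypothetical names made up for the illustration, and the error text is not the real exception message.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Container:
    """Toy stand-in for a web or list: maps item GUID -> item name."""
    unique_names: bool  # True models types that allow only one item per name
    items: dict = field(default_factory=dict)


def import_retaining_identity(dest, guid, name):
    """Toy model of importing one object with identity retained."""
    if guid in dest.items:            # same GUID already there: update
        dest.items[guid] = name
        return "updated"
    clash = name in dest.items.values()
    if clash and dest.unique_names:   # same name, different GUID, not allowed
        raise RuntimeError(f"Failed to create '{name}': name already in use")
    dest.items[guid] = name           # otherwise create with the source GUID
    return "duplicate name" if clash else "created"


# An item called "Welcome" was authored directly on the destination.
dest = Container(unique_names=False)
dest.items[uuid.uuid4()] = "Welcome"

# The source item has the same name but its own GUID.
print(import_retaining_identity(dest, uuid.uuid4(), "Welcome"))  # duplicate name
```

With unique_names set to True (which models a case like the Pages library), the same call raises an error instead of creating a second item.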

Problem 2: Running multiple imports without retaining object identity for updates of the same content

When doing an export and import without retaining the object identity on the destination server you can end up with duplicate items in lists as each import tries to create the same list item again with a different GUID. The import is not able to decide whether you would like to overwrite an existing item with the same name or if you would like to have multiple list items with the same name. Without retaining the object identity you will end up with multiple list items with potentially the same content. To force overwrite of list items you have to retain the object identity.

That means you cannot use STSADM -o export/import as a replacement for content deployment! If you need to deploy content to a remote destination server without connectivity, do not use STSADM -o import. Instead, write a custom tool based on the code samples provided in Part 3 of this article series that has retain object identity enabled.

=> STSADM -o export and import should only be used if the content being imported does not already exist in the destination database and if the database will not be used as the destination database for content deployment (see Problem 1 above).
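To make the effect of repeated imports visible, here is a minimal toy model in Python. It is not SharePoint code: import_items and its parameters are hypothetical, and the destination is reduced to a dictionary from GUID to item name. It only illustrates that a fresh GUID on every import produces a new item each time, while a retained GUID overwrites the existing one.

```python
import uuid


def import_items(dest, package, retain_identity):
    """Toy import. dest: dict GUID -> name; package: list of (guid, name)."""
    for guid, name in package:
        # Without retained identity every imported item gets a new GUID.
        key = guid if retain_identity else uuid.uuid4()
        dest[key] = name
    return dest


package = [(uuid.uuid4(), "Press release")]

without_identity = {}
with_identity = {}
for _ in range(3):  # deploy the same content three times
    import_items(without_identity, package, retain_identity=False)
    import_items(with_identity, package, retain_identity=True)

print(len(without_identity), len(with_identity))  # 3 1
```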

Problem 3: Deleting an item from the source site that belonged to the site definition and recreating it

This is a different flavor of the problem discussed as Problem 1.

During provisioning of a site, the items defined in the site definition template are added to the site. Problems can occur when changes are made to these provisioned items, especially if they are deleted and replaced with items of the same name. That approach will work well on a single server installation, but it will cause problems when using content deployment.

The reason is that during content deployment the site will be provisioned on the destination server using the site definition template. And this will also cause all items defined in this template to be created. When content deployment now tries to import the updated or replaced items there will be a conflict. You will end up with an exception similar to the one in Problem 1.

=> You should never modify or delete any of the items created through the site definition in your site. If the site definition does not suit your needs, you should create a custom site definition that does and use it instead, so that there is no need to customize the provisioned items.

Problem 4: Deploying from destination back to source

This is something that theoretically can be done but only if the source hasn’t changed since it was last deployed to the destination. Otherwise the same issues as in Problem 1 can occur.

Also be aware that this will not be an incremental deployment, meaning you cannot just deploy the changes made since you deployed from source to destination. The reason is that the timestamp information about what to deploy is stored with the deployment job. As this information only exists on the source system, the first deployment from destination back to source will deploy everything! So the result would be the same as deploying into an empty site collection on the source system. And actually, deploying into an empty site collection would be the better choice, as it avoids problems if changes have been made on the source system.

Problem 5: Deploying partial content without exporting the parent items

When deploying with retaining the object identity (as you can do with content deployment in the central admin) it is not possible to reparent items. Deployment with retaining the object identity requires that the identity of the object is the same on the destination server and the identity is defined by Id, Name and by the Url.

So the parent of each deployed object has to exist on the destination server in order to successfully import the package on the destination server. We have seen that customers are trying to export a specific subtree of the site without exporting the parents. E.g. only a specific variation label without the variation root.

If the parent of the exported objects does not exist on the importing site then the item cannot be imported and the deployment will fail.

=> Ensure that all parents of all content being exported exist on the destination server.
=> Or create a custom export tool that does not retain object identity and changes the parent during import as discussed in Part 3, but be aware of the limitations discussed as Problem 2.
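The first recommendation can be thought of as a pre-flight check before the import. The following Python sketch shows the idea only. It is not part of SharePoint or of the deployment API: missing_parents is a hypothetical helper, and it assumes you have already collected the server-relative URLs of the exported objects and of the objects on the destination by other means. It also only compares URLs, while the real identity check involves the Id and Name as well.

```python
from posixpath import dirname


def missing_parents(exported_urls, destination_urls):
    """Return parent URLs that the export needs but that exist neither
    on the destination nor inside the export package itself."""
    available = set(destination_urls) | set(exported_urls)
    missing = set()
    for url in exported_urls:
        parent = dirname(url.rstrip("/"))
        while parent not in ("", "/"):   # walk up to the root
            if parent not in available:
                missing.add(parent)
            parent = dirname(parent)
    return sorted(missing)


# Only one variation label is exported, without the variation root.
exported = ["/variations/en/news", "/variations/en/news/Pages"]
print(missing_parents(exported, ["/"]))
# ['/variations', '/variations/en']
```

An empty result means that every exported object has its complete chain of parents available on the destination or in the package.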

Problem 6: Deploying partial content with references outside the selected scope

This is similar to Problem 5, except that we assume that the parent objects of the selected object exist in the destination database. In this situation you might assume that no problems should occur. That is not correct. When exporting items in a subtree, by default all referenced objects (like images or documents) will be exported as well, even if these objects are outside the selected scope. In this situation the export package will contain objects which might not have a parent in the destination database.

If the parent of this image or document does not exist on the importing site then the item cannot be imported and the deployment will fail.

=> Ensure that authors do not use resources from other parts of the site collection that are not being exported.
=> Or create a custom export tool that uses the ExcludeDependencies setting to exclude objects outside the selected export scope. See Part 2 for more details.
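To find out whether authors have used resources outside the scope you plan to export, a simple audit can help. The Python sketch below is only an illustration and not SharePoint code: references_outside_scope is a hypothetical helper, and it assumes you have already gathered, by whatever means, a mapping from each page URL to the URLs it references.

```python
def references_outside_scope(scope_url, references):
    """references: dict page URL -> list of referenced URLs.
    Returns, per page inside the scope, the references that leave it."""
    prefix = scope_url.rstrip("/") + "/"
    outside = {}
    for page, targets in references.items():
        if not page.startswith(prefix):
            continue                      # page itself is not exported
        stray = [t for t in targets if not t.startswith(prefix)]
        if stray:
            outside[page] = stray
    return outside


refs = {
    "/en/news/Pages/a.aspx": ["/en/news/Images/x.png",
                              "/PublishingImages/logo.png"],
    "/de/Pages/b.aspx": ["/PublishingImages/logo.png"],
}
print(references_outside_scope("/en/news", refs))
# {'/en/news/Pages/a.aspx': ['/PublishingImages/logo.png']}
```

Every entry in the result is an object that the export would pull in as a dependency and whose parent may be missing on the destination.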