Migrating offline files (CSC) using USMT 4.0

One of the questions that came up after my TechEd New Zealand session on USMT 4.0 was whether USMT migrated the contents of the client-side cache (CSC) used for offline files.  Well, it sounds like it “sort of” does – but by default, it only moves the “dirty” files (those not yet sync’d to the network location).  That’s a decent default I suppose, as the remaining files can be pulled back from the network after the state is restored, and the modified files won’t be overwritten.  So there’s no data loss (always a good thing), but there will be extra network traffic to pull the content down to the cache again.

The actual cache migration is performed by a plug-in to USMT, so the question is whether that plug-in can be influenced to capture everything, instead of just the “dirty” files.  According to https://support.microsoft.com/kb/942960, you can adjust the behavior by telling CSC you want to migrate everything.  (While this article talks about MigWiz.exe, the Windows Easy Transfer Wizard, the underlying engine is basically the same as the one used by USMT.  So the end result of setting “MigrationParameters” should be the same.)

But before you say “great, let’s do it” you need to understand what’s going on behind the scenes.  First, this CSC migration plug-in is called automatically by USMT as part of the Windows manifest processing.  If you search through the Scanstate log you’ll see lots of references to it, with “CscMig” in the log entries.  For example, here is an entry I saw on my computer for a “dirty” file (one that was created while the folder was offline, so USMT needs to capture it):

2011-08-24 22:53:43, Info                  [0x0808fe] Plugin {0db12ccb-7cfd-46b6-b4d1-daa6ff0fbcf7}: CscMig: CscMigpExtractItem(1124):enter: Processing item Created while offline (dirty).txt continueCtx = 00000000003DC480

But the rest of the files in that folder were skipped, as I didn’t have the MigrationParameters registry key set:

2011-08-24 22:53:43, Info                  [0x0808fe] Plugin {0db12ccb-7cfd-46b6-b4d1-daa6ff0fbcf7}: CscMig: CscMigpExtractFile(653):Skipping item File1.txt because it is in sync with remote location. ItemStatus = 00050020
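
If you want a quick tally of what the plug-in captured versus skipped, the log can be scanned for these two kinds of entries.  Here is a minimal Python sketch.  The regular expressions are based only on the log lines shown in this post, so other USMT builds may word them differently.  The function names are my own, and the UTF-8 encoding used when opening the file is an assumption.  Note that the “Processing item” entries also include the server, share and folder levels, not just files.

```python
import re

# Patterns modelled on the log excerpts in this post (assumption: the
# wording is stable; other USMT builds may log these lines differently).
PROCESSING = re.compile(
    r"CscMig: CscMigpExtractItem\(\d+\):enter: Processing item (.+?) continueCtx = "
)
SKIPPING = re.compile(
    r"CscMig: CscMigpExtractFile\(\d+\):Skipping item (.+?) because it is in sync"
)


def summarize_cscmig(lines):
    """Return (processed, skipped) item names found in Scanstate log lines."""
    processed, skipped = [], []
    for line in lines:
        match = PROCESSING.search(line)
        if match:
            processed.append(match.group(1))
            continue
        match = SKIPPING.search(line)
        if match:
            skipped.append(match.group(1))
    return processed, skipped


def summarize_log_file(path):
    """Convenience wrapper for a log on disk (hypothetical helper)."""
    # The encoding is an assumption; errors="replace" avoids a crash on odd bytes.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return summarize_cscmig(handle)
```

Calling summarize_log_file with the path to your Scanstate log returns two lists, one of item names that were processed and one of item names that were skipped as already in sync.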

OK, great, we see what’s happening.  But it’s also worth digging a little deeper and seeing what it did with that original file (something you can see by specifying verbose logging, /v:5):

2011-08-24 22:53:43, Info                  [0x0808fe] Plugin {0db12ccb-7cfd-46b6-b4d1-daa6ff0fbcf7}: CscMig: CscMigGetWorkingDirectory(837):exit: workingDir = <\??\C:\Users\mniehaus\AppData\Local\Temp\tmp6865.tmp\Working\agentmgr\CCSIAgent\CSC>, status = 0x00000000 ( EE = 0 )
2011-08-24 22:53:43, Info                  [0x0808fe] Plugin {0db12ccb-7cfd-46b6-b4d1-daa6ff0fbcf7}: CscMig: CscMigpExtractItem(1124):enter: Processing item \\ continueCtx = 00000000003DAF00
2011-08-24 22:53:43, Info                  [0x0808fe] Plugin {0db12ccb-7cfd-46b6-b4d1-daa6ff0fbcf7}: CscMig: CscMigpExtractItem(1124):enter: Processing item bdddev continueCtx = 00000000003DB9A0
2011-08-24 22:53:43, Info                  [0x0808fe] Plugin {0db12ccb-7cfd-46b6-b4d1-daa6ff0fbcf7}: CscMig: CscMigpExtractItem(1124):enter: Processing item data$ continueCtx = 00000000003DC480
2011-08-24 22:53:43, Info                  [0x0808fe] Plugin {0db12ccb-7cfd-46b6-b4d1-daa6ff0fbcf7}: CscMig: CscMigpExtractItem(1124):enter: Processing item Created while offline (dirty).txt continueCtx = 00000000003DC480
2011-08-24 22:53:43, Info                  [0x0808fe] Plugin {0db12ccb-7cfd-46b6-b4d1-daa6ff0fbcf7}: CscMig: CscMigWrite(446):exit: bytesWritten (743) at offset (0)
2011-08-24 22:53:43, Info                  [0x0808fe] Plugin {0db12ccb-7cfd-46b6-b4d1-daa6ff0fbcf7}: CscMig: CscMigpExtractFile(974):Backup API file content till offset 743

So it used the Windows Backup API to make a backup of the “dirty” file in the client-side cache and placed that backup into a folder in my %TEMP% directory named “tmp6865.tmp\Working\agentmgr\CCSIAgent\CSC”.  So even though I’ve specified hardlinks, there is data copying going on.  In the end, this temporary folder created by CscMig is included in the hardlinked state store, but because these are different files (the backups), you will see twice the disk space consumed.  If you have lots of cached data, you had better have lots of free disk space to hold it.  This isn’t so bad in the default configuration, where it is only grabbing “dirty” files, but what happens if you tell it to back up all files?
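
To get a rough idea of how much data the plug-in copied into that temporary folder during a run, one option is to add up the “bytesWritten” values from the verbose log.  This is a sketch only.  It assumes the value in parentheses is a decimal byte count (the “till offset 743” entry suggests so, but I have not confirmed it), and the function name is my own.

```python
import re

# Modelled on the CscMigWrite entry shown above (assumption: format is stable).
WRITE = re.compile(
    r"CscMig: CscMigWrite\(\d+\):exit: bytesWritten \((\d+)\) at offset \((\d+)\)"
)


def total_bytes_written(lines):
    """Sum the bytesWritten values reported by CscMigWrite log entries."""
    total = 0
    for line in lines:
        match = WRITE.search(line)
        if match:
            # Assumption: the logged count is decimal, not hexadecimal.
            total += int(match.group(1))
    return total
```

Feed it the lines of a Scanstate log captured with /v:5.  Without verbose logging these entries may not be present at all, in which case the result is simply 0.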

No better way to find out what will happen than to try it.  I made the MigrationParameters registry entry as described in the KB article mentioned above and repeated the Scanstate execution.  (No service restart was required.)  Upon checking the log, I can see now that each file in the CSC was backed up into the temporary folder.  Where before it said “skipping item File1.txt”, now it says it’s backing it up:

2011-08-24 23:07:54, Info                  [0x0808fe] Plugin {0db12ccb-7cfd-46b6-b4d1-daa6ff0fbcf7}: CscMig: CscMigpExtractItem(1124):enter: Processing item File1.txt continueCtx = 000000000045C480

And as before, all of these items get backed up into a temporary folder, and then that temporary folder is hardlinked into the state store folder.  So even with hardlinks, if you had 2 GB worth of cached files, you’ll end up with those being doubled (temporarily, until the process is complete and the temporary folder and state store are cleaned up).  It’s actually going to be tripled if you aren’t using hardlinks:  the first copy is the original file, the second copy is the backup, and the third copy is contained in the compressed state store.
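
The arithmetic is simple enough to put into a few lines of Python if you want to plug in your own cache size.  This is a back-of-the-envelope sketch with a function name of my own choosing.  It treats the non-hardlink case as an upper bound, since the state store there is compressed and the third copy will usually be smaller than the original.

```python
def estimate_peak_bytes(cached_bytes, hardlink_store):
    """Rough peak disk usage for the cached files while Scanstate runs."""
    original = cached_bytes      # the files already in the client-side cache
    temp_backup = cached_bytes   # the backups written to the temporary folder
    # With hardlinks the store links to the backups, so no third copy.
    # Without hardlinks the store holds another (compressed) copy; counting
    # it at full size makes this an upper bound.
    store_copy = 0 if hardlink_store else cached_bytes
    return original + temp_backup + store_copy


GB = 1024 ** 3
for hardlinks in (True, False):
    peak = estimate_peak_bytes(2 * GB, hardlinks)
    print(f"hardlinks={hardlinks}: up to {peak / GB:.0f} GB")
```

For the 2 GB example that works out to 4 GB with hardlinks and up to 6 GB without.  In the default configuration, substitute the size of the “dirty” files only, since those are the only ones that get backed up.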

So that definitively answers the question of whether you can get USMT to migrate the complete contents of the client-side cache.  It may not answer the question of whether you should do that, but hopefully the information is useful to help you make that determination yourself.