The Case of the Failed File Compression


The other day Bryce tried to use Explorer’s Send To Compressed (zipped) Folder feature, seen below, to package up his latest Process Monitor source code updates to send me.



Instead of presenting a compression progress dialog followed by an opportunity to edit the name of the resulting compressed file, Explorer aborted the compression with this error:



Bryce was perplexed. The error didn’t seem to make any sense because he obviously had read permission to the files in the selection, which he’d just finished editing, and compressing files shouldn’t involve some kind of search that could result in a file not being found.  He retried the compression operation, but got the same error, this time after a different number of files had finished compressing.


I happened to walk into his office at this point and he showed me the behavior by trying a few more times, all with the same outcome.  Now both of us were perplexed. It was time to investigate, and the tool we called on for the job was, somewhat ironically, Process Monitor.


We launched Process Monitor, reproduced the failure, stopped the capture, and scanned through the thousands of operations in the trace looking for errors. We saw a slew of NOT FOUND errors near the start of the log, which are the generally innocuous result of an application checking for the pre-existence of a file. In fact, there were literally hundreds of them near the beginning of the log, all of which were queries for the file into which the compressed files would be placed:



That was disturbing, but not directly related to our troubleshooting effort, so I filed it away to look at later.


Several hundred events later in the trace, we came across a SHARING VIOLATION error that bore a closer look:



When a process opens a file it can specify if and how it wants to share the file with other processes while it has the file opened. The three types of sharing are read, write and delete, and each is represented with a flag that a process passes to the CreateFile API. In the operation that failed, Explorer didn’t pass any of the flags, indicating that it didn’t want to share the file, as seen in the ShareMode field:



For an open to succeed, the sharing mode of the opener must be compatible with the sharing allowed by a process that already has the file opened, so the explanation for the error was that another process already had the file opened.


Looking back at the trace, the open operation immediately preceding the one with the error is an open of the same file by a process named Inort.exe. Inort’s close of the file isn’t visible in the screenshot because it comes long after Explorer’s failed attempt to open the file. That confirmed that Explorer’s unwillingness to share the file conflicted with Inort having the file open, despite the fact that Inort specified read, write and delete sharing in its open of the file.
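The sharing rule can be modeled in a few lines of Python. This is a rough sketch, not how Windows implements the check. The access each process requested is an assumption, since the text above gives only the share modes, and the names `Access`, `Share`, `Handle` and `conflicts` are made up for the illustration. The check runs in both directions: the new opener's access must be allowed by the existing handle's share mode, and the existing handle's access must be allowed by the new opener's share mode.

```python
from dataclasses import dataclass
from enum import IntFlag


class Access(IntFlag):
    # Simplified access rights for this model (not the real Win32 access masks).
    NONE = 0
    READ = 1
    WRITE = 2
    DELETE = 4


class Share(IntFlag):
    # Values mirror FILE_SHARE_READ, FILE_SHARE_WRITE and FILE_SHARE_DELETE.
    NONE = 0
    READ = 1
    WRITE = 2
    DELETE = 4


@dataclass(frozen=True)
class Handle:
    owner: str
    access: Access
    share: Share


def conflicts(existing: Handle, new: Handle) -> bool:
    """Return True if opening `new` would collide with `existing`."""
    # The new opener asks for access the existing handle does not share.
    new_blocked = bool(int(new.access) & ~int(existing.share))
    # The existing handle holds access the new opener refuses to share.
    existing_blocked = bool(int(existing.access) & ~int(new.share))
    return new_blocked or existing_blocked


# Assumed: Inort opened the file for reading and shared everything.
inort = Handle("Inort.exe", Access.READ, Share.READ | Share.WRITE | Share.DELETE)
# Explorer passed no share flags (ShareMode: None); read access is assumed.
explorer_exclusive = Handle("Explorer.EXE", Access.READ, Share.NONE)
# Hypothetical variant: the same open, but willing to share reads.
explorer_sharing = Handle("Explorer.EXE", Access.READ, Share.READ)

print(conflicts(inort, explorer_exclusive))  # True
print(conflicts(inort, explorer_sharing))    # False
```

In this model, Inort sharing everything doesn't help. Explorer's refusal to share conflicts with the read access Inort already holds, and that alone is enough to fail the open.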


Process Monitor had closed another case: Inort holding the file open when Explorer tried to open it was the cause of the sharing violation and almost certainly the reason for the misleading error message. Next we had to identify Inort so that we could come up with a fix or workaround. Process Monitor also answered that question with its image tooltip:



eTrust, Computer Associates’ Antivirus scanner, was apparently opening the file to scan it for viruses, but interfering with the operation of Explorer. Antivirus should be invisible to the system, so the error revealed a bug in eTrust. The workaround was for Bryce to set a directory filter that excludes his source directories from real-time scanning. 


I couldn’t reproduce the error when I went back to my office, so I suspected that I had a different version of Inoculan on my system than Bryce. Process Monitor’s process page on the event properties dialog for an Inort.exe event showed that Bryce had version 7.01.0192.0001 and I had the more recent 7.01.0501.000:



Why we have different versions isn’t clear since we’re both using images deployed and managed by Microsoft IT, but it appears that Computer Associates has fixed the bug in newer releases.


Now I turned my attention back to the inefficiencies of Explorer’s compression feature. I captured a Process Monitor trace of the compression of a single file and counted the associated operations. Just for this simple case, Explorer opened the target ZIP file 14 times, 12 of those before it had actually created the file and therefore with NOT FOUND results, and performed directory lookups of the target 19 times. It was also wasteful with the source file, opening it 28 times and querying the file’s basic properties 17 times. It’s not like Explorer doesn’t give eTrust plenty of opportunities to cause sharing problems.
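Counting like this is easy to script if you save the trace as CSV from Process Monitor. The Python sketch below assumes the default export column headers (Process Name, Operation, Path, Result). The sample rows, paths and the `tally_operations` helper are hypothetical, made up to keep the example self-contained. They are not data from the trace described here.

```python
import csv
import io
from collections import Counter

# Hypothetical rows in the shape of a Process Monitor CSV export.
SAMPLE = r'''"Time of Day","Process Name","PID","Operation","Path","Result","Detail"
"10:00:00.0000001","Explorer.EXE","1234","CreateFile","C:\src\procmon.zip","NAME NOT FOUND","Desired Access: Read Attributes"
"10:00:00.0000002","Explorer.EXE","1234","CreateFile","C:\src\procmon.zip","NAME NOT FOUND","Desired Access: Read Attributes"
"10:00:00.0000003","Explorer.EXE","1234","CreateFile","C:\src\procmon.zip","NAME NOT FOUND","Desired Access: Read Attributes"
"10:00:00.0000004","Explorer.EXE","1234","CreateFile","C:\src\procmon.zip","SUCCESS","Desired Access: Generic Write"
"10:00:00.0000005","Inort.exe","4321","CreateFile","C:\src\main.c","SUCCESS","Desired Access: Generic Read"
"10:00:00.0000006","Explorer.EXE","1234","CreateFile","C:\src\main.c","SHARING VIOLATION","Desired Access: Generic Read"
'''


def tally_operations(csv_text: str, process: str = "Explorer.EXE") -> Counter:
    """Count (Operation, Path, Result) triples logged for one process."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Compare process names case-insensitively, as Windows does.
        if row["Process Name"].lower() == process.lower():
            counts[(row["Operation"], row["Path"], row["Result"])] += 1
    return counts


# Show the most frequently repeated operations first.
for (operation, path, result), n in tally_operations(SAMPLE).most_common():
    print(n, operation, path, result)
```

To run it against a real export, read the file's text and pass it in. Opening the file with `encoding="utf-8-sig"` is a reasonable precaution in case the export starts with a byte order mark.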


To verify that Explorer itself was at fault, and not some third-party extension, I looked at the stacks of various events by selecting each event and pressing Ctrl+K to open the event properties dialog to the stack page:



Zipfldr.dll, the Explorer file compression DLL, was in most of the stack traces, meaning that the compression engine itself was ultimately responsible for the waste. Further, the number of repetitious operations explodes when you compress multiple files. There are clearly easy ways to improve the algorithm, so hopefully we’ll see a more efficient compression engine in Windows 7.


Update: I’ve learned that the compression engine has been updated in Vista SP1 to perform fewer file operations.


On a closing note, if you’d like to catch me at my next public speaking engagement, come to Wintellect’s Devscovery conference in Redmond, August 14-16, where I’m delivering a keynote on Vista kernel changes.


Comments (45)

  1. Anonymous says:

    Explorer is woefully slow at manipulating ZIP files, and this could be one of the reasons. At least in Windows Vista, it can handle 64-bit ZIPs and 64-bit files in a ZIP, which it can’t in Windows XP.

    I thought I’d reported its slowness at extracting files in the Windows Vista beta, but since the main feedback site wasn’t open to me as a public beta user, I couldn’t see whether my feedback had even been submitted correctly. If you want a public beta to be useful, you have to make it possible for public beta users to submit feedback and continue the conversation.

    For the moment, WinZip is a much better solution (although its context menu entries are a pest when you’ve copied a large file in a Remote Desktop session, have the disk sharing option turned on, and right-click an Explorer window in your own desktop, because it downloads the file to work out which icons to show!)

  2. Anonymous says:

    Actually, I’m in Redmond (I’ve been here since the Winternals/Sysinternals acquisition in late July 2006). There are still large sections of Microsoft on eTrust, including the NTDEV domain.

  3. Thomas Muders says:

    It’s not really new that this feature in Windows sucks. I know only XP but you can make the test yourself: extract an archive with a lot of files once in e.g. WinRAR and then with Explorer. With Explorer it’s unbelievably slow which seems to be caused by the lot of unnecessary operations you found. Maybe this is by design to still give vendors like WinZIP or WinRAR a reason to sell their product because it’s vastly superior to the Windows built in feature…

  4. Pontus says:

    I like the fact that Microsoft is NOT using OneCare for its employees:) . Also I find your articles where you troubleshoot stuff very interesting.

  5. awkse says:

    I can’t say I’m surprised by your finding.

    I gave up on the compression built into XP looong ago since it’s so unbearably slow.

  6. Wowexec says:

    Brilliant diagnosis!!!

  7. Claus Valca says:

    Am I correct in assuming this was observed on a Vista OS?  The screen-captures seem to suggest that.

    I wonder if XP’s built-in compression is any more or less efficient by comparison to Vista.

    I’ve always gone ahead and used third-party compression utilities on my XP/Vista systems simply out of habit.  But having one available on the system by default makes working on end-user’s workstations a bit easier.

    This is an interesting look at that process.

    Thanks for the always fascinating detective work!

  8. Nick says:

    Mark, we’ve seen that problem before as well when we’re trying to compress files for our nightly polling process on our XP machines.  They’re used as POS terminals and have McAfee Virusscan loaded.  We kept getting sites that failed polling or had corrupt files until we figured out that McAfee was trying to open the files and scan them at the same time our polling process was running.  McAfee has yet to fix the bug in their app, so we just set an exclusion on the folder as well.

  9. Paul Williams says:

    Not the first time you have found explorer performing the same file operation multiple times.  At least it’s only into double figures; I seem to recall in a previous blog you found an instance of explorer performing the same operation 100’s of times.

    Just think if Microsoft found and removed all this waste, we would probably be happily running XP on 500 MHz machines and Vista on 1 GHz machines.

    Keep up the good work Mark.

  10. Dominik Weber says:

    As always – excellent write-up.

    BTW

    http://blogs.technet.com/photos/markrussinovich/images/1702272/original.aspx

    is broken. No picture there!

  11. Hoe Shmoe says:

    Uhh, Mark, maybe its time to start using source control. Isn’t that one of those standard Microsoft Quality Gates anyway?

  12. Jonathan says:

    Wow, that sucks. Note to self: Don’t compress directly to remote network shares!

    Now that you work at Microsoft, I presume you get to look at the source code, so you can see the actual offending code. Do you file bugs against Windows when you find such issues?

  13. Zaki Mirza says:

    I had almost exactly the same issue with WinRAR at most of the workstations at university labs. Whenever i asked it to compress a given set of files, it returned a dialog saying "no files to compress". There was a workaround to that which some students invented, but having read your case study of windows compression, i hope i will be able to address the issue downright :) Im an avid user of process explorer and process monitor at my home mostly using it to check activity of my PC and my software projects, though now you have given me an insight into how to troubleshoot with your magical softwares:) I wish i could attend your presentations on 14th/16th but im just like…. 10,000 miles away :( I hope we cud get a recording of it or notes online (youtube anyone?)

  14. James says:

    XP’s compression engine is unbelievably inefficient.  Decompressing the Boost source code archive (a 23 MiB zip file, 85 MiB uncompressed, containing ~12000 files) takes 1.5 minutes on my machine with WinZip.  Using XP’s decompressor, it takes a mind-numbing 45 minutes.

    Just how improved is it in Vista?  Is it on par with other compression software now?

  15. Shaun says:

    Thanks Mark, that was very informative.  I love those kind of examples as I’m an avid user of all your tools.  Cheers.

  16. Pete says:

    Interesting that Microsoft IT do not use their own ‘One Care’ anti-virus solutions…

    When is Vista SP1 due?

    Mark – Thanks for another interesting blog. Keep up the good work.

  17. fat_hot says:

    Sorry for the grammar pedantry: "bared a closer look" should be "bore a closer look".

  18. JC says:

    Great story as usual !

  19. Hugh McColl says:

    Hi Mark,

    A couple of remarks caught my attention:

    "Antivirus should be invisible to the system, so the error revealed a bug in eTrust."

    and your observation that the bug was fixed in a later release.

    I am curious as to what APIs are available to AV software to allow it to open or otherwise access files without interfering with the operation of other applications? We are a software vendor and some of our customers have reported issues where AV software interferes with our software in a similar way to the scenario outlined above. Eventually we resorted to retrying file opens a certain number of times when access violations were encountered.

    Any further insight into this issue would be valuable.

  20. SteelBytes says:

    I’ve noticed that Windows Search in Vista also can interfere with regular file ops  :-(  

    (I’ve filed a bug about this on connect.microsoft.com)

  21. Lucke says:

    I always install Total Commander as the first program after a clean install and never, ever, use Explorer for anything filerelated. Will not start in the near future either by the looks of things…

    BR

  22. Paul Winterburn says:

    How out of date are Microsoft with product patches? The CA cumulative fix from build 192 to 501 was released on 1 October 2005.

  23. Rob says:

    I don’t know if this is Mark’s blog so much as Process Explorer’s blog. :P

  24. Cosmin says:

    I am sure Microsoft did a great job having you on their team but, may we have your discoveries/improvements included into Windows OS sooner than Windows 7? I think is pretty obvious Vista SP2 could include this fix and not wait for Windows 7!

  25. Jim says:

    Try winzip. It works great.

    Even with "fewer" file operations, it will still be no where as fast as winzip.

  26. Zeroes says:

    Mark thanks for this note….

    I know what any soft with drivers (Antivirus/Firewall/…) may cause problem with system.

    And i too have randevouz with problem Kaspersky Antivirus 4-5 version, Outpost Firewall old version…

  27. cryptomancer@yandex.ru says:

    Hello, dear Mark !

    First, i thank you very much for your Sysinternals Suite – it’s very useful toolset for almost every Russian sysadmin.

    Now are the questions:

    1) If i have a tree full of zero-filled files (like ones created by p2p clients but not yet completed with download), how can i make them all sparse and reclaim zero-filled space until completion ?

    2) How can i tell the system to create ALL future files under given folder as sparse ?

    3) What archiver can compress and extract sparse files without "unrolling" them ?

    If there is no easy way to do so, may you write one more utility for your awesome suite ? All p2p users will be thankful for you !

  28. Jason Gurtz says:

    It’s amazing to me how desktop AV products still cause these same types of problems.  Even more amazing is that people still use them.

    When was the last time someone got a virus from a floppy disk or similar?  I don’t think I’ve seen a virus/Trojan/Worm in the last 5 years that came from other than an email or browser window (easy to stop at the smtp gateway and via IPS hardware).  Sure there’s network based attacks, but who in their right mind runs a windows box without putting it behind a hardware firewall.

    Malware has changed, time for the AV folks to catch up…

    P.S. Isn’t it quite humorous that it’s CA and NOT Forefront in use here?

  29. Tom says:

    Interesting analysis, as usual.  Thank you for sharing it!

    Regarding the astounding inefficiency of Explorer’s ZIP compression, I rather optimistically ascribed it to MS throwing a bone to the likes of those who sell WinZip and WinRAR.  "Why should I buy your software when I could just use what’s built into Windows?"  "Because that stuff is horribly slow – just try compressing more than a handful of files.  You’ll be there all day!"

    It is interesting to hear that this problem might actually be solved in Vista SP1.

  30. Josh says:

    Excellent, as always Mark. I think that this particular entry in your blog illuminates some of the legitimate problems antivirus solutions can provide. But it is great that it has been recognized and fixed in later versions!

    I also like that us peons can try the stuff you do, for example, I just watched Explorer and the zipfldr dll perform the incredible amount of file operations on my own computer.

  31. Shri Ganesh says:

    This is great! You use procmon so well. Can you please tell me where can I get more information on the actions generated in the "Result" by procmon? For example, NAME NOT FOUND, SHARING VIOLATION, NAME COLLISION. It would be great if the information on these topics are in-depth.

    Thanks…

  32. Nick says:

    Just thought I’d throw out that I enjoy your blog and appreciate the time you put into it.

    Thanks.

  33. FrancoK says:

    Mark, you are great as usual.

    Now, what about the version of Zipfldr.dll you were using and the new Vista SP1 Zipfldr.dll?

    Do you know if MS is going to fix it in XP SP2c as well?

  34. Triangle says:

    Agreed with everyone else that the built in support for file compression is unbearably slow.

    I’ve also noticed that when I right click on a file in explorer, and mouse over "Send to", explorer will freeze for a few seconds before the options come up. It might be related to this.

    Anyway, as usual great post! I love seeing those stack traces.

  35. Lavin says:

    Hello Mark,

           I’m a regular visitor to your blog and a big fan of all your work! I would love to see more crash dump investigations.

    Thank You!

  36. Adam says:

    I am actually surprised you guys are.

    1. Not using source control for snapshots; and

    2. Using the built-in zip compression

    The compression / decompression is good enough for the odd compress-decompress task, but it just isn’t up to par with other offerings.

    It seems to easily take 20-30% longer than 7-zip to extract a zip archive.

  37. Mark W says:

    Excellent blog and toolset, thank you – I must use ProcMon more frequently, but get a bit overwhelmed by the mass of info, so FileMon is still my regular file i/o spy. Interesting that although Explorer is reporting the excessive file i/o, it is the compression program which is inefficient. I had similar probs trying to convince a software supplier that it was their product and not the Windows DLL which was inefficient. I’m sure many programmers curse the Russinovich file i/o tools for exposing their shoddy coding.

  38. Desi says:

    This blog was informative and interesting like any of your other blog. Few mins here help me gain immense knowledge. Thanks for sharing it with the world.

    FYI, I have been strict user of almost all of your tools for past 3-4 years :)

  39. tras says:

    For the record, Microsoft does in fact use its own antivirus software… just not OneCare, per se.  

    OneCare is a consumer product, in the same way that "Norton Antivirus 2008" is for home users.  The corporate product is known as "Forefront Client Security", which is analogous to "Symantec Antivirus Enterprise Edition" and is in heavy rotation throughout Redmond across 30,000+ systems.

    So why is Mark using eTrust still?  Well, most of the people outside of Redmond aren’t using FCS yet.  It’s just not deployed everywhere yet as it was RTMed just earlier this year, and it’s not likely that Microsoft will ever use its own AV/AS exclusively being that eTrust is doing the job well and in technologies like these Microsoft often uses multiple products, in the same way that there are HP, Dell, IBM, and Toshiba computers in rotation within IT.

  40. Robbie Mosaic says:

    On my work computer Innoculate Kill is also used as the default anti-virus program, which is a quite old version. It also interferes with Visual C++ 6.0 (file share violation) and Recycle Bin (refreshes a lot). It’s quite interesting.

  41. MikB says:

    Your blog suggests an approach to a number of other errors/hangs that I’ve been having (XP of course). Thanks!

    BTW, just replaced Winzip (eval period expired) with 7-Zip. MASSIVELY faster!

  42. Tim Steele says:

    Second vote for 7-Zip. Free, open source, handles RAR files, and so much faster than the ZIP capability built in to Windows.

    http://www.7-zip.org/

  43. TREUTRONICS says:

    Hi, very helpful for me,

    the 8.1 Version of ETrust (from BPS 3.1 e.g.) has the same Problem. I excluded the zip Files from scanning, now it works again. Could be a Solution.

    Greetings

  44. Arioch says:

    > to still give vendors like WinZIP or WinRAR a reason to sell their product

    …and 7-zip/jZip are evil monsters to wipe them out :-)

    I think, no one would care about built-in archiving speed, it is not big problem nowadays. The fact that only ZIP is supported in Windows of all the compression formats – that is main advantage of 3rd party vendors

  45. Richard FDisk says:

    interesting info;

    think the copy into a .zip failure is scary

    I’ve had this happen on more than one occasion

    a> create new .zip folder

    b> grab & drag more than 10 large files

    c> select "move"

    d> move completes all but the last file

    e> dialog pops up "archive is corrupt"

    f> all files lost not in recycle bin a lot of the files were larger than the recycle bin could hold when set at 1%.

    happened on an XP Pro machine with McAfee running (office PC)

    happened also on a stand alone never connected to the internet clean XP Pro machine with no AV running or installed (My Home PC)

    so I ordered WinZip for myself and told IT to get me WinZip for the office PC also and it hasn’t happened since.