ISA Server Firewall Service crashes when enabling cache


1. Introduction


 


This is another one of those cases where ISA Server Service mysterious crashes once a day, at the same time and nothing changed in the environment.  This just make me really fell that the lack of communication between the teams that deals with technology is getting far beyond of what should exactly be. Many companies are investing money in putting Security in place by adding layers and layers of technology but they are still missing two important elements: process awareness and change control procedures. The absence of those elements can directly impact availability of the environment. Why availability? Well, I will tell you later when I finish this post.


 


2. Analyzing the Data


 


In this case ISA Server Service was crashing with the following errors:


 



Event Type:       Error


Event Source:    Microsoft ISA Server Web Proxy


Event Category: None


Event ID:                          14197


Date:                01/10/2009


Time:                2:58:03 AM


User:                N/A


Computer:         MYISA


Description:


ISA Server was unable to write content to the cache file.


 


 



Event Type:       Error


Event Source:    Microsoft Firewall


Event Category: None


Event ID:                         14057


Date:                01/10/2009


Time:                2:52:37 AM


User:                N/A


Computer:         MYISA


Description:


The Firewall service stopped because an application filter module C:\Program Files\Microsoft ISA Server\w3filter.dll generated an exception code C0000005 in address 64754CD5 when function CompleteAsyncConnect was called. To resolve this error, remove recently installed application filters and restart the service.


 


The event 14057 is clear about one thing: this was an access violation exception (C0000005) on the filter module W3Filter.dll. Too broad, can be many things including issues with the filter itself, so we need to get a crash dump of this guy to better understand what is going on. Following the approach of one of my posts we can use DebugDiag to attach to wspsrv.exe and get the dump. After getting the dump you can use this other post as an example of how to analyze it. Unfortunately this is one of the cases where the public symbols don’t help that much as you can see below:


 



STACK_TEXT: 


WARNING: Frame IP not in any known module. Following frames may be wrong.


2b37fe10 6476e6df 27441f80 647717fe 275a5558 0x3a6169


2b37fe24 64778438 00000001 2bf579a0 64703de0 W3Filter!DllUnregisterServer+0x45ede


2b37fe90 0046d701 275a5558 00000000 00000040 W3Filter!DllUnregisterServer+0x4fc37


2b37fefc 0046e461 00000000 00000000 00000000 wspsrv+0x6d701


2b37ff20 0046e568 2bf57818 0046e3d7 2b37ff50 wspsrv+0x6e461


2b37ff30 0046d4ba 00000000 00000000 00000000 wspsrv+0x6e568


2b37ff50 00455fd7 2bf578bc 00000000 00000000 wspsrv+0x6d4ba


2b37ff7c 00456c8e 2bf578bc 00000000 00000000 wspsrv+0x55fd7


2b37ffb8 77e64829 00000015 00000000 00000000 wspsrv+0x56c8e


2b37ffec 00000000 00456b26 00000015 00000000 kernel32!GetModuleHandleA+0xdf


 


 


FAULTING_THREAD:  00001d88


 


DEFAULT_BUCKET_ID:  WRONG_SYMBOLS


 


PRIMARY_PROBLEM_CLASS:  SOFTWARE_NX_FAULT


 


BUGCHECK_STR:  APPLICATION_FAULT_SOFTWARE_NX_FAULT_BAD_INSTRUCTION_PTR_CODE_RUNNING_ON_STACK


 


FOLLOWUP_IP:


W3Filter!DllUnregisterServer+45ede


6476e6df 8b4624          mov     eax,dword ptr [esi+24h]


 


SYMBOL_STACK_INDEX:  1


 


SYMBOL_NAME:  W3Filter!DllUnregisterServer+45ede


 


FOLLOWUP_NAME:  MachineOwner


 


MODULE_NAME: W3Filter


 


IMAGE_NAME:  W3Filter.dll


 


STACK_COMMAND:  ~50s; .ecxr ; kb


 


BUCKET_ID:  WRONG_SYMBOLS


 


FAILURE_BUCKET_ID:  W3Filter.dll!base_address_c0000005_SOFTWARE_NX_FAULT


 


The !analyze result showed above will make you under the impression that W3Filter.dll is the culprit and it is exactly the opposite, this guy is only a victim.


 


3. Conclusion


 


After deeply analyze the dump using the private symbols we got to a conclusion that someone was locking the cache file when the Web Filtering was trying to write to it. Guess who was locking it? Once upon a time there was a system administrator that was following a plan that he received from his management to install backup software in all Windows Servers, so he installed this backup software on ISA and configured a Job to run every night…


 


The backup software was backing up the whole server (all hard drivers) including the driver where the ISA Cache was located. For this reason customer was saying that the issue just happened when the ISA Server Cache was enabled, if they disabled the cache the issue didn’t happen. Well make sense and the recommendation to exclude cache from backup as not new, as a matter of fact the article that recommends this is out there since October 2004, which is the following one:


 


Event ID 5, event ID 14079, and event ID 14176 are logged in the Application log on your Internet Security and Acceleration Server computer


http://support.microsoft.com/kb/887311


 


Now the answer for: Why Availability? Because the ISA Server service in this case was crashing due and addition of a new product in the ISA Box without testing it in a lab environment (where the change control procedure is?). The Windows OS maintenance was responsibility of the System Administrator that with all the good intentions configured the Backup Software to back it up the whole hard drive. However the Firewall Admin wasn’t aware of this addition since it was out of the scope of his duty (where the process awareness is?) and he swear since the begging that nothing change in the environment and ISA was crashing from nothing L.  But, this story had a happy end at least, so let’s finish this post with a smiling face J.


 

Comments (0)