Yes, We Recycle!…That Includes AppPools


And I am not just talking about Aluminum and Polystyrene.   Application Pool recycling is one great feature of IIS6 but how do you know if it is helping you or hurting you?  It’s great for availability and reliability but it can also be masking real problems with code running in your applications. If you are recycling too quickly based on memory limits you may have a memory leak that should be looked at, if you are recycling due to ping fails or other errors, you may have a more serious issue to look at. 


 


  We monitor our IIS worker process uptime to measure general health of our applications and make capacity planning decisions (how many apps per app pool, etc).  Process startup and .Net framework initialization does have a performance cost,  If you are recycling too frequently, this can affect your web servers throughput.  We’ve had our share of misbehaving applications that have recycled as frequently as every 5 minutes or just flat out crashed the worker process.   You can set Application pools to recycle based on time, number of requests, virtual and private memory limits.  Application pools may also recycle due to health reasons, Ping Fail, Fatal Communications Error, etc., or there’s always the admin recycle or the config change recycle.  We have many applications (sometimes hundreds) spread out across several app pools.  We use our WP uptime data to check if we have a misbehaving application out there (maybe someone published some bad code) or make decisions on how many app pools we can run, which applications to put into separate app pools and how many applications an App Pool should have.


 


How do you verify and monitor that is really happening?


 


Well, the first thing you have to do is log all recycle events before you can start collecting and analyzing the data.  To enable recycle event logging, if you haven’t already:


 


cscript adsutil.vbs Set w3svc/AppPools/LogEventOnRecycle 255


 


This enables all recycle events for all AppPools to be logged in the system event log.


 


More information  on Application Pool recycling events and logging is available in the following KB article:


 


332088  How to modify Application Pool Recycling events in IIS 6.0


http://support.microsoft.com/default.aspx?scid=kb;EN-US;332088


 


(note: there may be an error in the above article, I use event ID 1117 for private memory)


 


Once you have logging enabled you can start collecting this info and creating baselines for your application pools.  One of our favorite tools, Log Parser, is great for collecting this data. 


 


For more information on Log Parser:


http://www.microsoft.com/technet/scriptcenter/tools/logparser/default.mspx


 


 


You can use a simple log parser query to get the recycle events:


 


Logparser “Select top 100 to_string(TimeGenerated, ‘MM/dd/yyyy hh:mm:ss’) as dateTime from \\SERVERNAME\System where SourceName = ‘W3SVC’ and EventID in (1009;1010;1011;1074;1077;1078;1079;1080;1117) and Message like ‘%DefaultAppPool%'”


 


This query will give all the recycle and process failure events for the application pool named like defaultapppool. I’ve included a few other events in this query that aren’t necessarily recycle events but worker process failure events that do affect the uptime of your worker process.


 


Here is a list of events that I am interested in:


1009 = Worker process Terminated UnExpectedly


1010 = Worker process failed a Ping


1011 = Worker process suffered a Fatal Comummincations Error


1074 = Recycled based on the defined Time Limit


1077 = Recycled based on the Virtual Mem Limit


1078 = An ISAPI reported unhealthy


1079 = An Admin requested a recycle


1080 = A Config Change required a recycle


1117 = Recycled based on Memory limit


 


That’s a good start but I want to see why we recycled as well, so this query is a little better:


 


Select top 100 to_string(TimeGenerated, ‘MM/dd/yyyy hh:mm:ss’) as dateTime,


case


            EventID


            when 1009 then ‘UnExpEnd’


            when 1010 then ‘PingFail’


            when 1011 then ‘FatalComErr’


            when 1074 then ‘TimeLimit’        


            when 1077 then ‘VMem’


            when 1078 then ‘ISAPIUnHealth’


            when 1079 then ‘Admin’            


            when 1080 then ‘ConfigChange’


            when 1117 then ‘PMem’


end as Reason


 


     from \\SERVERNAME\System


            where SourceName in (‘W3SVC’;’WAS’) and


        EventID in (1117;1080;1079;1078;1077;1074;1011;1010;1009) and


            Message like ‘%defaultapppool%’ and


            TimeGenerated > to_timestamp(’01-01-2006 00:01:01′,’yyyy-MM-dd hh:mm:ss’)


 


Save this as wpuptime.sql and run “logparser file:wpuptime.sql” and this will give you an ouput like:


 


dateTime            Reason


——————- ————


09/23/2006 01:18:51 ConfigChange


09/23/2006 01:19:52 VMem


09/23/2006 01:20:04 ConfigChange


09/28/2006 09:28:03 Admin


10/04/2006 18:24:16 ConfigChange


10/05/2006 07:50:02 PingFail


10/05/2006 08:21:26 PingFail


 


That’s better, I can see event date/time and reason but I still want more info, like what is the average up time for the app pool that I am interested in?  What was the shortest recycle time, what was the longest, what are the count of each events?  To go further with this, I turn to vbscript and invoke the log parser COM object.   To use the log parser COM object you must register the DLL on the machine where you will run the script. 


 


Regsvr32 logparser.dll


 


Here’s an example script that uses the above query and calculates the info, it doesn’t have much error checking and output isn’t exactly pretty, but I did say it was an example right?


 


‘ WPUPTIME.VBS – Queries event logs for W3SVC Recycle events (last 100) and calculates up time stats between events


‘ Requires LogParser.dll to be registered


‘ 1009 = TermUnExp


‘ 1010 = PingFail


‘ 1011 = FatalComErr


‘ 1074 = TimeLimit


‘ 1077 = VMem


‘ 1078 = ISAPIUnhealthy


‘ 1079 = Admin


‘ 1080 = ConfigChange


‘ 1117 = PMem



Dim cConfChg, cAdmin, cISAPI, cVMem, cPMem, cTimeLimit, cFatalComErr, cPingFail, cTermUnExp


Dim myQuery


Dim myInputFormat


Dim MinDate


Dim recordSet


Dim lastD, curD, oldD, firstD


Dim uptime_avg, uptime_mins, uptime_mins_last


Dim reccount


Dim timetotal


Dim MinTime


Dim MaxTime


Dim strReason, strReasonLast


Dim strComputer


Dim strAppPool


Dim szQuery


 


WScript.Echo


 


ParseCommandLine()


 


MinDate = CStr(“01-01-2006 00:01:01”)


 


Set myQuery = CreateObject(“MSUtil.LogQuery”)


Set myInputFormat = CreateObject(“MSUtil.LogQuery.EventLogInputFormat”)


myInputFormat.direction = “BW”


 


Call QueryServer(strComputer, strAppPool)


           


CalcUpTime()


                                               


Wscript.Echo strComputer, “,” , strAppPool, “Avg Uptime:”, uptime_avg, “minutes”


WScript.Echo


WScript.Echo “Count of Recycle Events”


WScript.Echo “VirtualMem:” + Chr(9)+  CStr(cVMem)


WScript.Echo “PrivateMem:” + Chr(9) +  CStr(cPMem)


WScript.Echo “TimeLimit:” + Chr(9) + CStr(cTimeLimit)


WScript.Echo “PingFail:” + Chr(9) + CStr(cPingFail)


WScript.Echo “Admin:” + Chr(9)+ Chr(9) + CStr(cAdmin)


WScript.Echo “FatalComErr”  + Chr(9) + CStr(cFatalComErr)


WScript.Echo “TermUnExp” + Chr(9) + CStr(cTermUnExp)


WScript.Echo “ConfChg” + Chr(9)+ Chr(9) + CStr(cConfChg)


WScript.Echo “ISAPI” + Chr(9) + Chr(9) + CStr(cISAPI)


WScript.Echo


 


Set myInputFormat = Nothing


Set myQuery = Nothing


set Locator = Nothing


 


WScript.Quit(0)


 


Sub CalcUpTime()


 


WScript.Echo(“RecycleTime        Uptime(mins) Reason”)


 


reccount = 0


timetotal = 0


MinTime = 1000


MaxTime = 0


cConfChg = 0


cAdmin = 0


cISAPI = 0


cVMem = 0


cPMem = 0


cTimeLimit = 0


cFatalComErr = 0


cPingFail = 0


cTermUnExp = 0


 


Do While recordSet.atEnd() <> True


           


            Set record = recordSet.getRecord()


            curD = CDate(record.GetValue(0))


            strReason= CStr(record.GetValue(1))


           


            If reccount <> 0 then    


                        uptime_mins = DateDiff(“n”, curD, lastD)


                        WScript.Echo lastD,      , uptime_mins, strReasonLast


                        timetotal = timetotal + uptime_mins


           


                        if MinTime > uptime_mins Then


                                                MinTime = uptime_mins


                        End if


                       


                        if MaxTime < uptime_mins Then


                                                MaxTime = uptime_mins


                        End if


                       


            Else


                        firstD = CurD


            End IF


            Select Case strReason


                        Case “VMem”                cVMem = cVMem + 1


                        Case “PMem”                cPMem = cPMem + 1


                        Case “Admin”    cAdmin = cAdmin + 1


                        Case “ISAPIUnHealth” cISAPI = cISAPI + 1


                        Case “TimeLimit” cTimeLimit = cTimeLimit + 1


                        Case “PingFail”  cPingFail = cPingFail + 1


                        Case “FatalComErr”  cFatalComErr = cFatalComErr + 1


                        Case “UnExpEnd”  cTermUnExp = cTermUnExp + 1


                        Case “ConfigChange”                cConfChg = cConfChg + 1


            End Select


           


            lastD=curD


            reccount = reccount + 1


            strReasonLast = strReason


            uptime_mins_last = uptime_mins


           


            recordSet.moveNext()


 


            If Err.number <> 0 Then


                        WScript.Echo “Error in CalcUpTime: “, Err.Description


                        Err.Clear


                        Exit Do


            End If


           


Loop


 


                       


oldD = lastD


uptime_avg = Round(timetotal / (reccount-1))


 


 


WScript.Echo “”


WScript.Echo “Time Span: “ + CStr(oldD) + ” through “ + CStr(firstD )


 


WScript.Echo “MinTime:”, MinTime


WScript.Echo “MaxTime:”, MaxTime


WScript.Echo “”


 


 


End Sub


 


Sub QueryServer(tmpComp, tmpAppPool)


 


szQuery = “Select top 100 to_string(TimeGenerated, ‘MM/dd/yyyy hh:mm:ss’) as dateTime, case EventID when 1077 then ‘VMem’ when 1117 then ‘PMem’ when 1079 then ‘Admin’ when 1078 then ‘ISAPIUnHealth’ when 1074 then ‘TimeLimit’ when 1010 then ‘PingFail’ when 1011 then ‘FatalComErr’ when 1009 then ‘UnExpEnd’ when 1080 then ‘ConfigChange’ end as Reason” + _


     ” from \\” + tmpComp + “\System where SourceName in (‘W3SVC’;’WAS’) and EventID in (1117;1080;1079;1078;1077;1074;1011;1010;1009) and Message like ‘%”+tmpAppPool+ _


            “%’ and TimeGenerated > to_timestamp(‘”+MinDate+“‘,’yyyy-MM-dd hh:mm:ss’)”


 


WScript.Echo “Querying”, tmpComp, tmpAppPool, ” events”


WScript.Echo


 


Set recordSet = myQuery.Execute(szQuery, myInputFormat)


 


If Err.number <> 0 then


            WScript.Echo “Could not execute query in QueryServer: “, Err.Description


End if


 


End Sub


 


Sub ParseCommandLine()


            Dim vArgs


 


            set vArgs = WScript.Arguments


 


            if vArgs.Count <> 2 then


                        DisplayUsage()


            Else


                        strComputer = vArgs(0)


                        strAppPool = vArgs(1)


            End if


End Sub


 


Sub DisplayUsage()


 WScript.Echo “Usage:  cscript.exe “ & WScript.ScriptName & ” ServerName AppPoolName”  & vbLF & vbLF & _


     “Example: “ & vbLF & _


      WScript.ScriptName & ” MyServer DefaultAppPool” & vbLF


      WScript.Echo


 WScript.Quit(0)


End Sub


 


The above script takes two parameters to run – the servername and the app pool name to match.  The query uses a LIKE comparison, so make sure your app pool name given is unique enough to get correct data.  Here is the sample outoput (abbreviated):


 


RecycleTime        Uptime(mins) Reason


10/18/2006 1:49:40 PM       304 PMem


10/18/2006 8:45:56 AM       535 PMem


10/17/2006 11:50:04 PM       592 PMem


10/17/2006 1:58:28 PM        842 PMem


10/16/2006 11:56:41 PM       14973 Admin



… (cut for brevity)



9/27/2006 9:06:14 AM         14 VMem


9/27/2006 8:52:02 AM         96 VMem


 


Time Span: 9/27/2006 7:16:08 AM through 10/18/2006 1:49:40 PM


MinTime: 1


MaxTime: 14973


 


MyServer , MyAppPool Avg Uptime: 309 minutes


 


Count of Recycle Events


VirtualMem:     81


PrivateMem:     6


TimeLimit:      0


PingFail:       11


Admin:          1


FatalComErr     0


TermUnExp       0


ConfChg         1


ISAPI           0


 


It prints out the date/time of the event, minutes elapsed since the previous event and the reason.  Followed by the time span of the events collected (first/last event used).  It gives you minimum, maximum and average uptime of the application pool and a count of the various events the query found. 


 


If you see you are recycling based on your time limits, you probably have a pretty healthy app.  Looking at the above output we can see the majority of the recycling is due to memory related limits with a few ping fails.  This app pool may be have a memory leak, may be using ASP.Net cache excessively or could just have too much content to be contained in a single app pool process (ASPX content is compiled into DLL’s that consume virtual memory), more investigation may be required.

Comments (0)