Using a Generic Text Log rule to monitor an ASCII text file – even when the file is a UNC path


There are several examples in blogs on how to create a generic text log rule to monitor a local text file (Unicode, ASCII, or UTF8).

This will be a step-by-step example of doing the same, but monitoring a log file on a remote UNC path instead of a local drive.  This is useful when we want to monitor a file or files on a NAS, or on a share that is hosted by a computer without an agent.

This is a bit unique… instead of applying this rule to ALL systems that might have a specific logfile present in a specific directory – we are going to target this rule to only ONE agent.  This agent will monitor the remote fileshare, similar to the concept of a “Watcher Node” for a synthetic transaction.  Therefore we will be creating this rule disabled, and enabling it only for our “Watcher”.

 

In the Ops console – select the Authoring pane > Rules. 

Right click Rules, and select Create a new rule.  We will choose the Generic Text Log for this example.

 


 

Choose the appropriate MP to save this new custom rule to, and click Next.

For this rule name – I will be using “Company Name – Monitor remote logfile rule”

Set the Rule Category to “Alert”

For the target – I like to use “Windows Server Operating System” for generic rules and monitors.

UNCHECK the box for “Rule is enabled”

 


 

Click Next.

 

The directory will be the UNC path.  Mine is “\\VS2\Software\Temp”

The pattern will be the logfile(s) you want to monitor.  We can use a specific file, such as “logfile.log” or a wildcard, such as “*.log”.

You should not check the “UTF8” box unless you know the logfile to be UTF8 encoded.
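If you want to preview which files a pattern would pick up before saving the rule, a quick script can help.  The following is only a sketch in Python, assuming the pattern behaves like an ordinary case-insensitive file wildcard.  The function name `files_matching` and the file names are made up for this example, and the agent's own matching may differ in edge cases.

```python
import fnmatch


def files_matching(file_names, pattern):
    """Return the names that match a wildcard pattern, ignoring case.

    Case-insensitive matching is an assumption of this sketch, chosen to
    mirror how Windows file names usually behave.
    """
    lowered = pattern.lower()
    return [name for name in file_names if fnmatch.fnmatchcase(name.lower(), lowered)]


# Invented file names standing in for the contents of the share.
names = ["logfile.log", "Archive.LOG", "readme.txt"]
print(files_matching(names, "*.log"))        # wildcard pattern
print(files_matching(names, "logfile.log"))  # one specific file
```

To try it against the real share, you could pass in `os.listdir(r"\\VS2\Software\Temp")` from the Watcher machine, run under an account that can read the share.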

 


 

Click Next.

On the event expression, click Insert for a new line.  Essentially – log file monitors look at each new line in a logfile as one object to read, and this is represented by “Params/Param[1]”.  This “Parameter 1” is the entire line in the logfile, and is the only value that is valid for this type of monitor – so just type/paste that in the box for Parameter Name.

Since we want to search the logfile line for a specific word, the Operator will be “Contains”.

For the value – this can be the word you are looking for in the line, that you want to alert on.  For my example, I will use the word “failed”.
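To make the idea concrete, here is a minimal sketch in Python of what the expression does conceptually: every new line is treated as one value (our “Parameter 1”) and tested for the word.  The function name `matching_lines` and the sample log lines are invented for illustration, and the case-sensitive comparison is an assumption of the sketch, so test how your own rule treats “FAILED” versus “failed”.

```python
def matching_lines(lines, value="failed"):
    """Return the lines that contain `value` as a plain substring.

    Each whole line plays the role of Params/Param[1]. The comparison is
    case-sensitive here, which is an assumption of this sketch.
    """
    return [line for line in lines if value in line]


# Invented sample lines, not taken from a real log.
sample = [
    "10:00:01 backup job started",
    "10:04:37 backup job failed",
    "10:05:00 retry scheduled",
]
hits = matching_lines(sample)
print(hits)
```

Only the second line would raise an alert in this sketch, because it is the only one containing the word.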

 


 

Click Next.

 

On the alert screen – we can customize the alert name if desired, set the severity and priority, and build a better Alert Description.  If you are using SP1 – the default alert description is blank.  If you are using R2 – the default alert description is “Event Description: $Data/EventDescription$”.  HOWEVER – this is an invalid event variable for this type of event (logfile), so we need to change that right away.  I keep a list of common alert description strings HERE

For this – I will recommend the following alert description.  Feel free to customize to make good sense out of your alert:

Logfile Directory : $Data/EventData/DataItem/LogFileDirectory$
Logfile name: $Data/EventData/DataItem/LogFileName$
String:  $Data/EventData/DataItem/Params/Param[1]$
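If it helps to picture what ends up in the alert, the sketch below fills in the three variables with Python.  It is an illustration only: `render_description` is a made-up helper, the sample values are invented, and leaving an unknown variable untouched is simply how this sketch behaves, not a statement about what the console displays.

```python
import re


def render_description(template, values):
    """Replace each $name$ placeholder with its value, if one is known."""
    def lookup(match):
        # Unknown placeholders are left as they are (sketch behaviour only).
        return values.get(match.group(1), match.group(0))

    return re.sub(r"\$([^$\s]+)\$", lookup, template)


template = "\n".join([
    "Logfile Directory : $Data/EventData/DataItem/LogFileDirectory$",
    "Logfile name: $Data/EventData/DataItem/LogFileName$",
    "String:  $Data/EventData/DataItem/Params/Param[1]$",
])

# Invented values for one matched line.
values = {
    "Data/EventData/DataItem/LogFileDirectory": r"\\VS2\Software\Temp",
    "Data/EventData/DataItem/LogFileName": "logfile.log",
    "Data/EventData/DataItem/Params/Param[1]": "backup job failed",
}

description = render_description(template, values)
print(description)
```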

Click “Create” to create the rule.

Find the rule you just created in the console – right click it and choose “Properties”.  On the Configuration tab, under responses (to the right of “Alert”) click Edit.

Click the “Alert Suppression” button.  You should consider adding alert suppression on specific fields of the alert, so that you do not get a separate alert for each match in the logfile.  If you don't, and the application writing the log ever floods it with lines containing “failed”, SCOM will generate one alert for each line written to the log.  This has the potential to flood the SCOM database/Console with alerts.  By setting alert suppression here, we will create one alert, and increment the repeat count for each subsequent matching line.  I am going to suppress on LoggingComputer and Parameter 1 for this example.  Because Parameter 1 is the entire line, a line that begins with a timestamp will never repeat exactly, so remove Parameter 1 from suppression if your log lines are timestamped.
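The effect of suppression can be sketched in a few lines of Python.  This is a conceptual model, not SCOM's implementation: `apply_suppression`, the computer name WATCHER01, and the sample lines are all invented, and starting the repeat count at 0 for a new alert is an assumption of the sketch.

```python
def apply_suppression(events, fields=("LoggingComputer", "Param1")):
    """Collapse events whose suppression fields are identical into one alert."""
    alerts = {}
    for event in events:
        key = tuple(event[name] for name in fields)
        if key in alerts:
            # Same field values as an existing alert: bump its repeat count.
            alerts[key]["RepeatCount"] += 1
        else:
            # First time this combination is seen: a new alert is created.
            alerts[key] = dict(event, RepeatCount=0)
    return list(alerts.values())


# Invented events: three identical lines, then one with a timestamp in front.
events = [
    {"LoggingComputer": "WATCHER01", "Param1": "backup job failed"},
    {"LoggingComputer": "WATCHER01", "Param1": "backup job failed"},
    {"LoggingComputer": "WATCHER01", "Param1": "backup job failed"},
    {"LoggingComputer": "WATCHER01", "Param1": "10:04:37 backup job failed"},
]
alerts = apply_suppression(events)
for alert in alerts:
    print(alert["Param1"], alert["RepeatCount"])
```

The three identical lines collapse into one alert, while the timestamped line becomes a second alert because its text differs.  Calling `apply_suppression(events, fields=("LoggingComputer",))` shows the other extreme: everything from the same computer folds into a single alert.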

 


 

Click OK several times to accept and save these changes to the rule.

 

Now – we created this rule as disabled – so we need to enable it via an override.  I will find the rule in the console – and override the rule “For a specific object of class:  Windows Server Operating System”

 


 

Now – pick one of these machines to be the “watcher” for the logfile in the remote share. 

**Note – the default agent action account will make the connection to the share and read the file.  In my case – the default agent action account is “Local System” so this will be the domain computer account of the “Watcher” agent which connects to the remote share and reads the file.  This account will need access to the share, folder, and files monitored.  Keep that in mind.

Set the override to “Enabled = True” and click OK.

 

At this point, our Watcher machine will download the management pack again with the newly created override, and apply the new config.  Once that is complete – it will begin monitoring this file.  You can create a log file in the share path, and then write a new line with the word “failed” in it.  You need a carriage return after writing the line for SCOM to pick up on the change.
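For testing, a small script avoids the trap of forgetting the line ending.  The sketch below, in Python, appends one ASCII line and always terminates it with a carriage return and line feed.  `append_test_line` is a made-up helper, and the demo writes to a temporary folder so it can run anywhere.

```python
import os
import tempfile


def append_test_line(log_path, text):
    """Append one ASCII line, always terminated with CR+LF."""
    # newline="\r\n" makes Python write "\n" as carriage return + line feed.
    with open(log_path, "a", encoding="ascii", newline="\r\n") as handle:
        handle.write(text + "\n")


# Demo against a temporary folder instead of the real share.
demo_log = os.path.join(tempfile.mkdtemp(), "logfile.log")
append_test_line(demo_log, "test entry: backup job failed")

with open(demo_log, "rb") as handle:
    written = handle.read()
print(written)
```

To test the rule itself, you would point `log_path` at a file in the monitored share, such as `r"\\VS2\Software\Temp\logfile.log"`.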

You should see a new alert pop in the console, based on matching the criteria.  Subsequent log file matches will only increment the repeat count.  Customize the alert suppression as it makes sense for you.

Then – create additional rules just like this – for different UNC paths.

 


Comments (43)

  1. Kevin Holman says:

    "One comment on Alert Suppression though.

    Does not Parameter 1 equal Params/Param[1] (the entire line)?

    If so, for alert suppression to work, the line would have to be exactly the same. A date or timestamp will prevent suppression from happening."

    ——————-

    Yes – Parameter 1 equals that – therefore – my example would suppress anytime the line that matched was identical.  Typically – this is correct.  If the line isn't identical – then it will be a different alert.  If that is not desirable – then remove Param 1.

    1. Pradeep Teotia says:

      Dear Kevin, I need your help with another issue, related to network discovery. Network monitoring discovered multiple entries for the same device IP. For example, it discovers 3 entries for IP 172.29.55.23. How can we restrict this?

  2. Kevin Holman says:

    Please post more details from the events you are getting – I don't know what those event IDs are.

  3. dasboot says:

    Is there a way you could set the monitor to alert after a specified number of the text values are detected? Or for a specified number of minutes? For example, a text log that has occurrences of "ORA=720" on lines 89, 95, and 105, but will not alert until the third occurrence of the text value on line 105. Or, two occurrences of "ORA=720" within a 5 minute period?

  4. Kevin Holman says:

    No – in my example – I used an agent using Local System, which is an "Authenticated User" and therefore had access to this share.  This specific share had share permissions of Everyone-FullControl, and NTFS permissions of Everyone-Read.

    If your share or NTFS permissions are more strict – then make sure you grant the computer account of the agent access to both share and NTFS, or run the agent under a domain user account, which has access to the share/NTFS.

  5. Kevin Holman says:

    Great questions.  However – you are missing the boat a bit in a few areas.

    This is probably ready for a blog post on its own – not a blog response.

    Basically – I will hit the high points:

    1.  Don't ever target Windows Computer.  That is very old and very bad advice.  Windows Computer is a bad target for monitoring.  For monitoring – target your workflows to the most specific existing target that meets your needs.  Such as:  Exchange 2003 Role, or DNS 2003 Server, or SQL DB engine, etc.

    2.  If you can't find a good existing class to target – then target a generic class with the workflow disabled, then enable it via overrides, for a custom group.  A good generic class for generic targeting is Windows Server Operating System.  That is my generic class of choice.  This is a good solution – but DOES have the problem of a laundry list of disabled monitors showing up in Health Explorer.

    3.  Create a new class for your targets.  You perceive this is difficult.  It isn't – it is super simple and IS THE RIGHT way to target.  It does require a bit more understanding of the product – but is easy.  See:  http://technet.microsoft.com/en-us/magazine/2008.11.targeting.aspx?pr=blog

  6. Kevin Holman says:

    Yes and No.

    A rule/monitor workflow MUST target a class.  Period.  End of story.

    However – we have two options here:

    1.  Target a generic class, like Windows Operating System, with the rule created disabled, then override it as enabled for my one specific object.  This is the example I used above.

    2.  Create a new class, using a WMI/Registry provider for example, and make only the one special computer I want a discovered instance of that class, then target that class (much more complicated).

  7. Anonymous says:

    <<<<3.  Create a new class for your targets.  You perceive this is difficult.  It isnt – it is super simple and IS THE RIGHT way to target.  It does require a bit more understanding of the product – but is easy.>>>>

    The article quoted is very good, but I’ll need to read it a few more times to make complete sense of it.

    If there is to be a blog about this, can I request a Part 2….If part 1 is Discovery, can part 2 be Reporting?

    Thx,

    John Bradshaw

  8. Rogerm says:

    Excellent example, thanks!

    One comment on Alert Supression though.

    Does not Parameter 1 equal Params/Param[1] (the entire line)?

    If so, for alert supression to work, the line would have to be exactly the same. A date or timestamp will prevent supression to happen.

    Regards

    Roger

  9. Lee Nicoll says:

    Nice work. Quick question. Does the SCOM Agent on the Watcher node need to be running under a Network account?

  10. Desmond says:

    Nice work. Just curious. Is it possible to set the Rule Target to specific machine/server rather than a class of machines?

  11. Richard says:

    Excellent post, although I am having one strange problem, the log reader works 100% and the alerts are correct, but the system seems to randomly alert off the same line in the log file over and over again, what could I have done wrong to get this happening?

  12. babu says:

    i got error for the path "C:SummitCfAdapterPRODSummitCfAdapter-LOH-PRODlogSummitCfAdapterMaster.log" Error opening log file directory Event ID’s 31705,31707

    Please help

  13. babu says:

    i got error for the path "C:SummitCfAdapterPRODSummitCfAdapter-LOH-PRODlogSummitCfAdapterMaster.log" Error opening log file directory Event ID’s 31705,31707

    Error description

    "Error opening log file directory

    Directory =

    C:SummitCfAdapterPRODSummitCfAdapter-LOH-PRODlog

    Error: 0x80070003

    Details: The system cannot find the path specified."

    but when I change path to "C:SummitCfAdapterPROD" it works fine.

    Is it because of the "-" in the file path?

    I have tried enclosing the path in double and single quotes also, but then got the error "The filename, directory name, or volume label syntax is incorrect."

  14. babu says:

    Thanks Kevin for your quick reply. I don't know how, but it is working now. I did not change anything, just restarted the health service, and it is working.

  15. VRKumar says:

    Thanks Kevin for the excellent example.

    I have a problem: I followed exactly the steps you mentioned, but it is not giving me any output.

    I tried the path 2 ways:

    \\localhost\d$\product

    &

    d:\product

    The file name is company.log

    Is there a way to find out if the rule is working or not?

    -VRKumar

  16. tom says:

    Thanks for this example.

    I have created the same rule, and the alert appears only when the log file is created, not when the log file changes. Is that normal?

    If yes, is there a way to raise an alert when a log file changes?

    Thank you for your response.

  17. ML49448 says:

    Kevin,

    I am using this rule in several situations, but lately I have run into a situation where this rule is not effective for log files that are recreated every day. The previous day’s file is renamed, and the application creates a new and empty log file.

    So if SCOM were to detect a string on line 100 one day, and then several days later, with a recreated file of the same name and path, the string appears on line 90, SCOM will not alert. It appears that this monitoring solution only works for files that grow, and not for log files that are created anew every day by the application on the server.

    Until I understand how to use a rule that avoids this limitation, I am going to be parsing the log file with a script and creating a counter file to track the number of error string detections that appear on a daily basis.

    I am interested to know if you have encountered this before and welcome any suggestion you have.

    Mike

  18. Phil says:

    I have been trying to find an example to collect events from a w3svc log file. In MOM 2005 you would select the IIS Application log provider, but I can’t find the equivalent in SCOM.

    I am aware that SCOM has the ability via the system.ApplicationLog.InternetLogEntryData library data type to read w3svc files, but I’m not sure how to create a rule using this.

    Any suggestions appreciated.

    Phil

  19. bret says:

    I have issues with SCOM event log and text log monitoring and was wondering if anyone had any input as I haven’t seen this addressed anywhere.

    Basically:

    Setting up monitoring as described above (and everywhere else on the net) seems to go against the "purpose" of SCOM monitoring and I’m wondering if all the trouble to set it up this way will cause issues down the road when one is monitoring hundreds of logs (and/or many log entries).  Let me explain….

    Most posts recommend this:

    Create an event log or text log monitor, target it to "Windows Computer", disable it, then enable it against a group or computer instance you want the monitoring to execute against.  Simple and effective….

    But:

    As soon as you load the pack in SCOM with the disabled monitor in it, the monitor shows up under "Windows Computer" on all servers (server classes) in SCOM’s health models. This is because you’ve targeted it against "Windows Computer".  If you watch a lot of application logs and event logs you have a lot of monitors which will show up as white circles on all health models under the computer class. In addition, you end up with many configuration overrides to enable the monitoring on xyz server or group that has to be maintained as you add/remove systems (using groups is better, but you still have to maintain a group of *some configuration*)…

    If you look how Microsoft designs their packs, you load the pack and the pack *discovers* the application and components it was designed to monitor on the servers that the component exists on.  THEN the log monitors kick in as xyz component exists on a box and abc log monitor specifically looks for the logs/strings that component is going to generate.  Done this way the health model is "clean" across all servers and there is no "maintenance" having to track all the places you’ve enabled/disabled the individual monitors.

    In short:

    SCOM is designed to discover application components, THEN watch logs (etc) once it discovers the component, as it’s driven by health models.

    My issue is:

    All tutorials don’t address this and instead use the methods in the above post.  I understand this is just to "get it done".  I am looking for any input as to if anyone knows of any detrimental effects of cluttering up the health models with (possibly hundreds of) "empty monitors" by using the "quick method" of log monitoring vs the "long and painful method" of discovering application components that all vendor packs use.

    For a single "disabled" monitor that was enabled on a single server to watch a log, I could see the agent/RMS/database needing to track all the "empty/not enabled" monitors created. Besides having to tell people "All those empty circles…well, it’s that way on purpose" it seems there might be a performance or storage hit keeping up with all the "junk" in the models.

    Question:

    Are there any issues with this, or am I just making up problems that don’t need to be solved?  I’ve found a few ways around this that are really ugly and don’t want to commit to using them as no one else seems to worry about this.

  20. Andrew says:

    I don’t allow the disable and override process. We do script based discovery using the filesystem object. I believe authormp.com had a tutorial on this, complete with examples.

  21. bret says:

    Thanks for the info (Link in #3 above-> checking it out right now)! Does having "a laundry list of disabled monitors showing up in Health Explorer" cause any problems anyone knows about (or, in theory could cause issues if taken to the extreme)?  I ask this because some vendor packs we’ve loaded do this…

  22. andyinsdca says:

    I’ve noticed something interesting about this log monitor: If there isn’t a carriage return at the end of the line, the monitor won’t pick it up. Using our sample monitor above, the log file has these 2 entries at the bottom:

    My disk has failed(CR)

    Another disk has failed (no CR)

    The first line will get picked up but the second one doesn’t. Most log writers will put a CR at the end, but I was doing manual testing (putting stuff in with notepad) and noticed this "bug"

  23. Wei Hao says:

    I created a couple of these monitors but am receiving: Application Log Module Failed Execution warning alerts:

    FindNextChangeNotification failed Directory ….. Error Code = 0x38 One or more workflows were affected by this. Workflow name …

    Will the monitor work in this case? How do I address this warning alert?

  24. ming says:

    Hi,

    This post is really useful. However, I am planning to use SCOM to monitor any changes in our DHCP lease file. Meaning I have to constantly log the IPs that are leased out, and when a new IP is leased out, I will log the new IP. Any idea how I can do that?

    This rule basically alerts on a parameter being detected, so I need to change the alerting from detecting a parameter to detecting a change in the parameter and logging it. Thanks.

  25. Rick says:

    Has a solution been found for using Params/Param[1] for alert suppression when the string includes a timestamp at the beginning?

  26. Adhokd says:

    I'm trying to monitor a log file which is on a NAS share. The scom action account has read only permissions on the log file but the agent is not able to read that file as it runs on local system. Shouldn't the "run as" option work over here when the SCOM infrastructure has been configured to make use of scom action account as a "run as" account for all the SCOM agents?

    Is there any way that the log file can be monitored with restricted permissions? Do I need to use the set action account utility on the agent?

    Adhok

  28. Murad says:

    Kevin,

    Can SCOM log-file monitor, monitor a text with a “variable” in it? For example – I need to monitor the following text “Global User 'xxxx' created successfully from Active Dir. Account” (where “xxxx” will be the newly created AD user account) in the name.log file can this be accomplished?

    thx

  29. Rafal says:

    Hi Kevin,

    I'm one of your biggest fans since I started my adventure with SCOM 2 years ago. Have you ever thought of writing a book? Ideally in CHM format already 🙂

    Yesterday I was on MS workshops regarding SCOM, they mentioned your blog and your name almost all the time 😀 – it means something!

    Question:

    Another great job, as always, with this monitor. I easily set up my log monitoring for the word ERROR based on your example, but I'm getting only one line in the alert description. Is there a way to get more lines in the alert description? Let's say 10 more? Sometimes one is not enough.

    Thanks for your great work. Keep going!

  30. satish says:

    Sorry for the long post, but I am not sure what I am doing wrong. Looking for assistance from Kevin on this.

    Log File Monitoring in SCOM:

    - I created a test Rule targeted against Windows Server Operating System to monitor the \\Test-SCOM-DV3\Logfiles folder.

    - This rule monitors all the Txt files (*.TXT) in the above said folder. Whenever SCOM detects the word “FATAL” a critical alert will be triggered.

    - This rule is disabled by default and enabled only against the “Watcher Node” (the system that does remote logfile monitoring).

    Scenarios I tested:

    - Created a logfile with the name ErrorLog.Txt and inserted the phrase “Testing FATAL Error”.

    - For the first line/word (FATAL) SCOM doesn’t alert; from the second line/FATAL error SCOM starts alerting.

    - Without closing the first alert in the SCOM console, if you insert the same phrase “Testing FATAL Error” a third time, the repeat count increases to 1.

    - If you insert a new phrase like “Fatal Error Testing” a new alert will be generated. Up to here things work as expected.

    First Scenario:

    - We rename the logfile from ErrorLog.Txt to ErrorLog.TXT20120118 and create a new logfile with the name ErrorLog.TXT.

    - When we insert the same phrase “Testing FATAL Error” twice, we get the new alert with a repeat count of 3. The monitor is looking in all the log files, even though the name of the first logfile has changed, so the repeat count increases for all the similar phrases.  This should not work this way.

    Second Scenario:

    - Modified the Rule to monitor a specific logfile by changing the pattern from *.TXT to ErrorLog.TXT.

    - Insert the word/phrase “Testing FATAL Error” and you get the alerts normally.

    - Once you rename the text file from ErrorLog.Txt to ErrorLog.TXT20120118, create a new file with the same name ErrorLog.Txt, and insert the word/phrase “Testing FATAL Error”, we stop getting alerts.  When I diagnosed this, I saw an information event on the watcher node stating that the monitored logfile disappeared from the specified location.

  31. Vijayh says:

    Hello Kevin,

    I got below error in event viewer

    Error opening log file directory

    Directory =

    “D:Program Files (x86)Quest SoftwareQCVDSR6.0.3confsMRO-389logs"

    Error: 0x8007007b

    Details: The filename, directory name, or volume label syntax is incorrect.

    One or more workflows were affected by this.  

    Workflow name: UIGeneratedMonitor642fcc2492734d1bbcf373a7b64785f1

    Instance name: Microsoft Windows Server 2008 R2 Enterprise  

    Instance ID: {18469874-BBD2-A085-0744-9EC5DC7B2D5A}

  32. vijayh says:

    One more piece of information: my log file name would be operation_dumper.log-yyyymmdd.log, so in the pattern I gave operation_dumper.log.* and I am getting the same 31705 error in event viewer.

  33. sureshb says:

    My requirement for log file monitoring:

    Log file location: D:\Program files (x86)\company name\logs

    Log file name: verigy_name.log.yyyymmdd.log

    I've created a log file monitor with the directory in double quotes, due to the space in Program files: "D:\Program files (x86)\company name\logs"

    Pattern: verigy_name.log.*.log

    I created overrides for specific servers, but I didn't receive any alert, and got an error in event viewer with event ID 31705: error opening log file directory, the file name, volume label name syntax is incorrect.

    Please help me to fix this.

  34. shahid says:

    I have followed your instructions and created a rule but if I select ConfigMgr Pri Site Server as Target the rule doesn’t work can you please help me to make it work.

  35. Hi,

    Can anyone give information about Params/Param[1]?

  36. NHALD says:

    Hello Kevin,

    I would just like to know if you have ever tried to monitor text log files on different server locations?

    Ex. The requirement is to monitor the path D:SampleTestlog.txt.
    This path is present on 100+ servers and should be monitored by an MP. And I know it’s very illogical to create 100+ rules just to point to 100+ different server locations.

    Hope you could help me. Thanks in advance!

  37. Anonymous says:

    In this post I’ll talk about IIS log file monitoring. Log file monitoring and event collections rules

  38. Odge says:

    The timestamping in some log files is a nuisance. It means the Rule continues to alert on old matches, as individual alerts, until the log is cleared, which isn’t always practical.

    Also, all the time your matched entry exists, the repeat count will just continue rising until, again, the line is removed from the log file.

    Any way round this without using a monitor (which I believe remembers where it was in the log, the last time it triggered)? The monitor version is ok, but requires you to manually reset the monitor to continue, and if, in that time, more matches have occurred, it will change state for those immediately after until you clear past them by multiple Health resets.

  39. Snigdha says:

    Is it possible to modify the log file rule with respect to time, so that unnecessary alerts could be minimised?

  40. Lyndog says:

    Not quite the right forum so I apologize but this is driving me insane. Have set up logfile monitoring for a very simple log written by powershell which monitors the number of new SCOM alerts. The logfile seems to be working fine but the monitor not so. Have tried both as a CSV and a simple log file but whatever I do the monitor always reports as healthy. I’ve even tried reversing the criteria conditions but still always healthy.
    If the line contains Number,0 it is healthy. Any other number is unhealthy.
    Is there any logging you can use that shows how it is evaluating the condition?

  41. surabhi says:

    Hi Kevin,

    Is there any way of disabling alerting for a particular type of error log entry, while still getting alerts for the rest of the error log entries?

  42. surabhi says:

    Hi Kevin,
    We need to monitor error entries in a particular log file, and we need to exclude certain error entries from being monitored. How can we exclude those entries from being monitored?
