Parsing Text Logs with Message Analyzer


One of Message Analyzer’s strengths is the ability to import multiple traces and text logs at the same time. This enables you to correlate message data by time and other factors, such as process ID, or by virtually anything else you can think of with the use of Unions. This correlation capability is a great way to gain insight from multiple related data sources, for instance a System Event Log (EVTX) and a network trace file in .cap, .pcap, .pcapng, .etl, or another format, as described in Locating Supported Input Data File Types from the Message Analyzer Operating Guide on TechNet. This blog describes how Message Analyzer parses common text log formats with predefined OPN configuration definitions and, at a high level, how to create your own OPN definitions to parse custom text log formats. When you are ready, you can dive into the full reference for creating custom log file parser definitions by downloading the OPN Configuration Guide for Text Log Adapter document.

Using the Built-in Text Log Parser Definitions

Message Analyzer provides several built-in Text Log Configuration definitions for parsing common types of text logs. The list of parsers continues to grow as we create and publish new ones. The predefined text log parsing definitions are contained in the Device and Log File asset collection, which Microsoft can sync with your installation through the Message Analyzer feed on the Start Page. To always receive the latest updates to this collection, you must auto-sync the collection as described in Managing Item Collection Downloads and Updates.

Specifying a Text Log Configuration File

When you use the Quick Open feature or Windows Explorer to open a text log file that Message Analyzer cannot identify by its extension, you are presented with the dialog that follows, provided that you have not already specified a default configuration file in the Options dialog from the File menu:

[Image: the New Session dialog, showing the Text Log Configuration drop-down list]

You might recognize this as the dialog you use when configuring a Data Retrieval Session, which is accessible from the File menu by clicking New Session and selecting the Files submenu item. The same dialog also opens when Message Analyzer opens a .log file, which is a supported type, if no default configuration file is specified. In both cases, this behavior enables you to select the parser you want to use from the Text Log Configuration drop-down list before Message Analyzer loads the data.

The procedure in Load Saved Data with the Quick Open Feature is an example of how Message Analyzer treats .log files from which you load data with the Quick Open feature. A general procedure for loading data into Message Analyzer is specified in Load and Display Saved Data.

Keep in mind that you can specify one Text Log Configuration file to be the default parser for all text logs by selecting it in the global Options dialog. This is useful and expedient if you typically open a particular type of log on a regular basis.

More Information

To learn more about how to use the New Session dialog to retrieve data, see Retrieving Message Data.

To learn more about the supported input file types, see Locating Supported Input Data File Types.

To learn more about the predefined text log configuration files, see Parsing Log Files.

Extending Message Analyzer Parsing Capabilities

Text log parsing is extensible, like all other parsing with Message Analyzer and Open Protocol Notation (OPN). Keep in mind that parsing for CSV and TSV files is provided with Message Analyzer by default, so it is unnecessary to create parsers for those. However, if you want more depth and control, you can parse a CSV file even further by adding data fields that are not broken out by commas or tabs. For example, a common problem is that timestamp formats vary widely, so you might want to convert the date-time format so that Message Analyzer can parse it as a proper timestamp. The LYNC.config parser in the following default location contains a good example of converting the LyncDateTime to the timestamp format that supports Message Analyzer functions such as sorting and time shifts:

%localappdata%\Microsoft\MessageAnalyzer\OpnAndConfiguration\TextLogConfiguration\DevicesAndLogs\
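To see why a sortable timestamp matters, here is a minimal sketch in Python. It is not how LYNC.config performs the conversion (that happens in the parser definition itself), and the source timestamp format shown is hypothetical, invented only for this illustration:

```python
from datetime import datetime

# Hypothetical source format, invented for this sketch: 06/01/2015-12:34:56.789
SOURCE_FORMAT = "%m/%d/%Y-%H:%M:%S.%f"

def to_iso(raw):
    """Convert a timestamp in SOURCE_FORMAT to an ISO 8601 string."""
    parsed = datetime.strptime(raw, SOURCE_FORMAT)
    return parsed.isoformat(timespec="milliseconds")

raw_stamps = ["06/01/2015-12:34:56.789", "12/31/2014-23:59:59.000"]

# As plain text, "06/..." sorts before "12/..." even though it is later in time.
# After conversion, text order and time order agree.
normalized = sorted(to_iso(s) for s in raw_stamps)
print(normalized)
```

Once every entry carries a timestamp in one consistent format, operations such as sorting and shifting by a time offset become straightforward.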

In addition to simple CSV/TSV parsing jobs, you have the ability to parse almost anything. This includes text log files that contain multiline entries with very loose formatting. Your success and subsequent performance always depend somewhat on the log file format that a particular developer chooses to use, but fortunately developers tend to be the logical, OCD types.

Matching Data Fields with Regex

At a high level, you parse your text logs by creating a regular expression (Regex) that maps to fields in the text log. For each unique type of entry in your log file, you will need to create a Regex that provides a case-sensitive string match to the fields in such an entry. After the file is parsed, you can view these fields in the Details Tool Window. After Message Analyzer parses your log, you can use the filtering, charting, and grouping capabilities of Message Analyzer when analyzing the data, which provides an enormous amount of control compared to a normal text editor.
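As a rough illustration of mapping one log entry to fields, the following sketch uses Python's re module rather than Message Analyzer itself. The log format and the field names Timestamp, Level, and Text are hypothetical, invented for this example. It relies on named groups, which are explained later in this post; note that Python spells them (?P<Name>...) where .NET uses (?<Name>...).

```python
import re

# Hypothetical log format, invented for illustration:
#   2015-06-01 12:34:56 [ERROR] Disk quota exceeded
entry_pattern = re.compile(
    r"^(?P<Timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<Level>[A-Z]+)\] "
    r"(?P<Text>.*)$"
)

def parse_entry(line):
    """Return a dict of field name -> text, or None if the line does not match."""
    m = entry_pattern.match(line)
    return m.groupdict() if m else None

print(parse_entry("2015-06-01 12:34:56 [ERROR] Disk quota exceeded"))
# The match is case-sensitive, so a lowercase level does not match [A-Z]+.
print(parse_entry("2015-06-01 12:34:56 [error] Disk quota exceeded"))
```

Each named group becomes one field, which is the same idea behind the fields you see in the Details Tool Window.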

Learning Regex is no simple feat, although if you like coding or solving puzzles, it can be a fun thing to do. Note that Regex has some performance impact that you may want to consider as you write expressions.

At its simplest, Regex provides a way to match a pattern in a string. For instance, “AB+C” matches a pattern that can be described as: A followed by one or more B’s, followed by a C (ABBBC and ABBC are matches, but AC is not). But Regex is rich with different tricks and optimizations, and the Regular Expression Language – Quick Reference is one page I have open when I’m creating expressions.
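The AB+C example can be checked quickly with a short script. This sketch uses Python's built-in re module rather than the .NET engine; for a pattern this simple, the two behave the same way. The extra sample ABC is added here to show the minimum of one B.

```python
import re

# "AB+C": an A, then one or more B's, then a C.
pattern = re.compile(r"AB+C")

samples = ["ABBBC", "ABBC", "ABC", "AC"]

# fullmatch requires the whole string to fit the pattern.
results = {s: bool(pattern.fullmatch(s)) for s in samples}

for s, matched in results.items():
    print(f"{s}: {'match' if matched else 'no match'}")
```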

For Message Analyzer’s purpose, there’s a key piece of the Regex language we use called Grouping Constructs. In particular, we use the one that is described as “Captures the matched subexpression into a named group.” In the following pattern, the (?<FieldName> ... ) construct populates the specified field name (the named group) with the data that the enclosed subexpression matches:

(?<FieldName>subexpression)

Let’s illustrate this by extending our example to “A2015B”. Instead of B’s in the middle, we’ll try to capture any string of digits, such as a year:

A(?<Year>\d+)B

The A and B are just anchor points in this example, rather than field names. The \d is a character class that stands for a decimal digit. There are also other useful character classes for Hex, ASCII, and others, as described in the previously indicated Regex Quick Reference.

The + sign is a quantifier, which in our example means to match one or more decimal digits. That way we can accept decimal values of any length, such as 2, 20, and 20156789. There are many other ways to describe a match, but this is among the most basic examples.
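Here is a small sketch of the named-group capture, again in Python rather than .NET. One difference to be aware of: Python's re module writes the named group as (?P<Year>\d+), whereas .NET and Message Analyzer use (?<Year>\d+). The helper name capture_year is made up for this example.

```python
import re

# .NET (and Message Analyzer) write the named group as (?<Year>\d+);
# Python's re module spells the same construct (?P<Year>\d+).
pattern = re.compile(r"A(?P<Year>\d+)B")

def capture_year(text):
    """Return the digits captured by the Year group, or None if no match."""
    m = pattern.search(text)
    return m.group("Year") if m else None

# \d+ needs at least one digit, so "AB" does not match.
for sample in ["A2015B", "A2B", "A20156789B", "AB"]:
    print(sample, "->", capture_year(sample))
```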

Another great site that I have open while developing in Regex is Derek Slager’s .NET Regular Expression Tester. You can quickly enter A2015B as the Source and the Regex we created above as the Pattern and see the results more dynamically:

[Image: the Regex tester results for the Source A2015B, showing 2015 captured in the Year group]

The results of this example indicate that there is one group. I could have entered multiple lines in the Source, however, which would have produced multiple groups in the results. The result is 2015(Year), because <Year> is the field name that was specified in our pattern.
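If you do not have a tester page handy, a multi-line source can be tried with a few lines of script. This is a sketch in Python (with its (?P<Year>...) spelling of the named group), not the tester's own implementation, and the sample lines are invented:

```python
import re

pattern = re.compile(r"A(?P<Year>\d+)B")

# Several lines of source text; only lines that fit the pattern yield a match.
source = "A2015B\nA1999B\nno match here\nA7B"

# finditer walks through every match in the source, in order.
years = [m.group("Year") for m in pattern.finditer(source)]
print(years)
```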

Building a .config File to Parse a Text Log

So now that you have a very basic regular expression, let’s build a parser that takes this A2015B string as input. The next step is to define a message that uses this Regex. With Message Analyzer, everything is a message, which is a basic construct of OPN. For text log messages, you must inherit from a base message type called LogEntry, although you can have any number of derived classes in your inheritance chain. You must also have an EntryInfo aspect, which enables you to define the Regex statement, Priority field, and so on; but for now, only the Regex parameter will be used.

message ExampleMessage
    with EntryInfo { Regex = @"A(?<Year>\d+)B" }
    : LogEntry
{
    // Populated from the named group <Year> in the Regex above.
    int Year;
}
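To make the relationship between the Regex and the message field concrete, here is a loose analogy in Python. It is not how Message Analyzer is implemented; it only mirrors the idea that the named group Year fills a typed field of the same name. The function parse_line and its use of search are assumptions made for this sketch.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class ExampleMessage:
    # Mirrors "int Year;" in the OPN message above.
    Year: int

# Python spelling of the named group; the OPN definition uses (?<Year>\d+).
ENTRY_REGEX = re.compile(r"A(?P<Year>\d+)B")

def parse_line(line: str) -> Optional[ExampleMessage]:
    """Build an ExampleMessage from a line, or return None if it does not match."""
    m = ENTRY_REGEX.search(line)
    if m is None:
        return None
    # The captured text is converted to the declared field type.
    return ExampleMessage(Year=int(m.group("Year")))

print(parse_line("A2015B"))
print(parse_line("unrelated text"))
```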


Next, save the above code in a file named Test.config in the path below, or in any subdirectory beneath it:

%localappdata%\Microsoft\MessageAnalyzer\OpnAndConfiguration\TextLogConfiguration\

You can also save the string “A2015B” in a log file named Test.log and place it in a suitable location that you can navigate to through the Add Files feature in the New Session dialog for a Data Retrieval Session. Message Analyzer will build the parser definition on the next restart. Then, when you open the Test.log file through Add Files, you will notice a new item named Test in the Text Log Configuration drop-down list below the toolbar on the Files tab in the New Session dialog, which is based on the file named Test.config that you just created.

Putting it All Together

Now that you have the basics under your belt, you can review the OPN Configuration Guide for Text Log Adapter document for an even richer description of all the options. This document contains an example .config and .log file that you can experiment with, along with many more details. Note that your knowledge of Regex can also help you write Message Analyzer filters (see the RegEx section in the Operating Guide). As always, if you have questions, the Message Analyzer forums are a great resource to get more help.

