Dealing with Too Much Data

Networks are fast. Way fast! Sniffing for just a few seconds can generate a lot of data, and that volume is not trending down at all. While we will continue to optimize Message Analyzer so it works with larger data sets more effectively, the truth is that optimization alone can’t keep up with the pace. We need another strategy for working with huge data sets.

Browse, Select, View

One feature we have developed to help you tackle this problem is Browse/Select/View. The idea is that you start by browsing the data at a high level and then narrow down what you load with a time range and a selection filter. You can still get to this option after a large trace has started loading by clicking the Configuration button in the ribbon.

The thinking is that you can often eliminate traffic you are not interested in and reduce your memory footprint. Often with large data sets, you know the time frame in which a problem might have occurred, or you know that traffic on a certain port is more interesting than the rest. So why bother loading everything else if it’s only going to eat up memory and affect performance?

Importing Your Data

For the example below, I’ve already opened a capture file and pressed the Configuration button to return to the Browse tab. But you can also start a new session by navigating to the Browse tab first.

Also, keep in mind that if you already have a session open, whether you got there by navigating to Browse or by using Quick Open on a .cap file as I did, you can create a new session by clicking the New Browse Session button.

Once you have a session open, you can add more traces, which can include log files, captures, ETW traces, and so on. Where we can, we scan the files for start and end times and message counts, and then update the time line control and the total message count. There are some limitations here, because not all files can be scanned. However, you can always type in a start and stop time manually.
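To give a feel for why a scan like this can be cheap, here is a minimal Python sketch of a header-only pass over a capture file. It is not Message Analyzer's code. It assumes the classic libpcap file layout (a 24-byte global header followed by 16-byte record headers), which is not necessarily the format of the files discussed in this post, and the name scan_pcap is made up for the illustration.

```python
import io
import struct
from typing import BinaryIO, Optional, Tuple

PCAP_GLOBAL_HEADER_LEN = 24   # classic libpcap global header size
PCAP_RECORD_HEADER_LEN = 16   # ts_sec, ts_usec, incl_len, orig_len


def scan_pcap(stream: BinaryIO) -> Tuple[Optional[float], Optional[float], int]:
    """Return (earliest_timestamp, latest_timestamp, record_count).

    Hypothetical helper for illustration only. It reads the fixed-size
    headers and seeks past every payload, so nothing is parsed or kept.
    """
    header = stream.read(PCAP_GLOBAL_HEADER_LEN)
    if len(header) < PCAP_GLOBAL_HEADER_LEN:
        raise ValueError("global header is truncated")

    # The magic number 0xa1b2c3d4 is written in the capturing host's byte order.
    magic = header[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        # Mirrors the post: not every file can be scanned.
        raise ValueError("not a classic microsecond pcap file")

    earliest = latest = None
    count = 0
    while True:
        record_header = stream.read(PCAP_RECORD_HEADER_LEN)
        if len(record_header) < PCAP_RECORD_HEADER_LEN:
            break  # end of file, or a truncated trailing record
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", record_header
        )
        timestamp = ts_sec + ts_usec / 1_000_000
        earliest = timestamp if earliest is None else min(earliest, timestamp)
        latest = timestamp if latest is None else max(latest, timestamp)
        count += 1
        stream.seek(incl_len, io.SEEK_CUR)  # skip the payload without reading it
    return earliest, latest, count
```

Calling scan_pcap on a file opened in binary mode returns the two timestamps and the record count, which is the information a time line control and a message total would need. A file that raises ValueError corresponds to the case where you would type the times in by hand.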

Narrowing Things Down

Once you have identified all the sources you are interested in, you can select the data you want to focus on. You can enable start/stop time and then slide the time line to focus on a specific time period, or type in a time manually. Given that you could have days of traffic, this can drastically reduce your memory footprint.

You can also apply a selection filter. For instance, maybe you are only interested in HTTP traffic. This does require that we parse the data, which means it could take a while. But once we do, the resulting footprint in memory will be much smaller.
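The difference between the two kinds of narrowing can be sketched in Python as a lazy pipeline. This is an illustration under assumptions, not how Message Analyzer is implemented: RawRecord, select, and looks_like_http are hypothetical names, and the HTTP check is a crude stand-in for real protocol parsing.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional


@dataclass
class RawRecord:
    """Hypothetical unparsed message: a timestamp plus raw bytes."""
    timestamp: float
    payload: bytes


def select(
    records: Iterable[RawRecord],
    start: Optional[float] = None,
    stop: Optional[float] = None,
    selection_filter: Optional[Callable[[RawRecord], bool]] = None,
) -> Iterator[RawRecord]:
    """Yield only the records that pass the time window and the filter.

    The time check needs nothing but the timestamp, so it runs first.
    The selection filter has to inspect the payload, so it runs only on
    records that survived the time check.
    """
    for record in records:
        if start is not None and record.timestamp < start:
            continue
        if stop is not None and record.timestamp > stop:
            continue
        if selection_filter is not None and not selection_filter(record):
            continue
        yield record


def looks_like_http(record: RawRecord) -> bool:
    # Stand-in for parsing: a real filter would decode the protocol stack.
    return record.payload.startswith((b"GET ", b"POST ", b"HTTP/"))
```

Because select is a generator, records that fail either test are never collected, so only the matches occupy memory. For example, list(select(records, start=10, stop=20, selection_filter=looks_like_http)) keeps only the HTTP-looking records inside that window.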

Going Back to the Well

Now after looking at your data, you might decide that you need to focus on a different section or change the selection filter. You can return to the Browse page or select the “Configuration” button from the ribbon. You should notice this Info Bar.

It’s letting you know that if you add files, we’ll have to reprocess those new files. But you can Undo Changes at any time.

If you decide to change the time line or the selection filter, the Info Bar updates to tell you that we’ll have to reprocess all the data again.
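One way to read the two Info Bar messages is as a simple rule about what needs reprocessing. The Python sketch below captures that rule as I understand it from the behavior described here. SessionConfig and files_to_reprocess are hypothetical names, and the real product may decide this differently.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Set, Tuple


@dataclass(frozen=True)
class SessionConfig:
    """Hypothetical snapshot of a Browse session's settings."""
    files: FrozenSet[str]
    time_range: Optional[Tuple[float, float]] = None
    selection_filter: Optional[str] = None


def files_to_reprocess(old: SessionConfig, new: SessionConfig) -> Set[str]:
    """Return the files that must be processed again after a change.

    Assumed rule: a changed time range or selection filter invalidates
    everything, while added files only need the new files processed.
    """
    filters_changed = (
        old.time_range != new.time_range
        or old.selection_filter != new.selection_filter
    )
    if filters_changed:
        return set(new.files)
    return set(new.files - old.files)
```

Undoing changes then amounts to going back to the old configuration, for which nothing needs to be reprocessed.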

Slaying Large Data

With these tools for handling large data sets, we hope to provide a better experience in terms of performance and usability. We plan to build on this idea and hope to add other mechanisms that quickly index the information, so that you can scan things at a high level without incurring the expense of fully parsing the data.

More Information

To learn more about the Browse-Select-View (BSV) model, how to configure a Browse Session to locate and import input data, and how to create a Time Filter and/or Selection Filter to limit the amount of data you retrieve, see the Message Analyzer Operating Guide on TechNet.

Comments (5)

  1. Paul E Long says:

    That depends on whether the ETL files in question are manifest based. The best way to test in general is just to try loading the data where Lync is installed and see. If the resulting message modules show as ETL, that means they are WPP or MOF based, which we can't parse.

    For Lync, my guess is they are WPP based, which means you'd need private symbols anyways. Based on some searches, the advice everywhere is to send the files to Microsoft, which is what leads me to my conclusion.

  2. Paul E Long says:

    We have an API with which you could do what you are thinking of. However, we have not yet put any energy into making the API public. We did try to layer the API carefully, so that the UI you see is built on top of it. So in the future this should be possible.

  3. darwin says:

    Hi Paul

    Thanks for this.

    I am very excited about the launch of Message Analyzer and the new possibilities.

    One question:

    The Lync client generates some ETL files if the logging option is enabled. Is it possible to open these files with Message Analyzer to get some diagnostic information?

  4. Anonymous says:

    When learning a new program, it’s often helpful to have a high level view of the various pieces and parts

  5. Bob Mazzo says:

    Is it possible to integrate the data into our own website? Say, import your analyzer files into a DB table or some other format, then display the data within an admin page on our website.
    Or would there be an API we could use to consume the data for display on our own website?
    Thank you.
