Expert Commentary: 2012 Scripting Games Advanced Event 6

Summary: Microsoft senior software engineer on the Windows PowerShell team, Lee Holmes, provides expert commentary for 2012 Scripting Games Advanced Event 6.

Microsoft Scripting Guy, Ed Wilson, is here. Lee Holmes is the expert commentator for Advanced Event 6.

Photo of Lee Holmes

Lee is a senior software engineer on the Microsoft Windows PowerShell team, and he has been an authoritative source of information about Windows PowerShell since its earliest betas. He is the author of the Windows PowerShell Cookbook, Windows PowerShell Pocket Reference, and the Windows PowerShell Quick Reference. 
 
Blog: Precision Computing 
Twitter: http://www.twitter.com/Lee_Holmes 
LinkedIn: http://www.linkedin.com/pub/lee-holmes/1/709/383

The script for the 2012 Scripting Games Advanced Event 6 is pretty descriptive, so rather than go over it again line-by-line, I thought it’d be helpful to talk about two of the main ideas that went into creating the script: jobs and streaming.

Networking is slow, parallel jobs are not

When running a large network-bound operation (such as retrieving the Win32_OperatingSystem class from a long list of computers), your computer spends the vast majority of its time waiting on the network: waiting for the connection, waiting for the computer to respond, and waiting for the data to get back.

To address this problem, Windows PowerShell 2.0 introduced the concept of “jobs,” primarily in remoting, WMI, and eventing. When you assign a multimachine task to a job, Windows PowerShell distributes your commands among many worker threads, which it then runs in parallel. Each worker thread processes one computer at a time. As each child job completes (for example, a remote WMI query against a specific computer), Windows PowerShell feeds that worker thread a command for another computer. By default, Windows PowerShell launches 32 child jobs in parallel.
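For example, here is a minimal sketch of that pattern (the computer names are placeholders): the -AsJob parameter turns the WMI query into a background job, and -ThrottleLimit controls how many child jobs run at once (32 by default).

## Placeholder computer names - replace with your own
$computers = "SERVER01","SERVER02","SERVER03"

## Launch the query as a background job; up to 32 connections in parallel
$job = Get-WmiObject Win32_OperatingSystem -ComputerName $computers `
    -AsJob -ThrottleLimit 32

## Wait for the job to finish, then collect the results from its child jobs
Wait-Job $job | Receive-Job | Select-Object PSComputerName, LastBootUpTime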

What’s most amazing about Windows PowerShell jobs running 32 tasks in parallel is not that it makes things 32 times faster. It’s that it makes them even faster than that! The reason is a branch of computer science called “queueing theory.” Here’s a summary for the busy admin…

As a thought experiment, consider running a query against 92 computers one-by-one. Also, imagine that your first and second computers are rebooting, and they take a minute before failing to return their response. The other 90 return their responses in two seconds each.

The total time for that query is five minutes: 2 × 60 seconds for the rebooting computers, plus 90 × 2 seconds for the rest, or 300 seconds in all. That works out to an average of about 3.3 seconds per successful computer. This is the same problem that happens when a bunch of hungry shoppers get stuck behind the crazy person paying with 100 coupons at the grocery store: every delay impacts everybody in line.

Now, consider the impact of parallel jobs in Windows PowerShell. Those first two rebooting computers tie up two of our available worker threads for an entire minute, but we still have 30 more to process the remaining 90 computers. Those 30 threads sweep through the responsive computers in three waves of two seconds each, so the successful results are all in after about six seconds, giving an average time of about 0.07 seconds per successful computer. That’s 50 times faster than processing one computer at a time! Surprising, but incredibly cool. In the grocery store, this is like waiting in a single line, but having 32 cashiers serving from it.
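The arithmetic behind those two scenarios is easy to check in Windows PowerShell itself:

## Serial: two 60-second timeouts, then ninety 2-second queries
$serialTotal = (2 * 60) + (90 * 2)    ## 300 seconds - five minutes
$serialTotal / 90                     ## ~3.3 seconds per successful computer

## Parallel: 30 free worker threads sweep 90 computers,
## three waves of 2 seconds each
$parallelTotal = [Math]::Ceiling(90 / 30) * 2   ## 6 seconds
$parallelTotal / 90                             ## ~0.07 seconds per successful computer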

Streaming is important

Given that this is likely to be a long-running script, you’re going to want to keep your results as dynamic as possible.

As the first step toward that, this script makes good use of the Verbose and Progress streams. As the script processes computers, it emits a progress message telling you which one it’s working on. If you specify the -Verbose parameter, you get even more detail: specifically, the uptime information as it is written to the CSV.

This approach solves a common problem that I see: scripts that force their verbose or debugging information on the end user. The ultimate example of this sin is aggressive use of the Write-Host cmdlet. When you run this kind of script, you get reams and reams of text on screen, with no way to silence it. You can’t tell good information from bad, and other scripts can’t use it without this internal debugging information spewing all over the screen.
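Here is a minimal illustration of the difference (the function name is made up): a function that writes its diagnostics to the verbose stream stays quiet by default, but the caller can ask for the detail on demand.

function Get-ServerData
{
    [CmdletBinding()]
    param()

    ## Goes to the verbose stream: hidden by default,
    ## shown when the caller specifies -Verbose
    Write-Verbose "Connecting to the server..."

    ## The actual output, usable by other scripts
    "data"
}

Get-ServerData            ## emits only the output
Get-ServerData -Verbose   ## also displays the VERBOSE message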

In addition to streaming progress, the script also streams its output. Although it would be easiest to collect all of the job output into a variable and then dump it to the CSV, all of your data is lost if you ever cancel the script while it is executing. When your script streams its output, you can easily monitor the results as they are received with this simple command:

Get-Content 20120409_Uptime.csv -Wait

If your query takes an hour to complete, it’s nice to be able to check its progress before then.

Jobs and streaming—two useful techniques to maximize the efficiency of long-running tasks. Now, here is my solution for Advanced Event 6:

##############################################################################
##
## Get-DistributedUptime
##
##############################################################################

<#

.SYNOPSIS

Retrieves the uptime information (as of 8:00 AM local time) for the list of
computers defined in the $computers variable. Output is stored in a
date-stamped CSV file in the "My Documents" folder, with a name ending in
"_Uptime.csv".

.EXAMPLE

Get-DistributedUptime

#>

param(
    ## Overwrites the output file, if it exists
    [Parameter()]
    [Switch] $Force
)

## Set up common configuration options and constants
$reportStart = Get-Date -Hour 8 -Minute 0 -Second 0
$outputPath = Join-Path ([Environment]::GetFolderPath("MyDocuments")) `
    ("{0:yyyyMMdd}_Uptime.csv" -f $reportStart)

## See if the file exists. If it does (and the user has not specified -Force),
## then exit because the script has already been run today.
if(Test-Path $outputPath)
{
    if(-not $Force)
    {
        Write-Verbose "$outputPath already exists. Exiting"
        return
    }
    else
    {
        Remove-Item $outputPath
    }
}

## Get the list of computers. If desired, this list could be read from
## a text file as well:
## $computers = Get-Content computers.txt
$computers = "EDLT1","EDLT2","EDLT3","EDLT4"

## Start the job to process all of the computers. This makes 32
## connections at a time, by default.
$j = Get-WmiObject Win32_OperatingSystem -ComputerName $computers -AsJob

## While the job is running, process its output
do
{
    ## Wait for some output, then retrieve the new output
    $output = @(Wait-Job $j | Receive-Job)

    foreach($result in $output)
    {
        ## We got a result, start processing it
        Write-Progress -Activity "Processing" -Status $result.PSComputerName

        ## Convert the DMTF date to a .NET DateTime
        $lastBootUpTime = $result.ConvertToDateTime($result.LastBootUpTime)

        ## Subtract the time the report run started. If the system
        ## booted after the report started, ignore that for today.
        $uptimeUntilReportStart = $reportStart - $lastBootUpTime
        if($uptimeUntilReportStart -lt 0)
        {
            $uptimeUntilReportStart = New-TimeSpan
        }

        ## Generate the output object that we're about to put
        ## into the CSV. Add a call to Select-Object at the end
        ## so that we can ensure the column order.
        $outputObject = New-Object PSObject -Property @{
            ComputerName = $result.PSComputerName;
            Days = $uptimeUntilReportStart.Days;
            Hours = $uptimeUntilReportStart.Hours;
            Minutes = $uptimeUntilReportStart.Minutes;
            Seconds = $uptimeUntilReportStart.Seconds;
            Date = "{0:M/dd/yyyy}" -f $reportStart
        } | Select-Object ComputerName, Days, Hours, Minutes, Seconds, Date

        Write-Verbose $outputObject

        ## Append it to the CSV. If the CSV doesn't exist, create it and
        ## PowerShell will create the header as well.
        if(-not (Test-Path $outputPath))
        {
            $outputObject | Export-Csv $outputPath -NoTypeInformation
        }
        else
        {
            ## Otherwise, just append the data to the file. Lines
            ## zero and one that we are skipping are the type
            ## information and the header.
            ($outputObject | ConvertTo-Csv)[2]  >> $outputPath
        }
    }
} while($output)

~Lee

2012 Scripting Games Guest Commentator Week Part 2 will continue tomorrow when we will present the scenario for Event 7.

I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.

Ed Wilson, Microsoft Scripting Guy