Slow Code: Top 5 Ways to Make Your PowerShell Scripts Run Faster

Slow code?

Are you frustrated by slow PowerShell scripts? Is it impacting business processes? Need help tuning your scripts? Today's post is for you.

Can you identify with any of these customer slow PowerShell stories?

Case #1

Customer is scanning Active Directory Domain Controllers in multiple domains and forests scattered all over the state on slow links. This key audit data takes 62 hours to collect and impacts the business response to audit requests. After applying these techniques, the customer reports that the script now completes in 30 minutes.

Case #2

Customer needs to update user licensing on Office 365 for daily new user provisioning. This impacts the business due to the script's 10-hour run time. After applying these optimization tips, the script finishes in 14 minutes.

Case #3

Customer is parsing SCCM client log files that rotate every few minutes. But the script takes longer to run than the log refresh interval. After applying these techniques, the script is 10% of its original length and runs 10 times faster.

Scripting Secrets

After several years of teaching PowerShell and advising customers on their scripts, I have observed some routine practices that lead to poor script performance. Often this happens with people who copy and paste their scripts from internet sources without truly understanding the language. Other times it simply comes from a lack of formal training. Regardless, today I am going to share with you the secrets I have shared with many customers to improve their script run time.

The classic programming trade-off is speed vs. memory. We want to be aware of both as we write the most efficient code.

Problem #0: Not using cmdlet parameter filters

There is an ancient PowerShell pipeline proverb: Filter left, format right. Disobey it, and your script will take a long time to run. It means that you should filter the pipeline objects as early (as far to the left) as possible, and that formatting cmdlets should always go at the end of the pipeline, never in the middle.

Early on a customer reported to me, "Querying event logs over the WAN is taking days!" Study these two code samples below. Which do you think is faster and why?

 # Approach #1
Get-WinEvent -LogName System -ComputerName Server1 |
  Where-Object {$_.Id -eq 1500}

# Approach #2
Get-WinEvent -FilterHashtable @{LogName='System';ID=1500} `
  -MaxEvents 50 -ComputerName Server1

The first approach retrieves the ENTIRE event log over the wire and then filters the results in local memory. The second approach uses cmdlet parameters to effectively reduce the dataset coming from the remote system.

Yes, #2 is faster. Much faster.

This same advice applies to any cmdlet that queries data, whether local or remote. Be sure to explore the help for all the parameters, even if they look complicated at first. It is worth your time to write the code correctly.
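
For a quick local illustration of the same idea (the folder path here is only an example), compare letting the provider filter with the -Filter parameter against filtering downstream with Where-Object:

# Filter at the source: only the *.log entries ever enter the pipeline
Get-ChildItem -Path C:\Windows\Logs -Recurse -Filter *.log

# Filter in the pipeline: every file is enumerated, then most are discarded locally
Get-ChildItem -Path C:\Windows\Logs -Recurse |
    Where-Object { $_.Name -like '*.log' }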

Problem #1: Expensive operations repeated

Usually I see this manifest as a query to Active Directory, a database, Office 365 accounts, etc. The script needs to process multiple sets of data, so the script author runs a separate expensive query for each one. For example, suppose I need to report on the license status of 10 user accounts in Office 365. Which pseudocode would be faster?

 For Each User
    Query the account from Office 365
    Output the license data of the user

Or this:

 Construct a single query to Office 365 that retrieves all users in scope
Execute the query and store it into a variable
Pipe the variable into the appropriate loop, selection or output cmdlet

Yes, the second is more efficient, because it performs the expensive operation only once. It may be a little more involved to construct the query appropriately, or you may need to retrieve a larger data set if you cannot precisely isolate the accounts in question. However, the end result is a single expensive operation instead of many.
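
Here is a minimal sketch of the same idea in PowerShell, using Get-ADUser as a stand-in for whatever your expensive query happens to be; $userNames is a hypothetical list of the accounts in scope.

# Slow pattern: one expensive directory query per account
$report = ForEach ($name in $userNames) {
    Get-ADUser -Identity $name -Properties Department |
        Select-Object Name, Department
}

# Faster pattern: one query up front, then all further work happens in memory
$allUsers = Get-ADUser -Filter * -Properties Department
$report   = $allUsers |
    Where-Object { $userNames -contains $_.SamAccountName } |
    Select-Object Name, Department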

Another expensive operation is crawling an array to search for a value:

 For ($i=0; $i -lt $array.count; $i++) {
    If ($array[$i] -eq $entry) {
        "We found $entry after $($i+1) iterations."
        $found = $true
        Break
    }
}

Instead, add the items to a hash table, which has blazingly fast search performance:

 $hashtable.ContainsKey($entry)

See Get-Help about_Hash_Tables for more information on my favorite PowerShell data structure.
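
For example, here is a minimal sketch that reuses the $array and $entry variables from the loop above: load the values into the hash table once, and every membership test becomes a fast key lookup.

# Build the hash table once; only the keys matter for a membership test
$hashtable = @{}
ForEach ($value in $array) { $hashtable[$value] = $true }

# A key lookup instead of walking the whole array
if ($hashtable.ContainsKey($entry)) {
    "We found $entry."
}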

Problem #2 & #3: Appending stuff

Append-icitis is painful, but appending to objects is even more painful. It usually comes in one of two forms:

  1. Appending to files
  2. Appending to arrays

Appending to files

I usually see this with script logging output. Cmdlets like Add-Content, Out-File -Append and Export-CSV -Append are convenient to use for small files. However, if you are using these in a loop with hundreds or thousands of iterations, they will slow your script significantly. Each time you use one of these it will:

  • Open the file
  • Scroll to the end
  • Add the content
  • Close the file

That is heavy. Instead use a .NET object like this:

 $sw = New-Object System.IO.StreamWriter "c:\temp\output.txt"
for ($a=1; $a -le 10000; $a++)
{
    # $BigString represents whatever line of output you would otherwise append
    $sw.WriteLine($BigString)
}
$sw.Close()

For CSV output, this may require you to construct your own comma-delimited line of text to add to the file. However, it is still significantly faster.
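
Here is a minimal sketch of that approach, assuming hypothetical objects with Name and Status properties (real data containing quotes or commas would need extra escaping):

$sw = New-Object System.IO.StreamWriter "C:\Temp\output.csv"
# Write the header row once
$sw.WriteLine('"Name","Status"')
ForEach ($item in $Items) {
    # Build the comma-delimited line yourself instead of calling Export-CSV -Append
    $sw.WriteLine('"{0}","{1}"' -f $item.Name, $item.Status)
}
$sw.Close()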

Appending to arrays

I used to do this one often until someone pointed it out to me.

 # Empty array
$MyReport = @()
ForEach ($Item in $Items) {
    # Fancy script processing here
    # Append to the array
    $MyReport += $Item | Select-Object Property1, Property2, Property3
}
# Output the entire array at once
$MyReport | Export-CSV -Path C:\Temp\myreport.csv

Now this is a step better, because we are not appending to a file inside the loop. However, we are appending to an array, which is an expensive memory operation. Behind the scenes, .NET duplicates the entire array in memory, adds the new item, and then deletes the old copy.

Here is the more efficient way to do the same thing:

 $MyReport = ForEach ($Item in $Items) {
    # Fancy script processing here
    $Item | Select-Object Property1, Property2, Property3
}
# Output the entire array at once
$MyReport | Export-CSV -Path C:\Temp\myreport.csv

You can actually assign the variable one time in memory by capturing all the output of the loop. Just make sure the loop only outputs the raw data you want in the report.

Another option is to use a hash table or a .NET ArrayList object. These data structures can add and remove items dynamically without re-copying the entire collection the way an array does. See Get-Help about_Hash_Tables or the System.Collections.ArrayList documentation.
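
Here is a minimal sketch of the ArrayList approach, reusing the same hypothetical loop from above:

# An ArrayList grows in place, so adding an item does not copy the whole collection
$MyReport = New-Object System.Collections.ArrayList
ForEach ($Item in $Items) {
    # Add() returns the new index; cast to [void] to discard it
    [void]$MyReport.Add(($Item | Select-Object Property1, Property2, Property3))
}
$MyReport | Export-CSV -Path C:\Temp\myreport.csv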

Problem #4: Searching text

The log parsing example I mentioned in Case #3 above gets a lot of people, especially if you started scripting in VBScript where string methods were quite common. Here is a quick chart comparing the three most popular text parsing methods, including links for more info.

Technique                             Friendly   Power
String methods                        Yes        No
Regular expressions                   No         Yes
Convert-String / ConvertFrom-String   Yes        Yes

Sometimes string methods (ToUpper, IndexOf, Substring, etc.) are all you need. But if the text parsing requires pattern matching of any kind, then you really need one of the other methods, which are much faster as well.

Here is a simple example of using string methods:

 $a = 'I love PowerShell!'
# View the string methods
$a | Get-Member -MemberType Methods
# Try the string methods
$a.ToLower()
$a.ToLower().Substring(7,10)
$a.Substring(0,$a.IndexOf('P'))

While string methods are easy to discover and use, they become cumbersome very quickly once the parsing gets more complex.

Observe this comparison of three techniques:

 $domainuser = 'contoso\alice'

# String methods
$domain = $domainuser.Substring(0,$domainuser.IndexOf('\'))
$user   = $domainuser.Substring($domainuser.IndexOf('\')+1)

# RegEx
$domainuser -match '(?<domain>.*)\\(?<user>.*)'
$Matches

# Convert-String
'contoso\alice' |
    Convert-String -Example 'domain\user=domain,user' |
    ConvertFrom-Csv -Header 'Domain','User'

RegEx is used widely in PowerShell: -split, -replace, Select-String, etc. RegEx excels at parsing string patterns out of text with speed. Take some time to learn it today (Get-Help about_Regular_Expressions).
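
For instance, both -split and -replace accept a regular expression pattern rather than a literal string:

# Split on either a comma or a semicolon
'server01,server02;server03' -split '[,;]'

# Replace any date-shaped token with a placeholder
'Backup 2017-05-21 completed' -replace '\d{4}-\d{2}-\d{2}', '<date>'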

The new Convert-String and ConvertFrom-String cmdlets were introduced in PowerShell 5. See the links in the chart above for more detailed examples of these powerful text parsing cmdlets. ConvertFrom-String excels at parsing multiple values out of multi-line patterns. And that is exactly what challenged the customer in Case #3 above.
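
As a small taste of ConvertFrom-String (template-based parsing of multi-line patterns is covered in the linked articles), the -Delimiter parameter is itself a regular expression and -PropertyNames labels the resulting columns. This is only a minimal sketch of the simplest mode:

# Turn 'domain\user' strings into objects with Domain and User properties
'contoso\alice', 'fabrikam\bob' |
    ConvertFrom-String -Delimiter '\\' -PropertyNames Domain, User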

How can I tell how long my script runs?

Use one of these techniques to test different versions of your code for speed.

PowerShell has a cmdlet, Measure-Command, that takes a script block via its -Expression parameter. This is how most people first measure execution time.

 Measure-Command -Expression {
    #
    # Insert body of script here
    #
}

Others will do something like this:

 $Start = Get-Date
#
# Insert body of script here
#
$End = Get-Date
# Show the result
New-TimeSpan -Start $Start -End $End

Either method returns a TimeSpan object with properties for any desired unit of time. Just be sure to use the Total* properties (TotalSeconds, TotalMinutes, etc.) for accuracy; the Seconds and Minutes properties are only components of the elapsed time.
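
For example (the sleep below is just a stand-in for your script body):

$elapsed = Measure-Command { Start-Sleep -Milliseconds 2500 }
$elapsed.Seconds        # 2    -- only the seconds component
$elapsed.TotalSeconds   # ~2.5 -- the complete elapsed time in seconds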

If you do a slow operation once, the impact may be small. But if you do it 1,000 times, then we are all in trouble. If the data processed in each iteration is a rich object with many properties, then it is even worse (i.e. more memory). Review your loops carefully to identify expensive commands and optimize them.

Disclaimer

One of the challenges of sharing code publicly is that I am always learning. If you go back to my posts six years ago, you will find that I used some of these poor practices. I have re-written and re-blogged some of them. Others are still there.

Take-aways:

  • Keep learning
  • Review (and optimize) all code you find online before implementing it
  • Periodically review your most-used scripts in light of your new knowledge

Keep scripting!

More Tips

You can find more tips in The Big Book of PowerShell Gotchas over at PowerShell.org/ebooks.