Hey, Scripting Guy! How Can I Use Windows PowerShell to Replace Characters in a Text File?

ScriptingGuy1

Hey, Scripting Guy! Question

Hey, Scripting Guy! Using Windows PowerShell, how can I replace all the asterisks in a text file with some other character?

— RC

Hey, Scripting Guy! Answer

Hey, RC. You know, a lot of people ask the Scripting Guy who writes this column, “How do you do it? How do you manage to write a new column each and every day?” (Of course, lots of other people ask him why he writes a new column each day. But that’s another story.) “Every single day,” they’ll marvel. “Don’t you ever get too sick or too tired to write Hey, Scripting Guy!?”

Believe it or not, the answer to that is no, the Scripting Guy who writes this column never gets too sick or too tired to write Hey, Scripting Guy!; in fact, the Scripting Guy who writes this column is probably the healthiest person in the entire world. Is that due to a rigorous program of diet and exercise? Well, no, not really, not unless you count watching TV as exercise, and not unless doughnuts are now considered part of a healthy diet. Instead, the Scripting Guy who writes this column is healthy for one reason and one reason only: he’s overworked and overstressed.

It’s true: work-related stress is supposedly good for you. Researchers in Europe recently discovered that people are more likely to get sick when they are on vacation than when they go to work every day. The researchers theorized that this is because the stressors in the workplace trigger the body’s defense mechanisms, making it easier for you to ward off sickness and infections. When you’re at home you relax; in turn, your body lets down its guard, and – wham! – before you know it you’ve caught a cold or the flu. To be truly healthy, you need those workplace stressors.

And that’s good news for the Scripting Guy who writes this column. After all, if workplace stressors make you healthy, well, he’ll probably live to be 190 years old. At the very least, he’ll be around – and working – for a long time to come.

Which probably comes as a huge thrill to his old friend the Scripting Editor.

Note. Just think, Scripting Editor: we’ll be a team for many more decades to come! That might not sound like much fun, but just imagine how healthy that should make you.

Considering the fact that we’ve spent the morning trying to figure out how to write Perl scripts (in preparation for the upcoming 2008 Winter Scripting Games), we’re feeling especially … healthy … today. With that in mind, let’s see if we can figure out how to use Windows PowerShell to replace characters in a text file. For example, suppose we have the following text file (C:\Scripts\Test.txt):

This is line 1.*
This is line 2*.
*This is line 3.
This is * line 4.

Let’s further suppose that (for some reason) we want to replace those pesky asterisks (*) with at signs (@). How can we do that? Well, this script should do the trick:

(Get-Content C:\Scripts\Test.txt) | 
Foreach-Object {$_ -replace "\*", "@"} | 
Set-Content C:\Scripts\Test.txt

As you can see, there really isn’t much to this script; in fact, if we had slightly wider Web pages we would have put the whole thing on a single line. We start out by using the Get-Content cmdlet to read in the text from the file C:\Scripts\Test.txt; by default, the text gets read in as an array, with each item in the array representing a single line in the text file. Oh, and notice that we enclosed the Get-Content command in parentheses. Why did we do that? Because that way we can be sure that PowerShell will read in the entire contents of the file before it does anything else. That matters here because we’re writing back to the very same file we’re reading from; without the parentheses, Set-Content would try to write to Test.txt while Get-Content still had the file open.
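If you’d like to see that array for yourself, here is a minimal sketch. It assumes a scratch file in your temp folder (the file name ScriptingGuyTest.txt is made up for this example) rather than the real C:\Scripts\Test.txt, so nothing important gets touched:

```powershell
# Assumption: a scratch file in the temp folder stands in for C:\Scripts\Test.txt.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'ScriptingGuyTest.txt'
Set-Content -Path $path -Value 'This is line 1.*', 'This is line 2*.', '*This is line 3.', 'This is * line 4.'

# Get-Content hands back one string per line, so a multi-line file arrives as an array.
$lines = Get-Content -Path $path
$lines.Count      # how many lines were read
$lines[0]         # the first line of the file

# Tidy up the scratch file.
Remove-Item -Path $path
```

Because the whole file is already sitting in $lines by the time we look at it, deleting (or overwriting) the file afterwards causes no trouble.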

So what happens after PowerShell has read in the entire contents of the text file? Well, our next step is to pipe those contents to the Foreach-Object cmdlet. As we pointed out a moment ago, when PowerShell reads in the contents of a text file it automatically turns that information into an array. What the Foreach-Object cmdlet will do now is loop through each and every item in that array; in other words, it will loop through each and every line in the text file. And for each of those lines Foreach-Object will execute the following scriptblock:

{$_ -replace "\*", "@"}

As you probably know, in a Windows PowerShell pipeline the $_ represents the current object. In this case, the first time we go through the loop $_ will represent the first line in the text file; the second time we go through the loop $_ will represent the second line in the text file; and so on. For each of these lines (that is, for each of these string values) we’re going to use the -replace operator to replace all the asterisks in the line with an @ sign. To do that we simply specify the pattern to search for (an asterisk), followed by the replacement text (the @ sign).
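One thing worth keeping straight: the -replace operator in our scriptblock is not the same thing as the Replace() method that every .NET string carries around. The sketch below runs both against one sample line (the variable names are just for illustration) so you can compare them:

```powershell
$line = 'This is * line 4.'

# The -replace operator treats its first argument as a regular expression,
# so the asterisk has to be escaped with a backslash.
$viaOperator = $line -replace '\*', '@'

# The String.Replace() method does a literal search instead, so no escaping is needed.
$viaMethod = $line.Replace('*', '@')

$viaOperator   # This is @ line 4.
$viaMethod     # This is @ line 4.
```

Either one gets the job done for a simple character swap; the operator earns its keep when you need real pattern matching.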

The only tricky part here is that the -replace operator treats its search pattern as a regular expression, and the asterisk is a reserved character in regular expressions; because of that we need to “escape” the character before we can perform a search-and-replace operation using that character. How hard is that? Not hard at all; we just need to preface the asterisk (or any other reserved character) with a backslash (\):

"\*"

And yes, that is important. If you leave out the backslash you’ll get an error message like this each time you run through the loop:

Invalid regular expression pattern: *.
At C:\scripts\test.ps1:2 char:33
+ Foreach-Object {$_ -replace "*",  <<<< "@"} |

Of course, you must also keep in mind that the only reason we need to escape the asterisk is because the asterisk is a reserved character in regular expressions. If you want to search for something that isn’t a reserved character then make sure you leave the \ off:

{$_ -replace "a", "@"}
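Incidentally, forgetting the \ doesn’t always earn you an error message; sometimes you just get the wrong answer. Here is a small sketch using the period, another reserved character, against a made-up sample string:

```powershell
$text = '1.5.2'   # hypothetical sample value, not from the original file

# Unescaped, the period means "any single character", so everything gets replaced.
$unescaped = $text -replace '.', '@'

# Escaped, the period means a literal period, so only the dots get replaced.
$escaped = $text -replace '\.', '@'

$unescaped   # @@@@@
$escaped     # 1@5@2
```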

So, in other words, sometimes we need a \ and sometimes we don’t? How are we supposed to know when we need to escape a character and when we don’t? You guys are making my head hurt.

Listen, don’t worry about it; remember, stress is good for you. (Which means that we probably should have said, “Go ahead and worry.” After all, worrying is pretty stressful.) You don’t have to remember which characters are reserved characters; we’ll tell you which ones they are. We can’t say for sure that the following list (taken from MSDN) represents a complete list of reserved characters, but it’s a good place to start:

  • $
  • ()
  • *
  • +
  • .
  • []
  • ?
  • \
  • /
  • ^
  • {}
  • |
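If memorizing that list sounds like more stress than even we would recommend, one option is to let the .NET Framework do the escaping for you. The sketch below uses the static Escape method of the System.Text.RegularExpressions.Regex class (reachable in PowerShell as [regex]::Escape); the sample strings are ours:

```powershell
# [regex]::Escape() returns a copy of the string with regex metacharacters backslash-escaped.
$pattern = [regex]::Escape('*')
$pattern                                             # \*

$fixed = 'This is line 2*.' -replace $pattern, '@'
$fixed                                               # This is line 2@.

# A character that needs no escaping comes back unchanged.
$slash = [regex]::Escape('/')
$slash                                               # /
```

As the last line suggests, the forward slash comes back untouched in .NET regular expressions, so putting a \ in front of it is harmless but not required.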

OK, back to the script. After Foreach-Object finishes off its search-and-replace operation, our virtual text file is then handed off to the Set-Content cmdlet; in turn, Set-Content writes the modified data back to the file C:\Scripts\Test.txt:

Set-Content C:\Scripts\Test.txt

And that’s it; at that point we’re done.

You know, you’re right: that did seem a little too easy, didn’t it? Well, let’s take a peek at Test.txt and see what happened, if anything:

This is line 1.@
This is line 2@.
@This is line 3.
This is @ line 4.

Well, what do you know: it really was that easy, wasn’t it?
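If you want to try the whole round trip without risking a real file, here is a self-contained sketch. It assumes a scratch file in your temp folder (the name ScriptingGuyReplaceDemo.txt is hypothetical) in place of C:\Scripts\Test.txt; otherwise the pipeline is the same one we just walked through:

```powershell
# Assumption: a scratch file in the temp folder, so the real Test.txt is left alone.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'ScriptingGuyReplaceDemo.txt'
Set-Content -Path $path -Value 'This is line 1.*', 'This is line 2*.', '*This is line 3.', 'This is * line 4.'

# Same pipeline as in the column: read everything, replace, write back to the same file.
(Get-Content -Path $path) |
    ForEach-Object { $_ -replace '\*', '@' } |
    Set-Content -Path $path

# Read the file back in to check the result.
$result = Get-Content -Path $path
$result

# Tidy up the scratch file.
Remove-Item -Path $path
```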

We hope that answers your question, RC; if it doesn’t, please let us know. In the meantime, the Scripting Guy who writes this column is in a bit of a quandary. He was going to go home early today, which sounded like way more fun than working. The only problem is this: each moment away from work increases the chances that this Scripting Guy will get sick. In fact, if he truly wants to live forever (and he does), he should stay late and work overtime each night. And on weekends. And holidays. And ….

Hmmm …. Maybe those researchers should recheck their calculations. After all, it’s quite possible that putting in long hours and skipping vacations don’t make you live forever; they just make it seem like you’ve lived forever.

But don’t worry about that, either: we’ll just go ask Peter Costantini, the oldest living Scripting Guy. Peter should know; after all, he pretty much has lived forever.
