Use PowerShell to Parse an XML File and Sort the Data

Summary: In this blog, the Scripting Wife learns how to use Windows PowerShell to parse her books XML files and to find authors and sort titles.

Microsoft Scripting Guy, Ed Wilson, is here. Today the registration opens for the 2012 Scripting Games. You can go to the 2012 Scripting Games site on PoshCode, click Log On, choose your authentication mechanism, and fill out your user name and email address. Make sure your user name is the name you want to appear on the leaderboard, and on your 2012 Scripting Games certificate. Also, make sure your email address is correct because it is used to notify you about prizes.

Yesterday, the Scripting Wife asked me to help her with exploring an XML file. Last night, she told me that she wants me to help her look at the XML file she got by exporting her book database so that she can find titles of books and things like that. I think I will call her, and see if she is ready…

“Scripting Wife?” I call in my nicest voice.

All of a sudden I look up, and she is here.

“You don’t have to yell. You could simply have called me on my Windows 7 phone,” she replied.

“I didn’t yell—and besides, I was not sure you had your Windows 7 phone with you,” I apologized.

“I always have it with me,” she said rather curtly. “So are you going to show me how to find titles in my XML file?”

“Yep. Why don’t you sit over here next to the computer, and first read your XML file into a variable. Remember to use the [XML] symbol to make sure that you get an XML document,” I said.

The Scripting Wife thought for a minute, and then typed the following command.

$books = Get-Content C:\fso\books.xml

“Now, remember that all your book data was in the book node. So why don’t you retrieve that, pipe it to the Foreach-Object cmdlet, and then retrieve the Title property,” I said.

The Scripting Wife thought for a few seconds and began to type. Within a minute or so, she had composed the command that is shown here. (The percentage sign (%) is an alias for the Foreach-Object cmdlet.)

$books.bookinfo.booklist.book | % {$_.mainsection.title}

The two commands and the output associated with the commands are shown in the image that follows.

Image of command output

“That is pretty cool, but I want to be able to find out how many books I have based on subject,” she asked.

“Well, that is pretty easy. Take your book element, pipe it to the Foreach-Object cmdlet, then pipe that to the Sort-Object cmdlet, and choose the DisplayName property. Then group it by DisplayName. Use the NoElement switched parameter to keep from displaying a bunch of extra information,” I said.

She thought for a second, and then used the Up arrow to retrieve her previous command. She then modified it until the following command appeared.

$books.bookinfo.booklist.book | % {$_.subjects.subject} | sort displayname | group displayname -NoElement

She turned the monitor to me, and the output that is shown in the image that follows was the result.

Image of command output

“That was pretty complicated. Can’t we do something easier? For instance, I want to find all the Perry Mason novels,” she requested.

“That one really is easy. Retrieve your command that lists all the titles. Because the titles are all strings, pipe that to the Where-Object and use the Match operator to look for Mason,” I said.

This time the Scripting Wife did not hesitate. She retrieved her next to last command by pressing the Up arrow twice. Then she added a pipe character, used the question mark (?) as an alias for the Where-Object. She added the Match operator and looked for the word Mason. Her completed command is shown here.

$books.bookinfo.booklist.book | % {$_.mainsection.title} | ? { $_ -match ‘mason’}

“This is more like it. Here take a look,” she said as she turned the monitor towards me.

Image of command output

“That is all I really wanted to see right now. I think I am going to head to the mall with a couple of my friends. You have been keeping me too busy around here, and I need a break,” she said as she jumped up and headed towards the door.

“Don’t forget…registration for the 2012 Scripting Games is open,” I said.

“I will register tomorrow,” she said.

And she was gone.

Join us tomorrow when I imagine the Scripting Wife will register for the 2012 Scripting Games.

I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.

Ed Wilson, Microsoft Scripting Guy