Hey, Scripting Guy! How Can I Read Microsoft Excel Metadata?

ScriptingGuy1

Hey, Scripting Guy! Question

Hey, Scripting Guy! I have a VBScript script that reads Microsoft Office Excel metadata. I would like to update the script to Windows PowerShell, but can I do this? I tried using New-Object to create the Excel.Application object but somehow it does not seem to work. What gives? I thought Windows PowerShell could do everything VBScript can do and more. Am I wrong?

– SV

SpacerHey, Scripting Guy! Answer

Hi SV,

Wow! How many questions can you ask in one question? I am impressed! Let’s see if I can answer them all: yes, a lot, sort of, not really. I think that’s right. Let’s begin at the beginning. You want to update a previously working VBScript script to Windows PowerShell. I appreciate your enthusiasm, but if you have a perfectly fine VBScript script, there is really no need to “translate” it to Windows PowerShell. In fact, of the more than 3,000 VBScript scripts I have written, I have actually translated fewer than 100 of them to Windows PowerShell. Generally, I am so excited about the new capabilities of Windows PowerShell that I have not had time to do the translation work. This having been said, I do know of a few companies that have passed down the edict “no more VBScript,” and they actually assigned several of the network administrators to translate the existing VBScript scripts to Windows PowerShell. If you are interested in this process, let me know via scripter@microsoft.com (in English, if possible), and I will be glad to go into it in more detail.

Anyway, SV, I looked through the archives and found this script, which you may have been referring to:

On Error Resume Next
Set objExcel = CreateObject("Excel.Application")
objExcel.Visible = True
Set objWorkbook = objExcel.Workbooks.Open("C:\Scripts\Test.xls")
For Each strProperty in objWorkbook.BuiltInDocumentProperties
    Wscript.Echo strProperty.Name & " - " & strProperty.Value
Next

This script took Greg probably about an hour (or less) to write. The GetExcelMetaData.ps1 script that follows this paragraph took me two days to write, and in the end, I had to get help from my good friend and fellow Microsoftie Luis in Lisbon to solve a very perplexing problem with COM interop assemblies. So I am very glad you wrote and asked this question because I believe it will be instructive for us all. If you want to skip the details, here is the GetExcelMetaData.ps1 script. If you are a real geek (or geek at heart or geek wannabe), read on:

$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
$workbook = $excel.Workbooks.Open("C:\test\excel.xls")
$binding = "System.Reflection.BindingFlags" -as [type]
Foreach($property in $workbook.BuiltInDocumentProperties)
{
  $pn = [System.__ComObject].invokemember("name",$binding::GetProperty,$null,$property,$null)
  trap [system.exception]
   {
     write-host -foreground blue "Value not found for $pn"
    continue
   }
  "$pn`: " +
   [System.__ComObject].invokemember("value",$binding::GetProperty,$null,$property,$null)
 }
$excel.quit()

So what is all this stuff about Excel metadata anyway? It sounds like something from some space epic:

“Oh, my! The rebels are attacking the metadata!”

“Launch the PowerShell!”

(Or a scenario to that effect.)

Anyway, in the image that follows this paragraph, you see some of the metadata associated with an Excel spreadsheet. There are actually two ways in Windows Vista to view metadata. When a file is selected in Explorer, some of the more important metadata properties are displayed in summary fashion along the bottom of the screen. A more complete listing is found when you right-click the file, choose properties, and then click the Details tab. As you can see, some of this information can be extremely useful. I love adding the comments field and the tags property as they can be used with the search feature in Windows Vista to allow me to easily find particular documents among the billions of files that hide on my 200 GB laptop drive:

Excel metadata graphic

 

Okay, SV, back to your question: Can I do this? Sure you can. The first thing we do in the GetExcelMetaData.ps1 script is the exact same thing you said you did: We create an instance of the Excel.Application object. In VBScript you use the CreateObject command, but in Windows PowerShell, we use the New-Object cmdlet as seen here:

$excel = New-Object -ComObject Excel.Application

So far so good. In the VBScript, Greg made the instance of Excel visible. That is cool, but is not needed for this particular script. After all we only want the metadata from the spreadsheet, not the actual data in the spreadsheet. To keep the spreadsheet invisible, we set the visible property of the Excel.Application object to false. This is seen here:

$excel.Visible = $false

(Okay, one more easy line of code, and then things are going to get hairy—bail out now if you wish.) The next thing we need to do is to open the Excel spreadsheet. To do this, we use the open method from the workbook object. The open method needs the path to the Microsoft Excel workbook on which you are planning on working. Here is the line of code that opens the workbook:

$workbook = $excel.Workbooks.Open("C:\test\excel.xls")

Now for the hard part (actually it is not too bad). The next thing we do is turn a string into a data type. The reason for doing this is twofold. First, I want to show you how to use this technique. Second, doing this will actually make the code a bit easier to read. What we are really doing here is creating an alias for a data type. The data type in question is the System.Reflection.BindingFlags class. If we just assigned the string “System.Reflection.BindingFlags” to the variable $binding, we would simply have a string and not an instance of the System.Reflection.BindingFlags class. The BindingFlags are used to control the binding of many classes in the System.Reflection and other Microsoft .NET Framework namespaces. We need to use this, so we can gain access to the InvokeMember method. You will see this later. To turn a string into a data type, use the -as operator and specify that it is -as [type]. This is seen here:

$binding = "System.Reflection.BindingFlags" -as [type]

Now we need to use the ForEach statement to walk through the collection of BuiltInDocumentProperties from the Excel Workbook object. We use the $property variable as our enumerator while walking through the collection. This is seen here:

Foreach($property in $workbook.BuiltInDocumentProperties)

Now for the two-day problem! I should be able to simply use the following code to obtain the name of each BuiltInDocumentProperties from the collection:

$property.name

The problem is the way the interop assembly for this particular COM object works. The object comes back and says it is a [System.__ComObject]. When you try to use $property.name, you get an error that says, “[System.__ComObject] does not have a property named name.” Here is where we need to use reflections:

$pn = [System.__ComObject].invokemember("name",$binding::GetProperty,$null,$property,$null)

The InvokeMember can be used in three different ways (called overloaded). Here is the breakdown of the method call:

Value Meaning

name

The name of the property to query.

$binding::GetProperty

The GetProperty static method from the System.Reflection.BindingFlags class we created earlier and stored in the $binding variable.

$null

The binder object that specifies a set of properties and enables binding. We use $null here to use the defaultBinder.

$property

The target object. Here is an instance of a document property from the BuiltInDocumentProperties collection.

$null

An array of objects to pass as arguments to the method. This is not needed here because we are querying a property.

If you query for an instance of a document property that does not exist, the object throws an error. This is true even if you are using VBScript. Because we went to all this trouble to query the document properties, we decided to go one step farther and show you how to trap errors with Windows PowerShell. In VBScript the original script used On Error Resume Next and move to the next property in the collection when an error occurred. We could have done exactly the same thing by using $ErrorActionPreference = “SilentlyContinue”. We decided it would be cool to use the Trap statement instead because this is the way errors are handled in C#. When using the Trap statement, you need to specify what kind of error to catch. Here we are catching a system.exception, which is pretty generic. When the exception is detected, we print out the name of the document property that does not exist in blue (much more friendly than red), and then we continue to the next property. This is seen here:

trap [system.exception]
   {
     write-host -foreground blue "Value not found for $pn"
    continue
   }

If the property does exist, we want to print out the name of the property and the value. This is seen here. Note we repeat the same command line that uses invokemember as seen earlier. This time, however, we also print out the value of the property and the name of the property, which is contained in the $pn variablevariable:

"$pn`: " +
   [System.__ComObject].invokemember("value",$binding::GetProperty,$null,$property,$null)

When we have the list of document properties and their values, we simply need to exit Excel (if you do not do this, it will continue to hang around and eat up memory). This line is about the same as in the VBScript script:

$excel.quit()

When we run the script, we are greeted with the following output:

Excel metadata graphic

 

So, SV, as you can see, not everything is necessarily easier to do in Windows PowerShell than in VBScript. As I said at the outset, if you have a script that works, leave it alone unless there is some compelling reason to revise it (such as official edicts). This also illustrates why I prefer to use .NET Framework classes when working with Windows PowerShell—in general, they will work easier and perform better.

Ed Wilson and Craig Liebendorfer, Scripting Guys

0 comments

Discussion is closed.

Feedback usabilla icon