How Can I Find All the Files in a Directory Tree that Contain a Specific Word or Phrase?

ScriptingGuy1

Hey, Scripting Guy! How can I find all the files in a directory tree that contain a specified phrase?

— PB

Hey, PB. OK, so picture this: for the first time in a couple of months, the Scripting Guy who writes this column and the Scripting Son are playing basketball. The Scripting Son has a comfortable lead, but the Scripting Guy who writes this column makes a desperate comeback, sinking four three-pointers in a row and tying the score. With the gym about to close for the night, everything has come down to this: next basket wins.

As it turns out, the Scripting Guy who writes this column has the ball. He dribbles down to the baseline and then, for some stupid reason, he stops dribbling. (If you’re not a basketball fan, picking up your dribble while down on the baseline isn’t a particularly good thing to do.) Thanks to his own stupidity, the Scripting Guy who writes this column is now trapped: he has the baseline on one side of him and the Scripting Son, who’s a good 7 inches taller than his father, on the other side. Thinking quickly (or at least as quickly as a Scripting Guy can think) he pivots one way, pivots back the other way, jumps up in the air, and then flips the ball, underhanded and left-handed, beneath the Scripting Son’s outstretched arm. The ball rattles off the backboard and drops through the hoop. Game over!

“That was so lucky,” said the Scripting Son.

Luck?!? Obviously the Scripting Son has no idea what luck is. Making a basket in basketball isn’t lucky. No, lucky would be someone writing in to ask how he or she could locate all the files in a directory tree that contain a specified phrase, and then just happening to have a script lying around that does that very thing.

You know, a script like this one:

On Error Resume Next

' Create the ADO objects used to query the Desktop Search index
Set objConnection = CreateObject("ADODB.Connection")
Set objRecordSet = CreateObject("ADODB.Recordset")

objConnection.Open "Provider=Search.CollatorDSO;Extended Properties='Application=Windows';"

' Find indexed files under C:\Scripts whose contents include the word Alice
objRecordSet.Open "SELECT System.ItemPathDisplay FROM SYSTEMINDEX WHERE Contains('Alice') " & _
    "AND System.ItemPathDisplay LIKE 'C:\Scripts\%'", objConnection

objRecordSet.MoveFirst

' Echo the path of each matching file
Do Until objRecordset.EOF
    Wscript.Echo objRecordset.Fields.Item("System.ItemPathDisplay")
    objRecordset.MoveNext
Loop

Before we go any further we should note that this script uses Windows Desktop Search 3.0, a relatively new, fully-scriptable search technology that is included in Windows Vista (and can be downloaded and installed on other versions of Windows). We opted to use this approach for two reasons:

It’s much easier to access all the folders in a folder tree using Desktop Search than to use WMI or the FileSystemObject. Performing this task using WMI or the FileSystemObject would require writing some crazy recursive subroutine that can get at each and every folder in a folder tree. Despite our reputation, the Scripting Guys don’t like to do crazy things. (Not that we don’t actually do them, mind you. But we don’t like to do them.)

Windows Desktop Search can open – and search through – all sorts of file types, including Word documents, Excel spreadsheets, PowerPoint presentations, etc. Without using Desktop Search we’d be limited to searching through text files. (OK, in theory we could search other document types, but we’d have to include code that identifies, opens and searches Word documents plus code that identifies, opens and searches Excel spreadsheets plus code that identifies, opens and searches – well, you get the idea.)

Anyway, while there might be other ways we could perform this task, we decided that the easiest way to perform this task was to use Desktop Search. And, for the Scripting Guys at least, the easiest way is almost always the best way.
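For comparison, here is a rough sketch of what the recursive FileSystemObject approach described above might look like. It is not part of the original script. The SearchFolder name is made up for illustration, and the sketch assumes we only care about plain-text (.txt) files, which is exactly the limitation mentioned above:

```vbscript
' Hypothetical comparison: recursive plain-text search using the FileSystemObject.
' SearchFolder is an illustrative name, not something from the original script.
Option Explicit

Const ForReading = 1

Dim objFSO
Set objFSO = CreateObject("Scripting.FileSystemObject")

SearchFolder objFSO.GetFolder("C:\Scripts"), "Alice"

Sub SearchFolder(objFolder, strPhrase)
    Dim objFile, objSubfolder, objStream, strContents

    For Each objFile In objFolder.Files
        ' Look only at non-empty .txt files (ReadAll raises an error on an empty file)
        If LCase(objFSO.GetExtensionName(objFile.Path)) = "txt" And objFile.Size > 0 Then
            Set objStream = objFSO.OpenTextFile(objFile.Path, ForReading)
            strContents = objStream.ReadAll
            objStream.Close

            ' vbTextCompare makes the match case-insensitive
            If InStr(1, strContents, strPhrase, vbTextCompare) > 0 Then
                WScript.Echo objFile.Path
            End If
        End If
    Next

    ' Recurse into each subfolder so the whole tree is covered
    For Each objSubfolder In objFolder.SubFolders
        SearchFolder objSubfolder, strPhrase
    Next
End Sub
```

Note that this version has to open and read every file each time it runs, and it would sail right past the Excel spreadsheet that Desktop Search finds for us.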

So how exactly do we use Desktop Search to perform this task? Well, we start out by creating a pair of ADO (ActiveX Data Objects) objects; we’ll use the ADODB.Connection object to connect to the Desktop Search file index, and we’ll use the ADODB.Recordset object as a storehouse for any information returned by our query. After creating the two objects we then use the Connection object’s Open method to bind us to Windows Desktop Search:

objConnection.Open "Provider=Search.CollatorDSO;Extended Properties='Application=Windows';"

Note. What all is going on in that line of code? Well, to be honest, we don’t really know for sure. Nor do we really care. That’s because this is simply boilerplate code: use this code exactly as-is any time you want to use Desktop Search and everyone will live happily ever after.
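One thing to keep in mind: because the script starts with On Error Resume Next, a failed connection (for example, on a computer where Desktop Search isn't installed) won't produce any error message at all. As a minimal sketch, and assuming you'd like the script to stop and say so, you could check the Err object right after calling Open:

```vbscript
On Error Resume Next

Set objConnection = CreateObject("ADODB.Connection")
objConnection.Open "Provider=Search.CollatorDSO;Extended Properties='Application=Windows';"

' With On Error Resume Next in effect, a failed Open is silent unless we check Err
If Err.Number <> 0 Then
    WScript.Echo "Could not connect to Desktop Search: " & Err.Description
    WScript.Quit 1
End If

WScript.Echo "Connected to the Desktop Search index."
objConnection.Close
```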

After we open the data store we come to the heart-and-soul of any search script: the part where we specify exactly what it is we’re searching for. In this case, we’re calling the Recordset object’s Open method, passing the method two parameters: the SQL query that contains our search criteria and an object reference to the Connection object (objConnection):

objRecordSet.Open "SELECT System.ItemPathDisplay FROM SYSTEMINDEX WHERE Contains('Alice') " & _
    "AND System.ItemPathDisplay LIKE 'C:\Scripts\%'", objConnection

As far as we’re concerned, the one part of this command that truly matters is the SQL query itself:

"SELECT System.ItemPathDisplay FROM SYSTEMINDEX WHERE Contains('Alice') " & _
    "AND System.ItemPathDisplay LIKE 'C:\Scripts\%'"

Let’s see if we can figure out what’s going on here. For our purposes today, the only thing we care about is the file path to any file that contains the word Alice. (By the way, this search is not case-sensitive; the search will locate Alice, alice, ALICE, or any other variation of the word.) When working with Desktop Search you need to specify each and every property value you want returned. Because the property System.ItemPathDisplay contains the file path we start our query out like this:

SELECT System.ItemPathDisplay

The next part of the query (FROM SYSTEMINDEX) is also boilerplate code; Desktop Search only has a single “table” (SYSTEMINDEX) that we can query. That brings us to our WHERE clause, a clause that features a pair of criteria that must both be met for a file to be returned:

The file must include the word Alice. That’s what the syntax Contains('Alice') is for: it tells the script to read the contents of the file and see if the word Alice can be found anywhere in those contents. (OK, technically Desktop Search has already read through the file; the script doesn’t need to do so. But you get the idea.)

The file must be located somewhere in the C:\Scripts folder tree. The syntax System.ItemPathDisplay LIKE 'C:\Scripts\%' states that the file path (System.ItemPathDisplay) must begin with C:\Scripts\. The LIKE operator enables us to use a wildcard in our query; that wildcard happens to be the percent sign (%), which serves the same purpose as the asterisk does in, say, the dir command (e.g., dir C:\Scripts\Test.*). In other words, the percent sign causes this clause to be read as follows: “Show me any file where the path starts with C:\Scripts\, regardless of what (if anything) comes after C:\Scripts\.” In turn, this enables us to locate files found in the C:\Scripts folder (e.g., C:\Scripts\Test.txt) as well as files found in a subfolder of C:\Scripts (e.g., C:\Scripts\Subfolder1\Test.txt).

Much easier than writing a recursive subroutine that accomplishes the same seemingly-trivial feat of accessing all the files in a folder tree.
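If you'd rather not hard-code the search term and the folder, the two criteria above can be assembled from variables. The following is only a sketch: BuildQuery is a made-up helper name, and it rests on two assumptions that are worth verifying against the Windows Search SQL documentation. The first is that a multi-word phrase must be wrapped in double quotes inside Contains; the second is that an embedded single quote is escaped by doubling it.

```vbscript
' Hypothetical helper: builds the search query from a phrase and a folder path.
' Assumptions (not from the original article): a phrase goes inside double quotes
' within Contains(), and single quotes are escaped by doubling them.
' A phrase that itself contains double quotes is not handled.
Function BuildQuery(strPhrase, strFolder)
    Dim strSafePhrase, strSafeFolder

    strSafePhrase = Replace(strPhrase, "'", "''")
    strSafeFolder = Replace(strFolder, "'", "''")

    ' Make sure the folder ends with a backslash so C:\Scripts doesn't also match C:\Scripts2
    If Right(strSafeFolder, 1) <> "\" Then strSafeFolder = strSafeFolder & "\"

    ' Caveat: % and _ in a folder name would be treated as LIKE wildcards
    BuildQuery = "SELECT System.ItemPathDisplay FROM SYSTEMINDEX " & _
        "WHERE Contains('""" & strSafePhrase & """') " & _
        "AND System.ItemPathDisplay LIKE '" & strSafeFolder & "%'"
End Function

WScript.Echo BuildQuery("Alice in Wonderland", "C:\Scripts")
```

The string returned by BuildQuery can then be passed to the Recordset object's Open method in place of the hard-coded query.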

After executing the query we use the MoveFirst method to move to the first record in the returned recordset. (We’re not 100% sure that this is necessary, but better safe than sorry, right?) We then set up a Do Until loop that runs until we reach the end of the recordset; that is, until the recordset’s EOF (end-of-file) property is True:

Do Until objRecordset.EOF

Inside this loop we don’t do much; we simply echo back the value of the System.ItemPathDisplay property, then call the MoveNext method to move on and repeat the process with the next record in the recordset:

Wscript.Echo objRecordset.Fields.Item("System.ItemPathDisplay")
objRecordset.MoveNext
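About that MoveFirst call: a newly opened ADO recordset is already positioned on its first record, and calling MoveFirst on an empty recordset raises an error (one that On Error Resume Next quietly hides). As a sketch of a variation, assuming you'd like a message when nothing matches, you could skip MoveFirst and test EOF before looping:

```vbscript
Set objConnection = CreateObject("ADODB.Connection")
Set objRecordSet = CreateObject("ADODB.Recordset")

objConnection.Open "Provider=Search.CollatorDSO;Extended Properties='Application=Windows';"

objRecordSet.Open "SELECT System.ItemPathDisplay FROM SYSTEMINDEX WHERE Contains('Alice') " & _
    "AND System.ItemPathDisplay LIKE 'C:\Scripts\%'", objConnection

' If EOF is True right after Open, the query returned no records
If objRecordSet.EOF Then
    WScript.Echo "No matching files found."
Else
    Do Until objRecordSet.EOF
        WScript.Echo objRecordSet.Fields.Item("System.ItemPathDisplay")
        objRecordSet.MoveNext
    Loop
End If

objRecordSet.Close
objConnection.Close
```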

When all is said and done, we should get back a report similar to this:

C:\Scripts\Scores.xls
C:\Scripts\Alice2.txt
C:\Scripts\Decoded.txt
C:\Scripts\Decrypt.txt
C:\Scripts\New Folder\New Folder\New Text Document.txt

Notice two things here. For one, the first file in the list is actually an Excel spreadsheet file; as we noted earlier, Desktop Search has the ability to search inside Microsoft Office documents. (Among other things, Desktop Search can also search inside Adobe .PDF files.) Also note that the last file in the list is located in a sub-subfolder of C:\Scripts, which means that we really are searching the entire C:\Scripts folder tree.
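Because each property has to be named in the SELECT list, getting back more than the path means asking for more. Here is a sketch that also requests the file size and last-modified date. It assumes that System.Size and System.DateModified are the properties you want and that both are populated in the index for your files:

```vbscript
Set objConnection = CreateObject("ADODB.Connection")
Set objRecordSet = CreateObject("ADODB.Recordset")

objConnection.Open "Provider=Search.CollatorDSO;Extended Properties='Application=Windows';"

' Every property we want returned has to be listed explicitly in the SELECT clause
objRecordSet.Open "SELECT System.ItemPathDisplay, System.Size, System.DateModified " & _
    "FROM SYSTEMINDEX WHERE Contains('Alice') " & _
    "AND System.ItemPathDisplay LIKE 'C:\Scripts\%'", objConnection

Do Until objRecordSet.EOF
    ' The & operator treats a Null value as an empty string, so a missing property won't cause an error
    WScript.Echo objRecordSet.Fields.Item("System.ItemPathDisplay").Value & vbTab & _
        objRecordSet.Fields.Item("System.Size").Value & vbTab & _
        objRecordSet.Fields.Item("System.DateModified").Value
    objRecordSet.MoveNext
Loop

objRecordSet.Close
objConnection.Close
```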

And that should do it, PB. Like we said, there might be other ways to accomplish this task, but Desktop Search makes the process very fast and very easy. And because Desktop Search has so many other cool capabilities, it’s worth looking into anyway. (For more information, and more sample code, see our article Seek and Ye Shall Find.)

In case any of you are wondering, we should come clean and admit that the Scripting Son was right: the shot his father made to win the game really was pure luck. In fact, the Scripting Guy who writes this column wasn’t even trying to make a basket: he was just hoping that, maybe, he could throw the ball off the backboard and then grab the rebound. No one was more surprised than he was when the ball actually went through the hoop.

But, remember, that’s between us: the Scripting Son doesn’t really need to know how lucky the Scripting Guy who writes this column was. Instead, let him think that – for once – his Dad actually knew what he was doing.

You know, you’re right: there probably isn’t much chance of that happening, is there?

And for good reason.
