Can you see the Join ?

I said I couple of posts back that I have to remind myself that I haven't been doing PowerShell long enough to lay claim to being an expert, but I think I've found one aspect of it which might not be fully appreciated.

Calculated fields can run any code.  

I've just used it to create the equivalent of a SQL Join between two CSV files. I used this ability  before

 Function Choose-Process
{$global:Counter=-1
$proc=Get-Process
Format-Table -inputobject $Proc  @{ Label = "ID"; Expression={($global:counter++) }}  , processName
$proc[(read-host "Which one ?")] }

So where other people would use one or fields from their data, I'm using a variable which I increment. OK. What's the big deal ?

I was writing about geotagging photos with PowerShell the other day; but -to date - I haven't been collecting much GPS data. But I do have some photos - a lot of photos - with information which I want to apply to them: my scuba photos. On every dive I wear a Suunto wrist mounted "computer" which tracks the dive and tells me when I need to come up (or stay down) if I want to avoid getting the bends. It downloads the information it has logged to my PC so I can see a dive "profile" - a plot of depth against time - and the Suunto Dive manager software* helps me record information about the dive like where it was, who I dived with and so on: it can dump output as a set of CSV files. One file describes the dive itself (one line per dive), the another lists the "Profile" records (many lines per dive) Here's a list of the fields I'm interested in.

In "MyDives.csv" In MyDives$pro.csv
UniqueDiveID UniqueDiveID
DiveDate SegmentNumber
TimeOfDay SegmentDepth
SampleRate,  
Location  
Site  
WaterTemperature  

Time of day is when the dive starts, SampleRate says the computer samples every twenty seconds, and in the profile, SegmentNumber is how many twenty second intervals into the dive we have gone. Once upon a time I would have imported these into Access and done a query to join the two tables, calculate a time for each data point and merge fields to get description. These days I'm doing everything in PowerShell. So... Step one, get an array of objects representing the dives - it's one line Import-CSV | Select object {properties}  but broken up for easy viewing

 $Dives=(import-csv "MyDives.csv" | select-object -property 
            UniqueDiveID ,
            @{Name="DateTime"; expression={[DateTime]::ParseExact(($_.DiveDate + $_.TimeOfDay),
                       "dd/MM/yyyyHH:mm",[System.Globalization.CultureInfo]::InvariantCulture)}} , 
            SampleRate ,  
            @{Name="Description"; expression={$_.Site +"," + $_.Location + ": "+ $_.WaterTemp + "°C"}} 
        )

Import-CSV | Select object is easy stuff. No it's not a typo, in Format-table the calculated fields have a LABEL and in Select-Object they have a NAME. One calculated field merges the date and time, and forces it into a DateTime type (which also sorts out the issues of US formatted and sensible formatted dates). The other calculated fields give me a description like "Horsea Island, Portsmouth: 4°C" or  "Black Coral Forest,Turks and Caicos: 28°C" . All very neat, but not wildly clever, step two is to bring in the profile information and for each data point, work out the absolute time, and return the depth and description.  Again in ONE Import-CSV | Select object construction: although I've put a where in to reduce the number of dives I'm going to process. Here it is , again formatted to help readability

 $profileInfo (import-csv "Profiles.csv" | where-object {[int]$_.uniqueDiveID -gt 150} | select-object `
              @{name="DateTime"; expression= 
                     {$global:DiveID=$_.UniqueDiveID
                      $global:dive = ($dives | where-object {$_.UniqueDiveID -eq $global:DiveID})
                      $global:dive.DateTime.AddSeconds([int]$_.SegmentNumber * [int]$global:dive.SampleRate )}} , 
              SegmentDepth , 
              @{name="Description"; expression={$_.SegmentDepth +"M - " +$global:Dive.description}}
             ) 

WHOA ! what on earth is going here. Lets pull the calculated fields apart - they're where the clever stuff happens.  

  • I've got some variables which I've made global (that deals with any problems of scope which I had in the Choose-process code above).  
  • The first field is named DateTime. The code for it says, Get the DIVE which joins this profile entry on the Unique Dive ID field (hold onto it, we'll need it again). Return a new date time which is 
    Dive.Starttime + (Profile.SegmentNumber * Dive.SampleRate), Incidentally multiplying 2 strings gives an interesting result, hence I force them to be integers.
  • Then I return the depth for that sample, both alone  and combined with the description for the dive, which we found in the other calculated field

So now I have

DateTime SegmentDepth Description
09/11/2007 10:17:20 5.8 5.8M - Black Coral Forest,Turks and Caicos: 28°C
09/11/2007 10:17:40 12.8 12.8M - Black Coral Forest,Turks and Caicos: 28°C
09/11/2007 10:18:00 20.7 20.7M - Black Coral Forest,Turks and Caicos: 28°C
09/11/2007 10:18:20 25.6 25.6M - Black Coral Forest,Turks and Caicos: 28°C

 

One more calculated field gives gets me to record for the photo I'm looking at. Of course the time which links the photo to this record isn't an exact match - and the clock on my camera was on the wrong time zone and 3 minutes 28' fast (I took a picture of it so I could work this out later), but I can use the same "Fuzzy match" method I used in the post about GPS "| Sort-on (Absolute Difference between GPS time and photo time) | select the first one"   It's a another calculated field - this time a in a Sort-object .

$profileInfo | sort -Property @{ Expression={[system.math]::abs(($_.datetime - $myphoto.DateTimeTaken.AddSeconds(-(28+(60*3)+(3600*6)))).totalMilliseconds)  }}
| select-object -First 1

So in the case of the Shark picture what I get is

DateTime SegmentDepth Description
07/11/2007 10:35:20 18.9 18.9M - Rock and Roll,Turks and Caicos: 27°C

 All set and ready to set the GPS altitude field and description fields in the photo. I've said it before, and this won't be the last time I say it . The efficiency of doing this in powershell amazes me. Here's the pseudo code tagging a whole holiday's photos.

Add headers to Dives.CSV
Add headers to Profiles.csv
$Dives=[as above]
$ProfileInfo=[As above]
get-childItem *.JPG | forEach {
  $myPicture = Get-Picture ($_.name)
  $DiveData=$profileInfo | sort [as above] | selectObjectFirst
  $MyPicture.Description=$DiveData.description
  $MyPicture.GPSAltitude=-($DiveData.Segment Depth)
  $MyPicture.Save }

For the next trip I'll have the GPS unit with me and I'll probably at latitude / longitude with depth as "negative altitude".

Technorati tags: Microsoft, Powershell

*Foot note. I prefer the older version of this software, but I think the feature I'm using is unchanged in the new version