Sunsetting TMG 2010 with some (free!) Best Practices

Long and boring post ahead. So: KITTENS! There. Fluffy now.

As one of the Premier Field Engineers performing ISA Server Health Checks and then Threat Management Gateway (TMG) configuration reviews (by default, from my long association with Proxy 2.0 and then ISA), I was reviewing a document I put together for a customer just before shredding it, and thought:

You know what? Everyone should do these things! These recommendations are common enough that I seem to make them every time I see a TMG box… so why not generalize and recommend them here? Put them out into the wild. Get them shouted down. Give them their time in the sun.

So on the off-chance you’re a survivor of the TMG Survival Guide and you’re looking for some last-minute as-seen-in-the-real-world TMG corrective advice – and by “last minute”, I mean:

  • You know the base product is in Extended Support until 2020, then it’s going away. (sniff!)
  • You understand that Malware Scanning and Network Inspection System are already frozen at their last update level.
  • You know URL Categorization (Filtering) got turned off already so any rules using it might fail-open (or fail-closed)…

And in terms of pre-migration work:

  • You’ve also been through your rule set, and tested that everything’s Least Privilege-compliant,
    • i.e. No broad “everyone can access anything/TMG/anywhere with any protocol” rules or anything like that.
      • No really, if you can connect to TMG via SMB, that’s usually not a good sign… You’re at least using Windows Update for patches, though, right?
  • Maybe you’ve performed an ISAINFO (and/or TMGBPA) export of your rule set so that you can ease the process of recreating them on the next egress device you pick? :)

…Because these are all fantastic first steps on the long migration path between proxies. If you haven't done them, do put them on the list.

So before you shut down TMG that final time, and repurpose the boxes for Quake servers (or whatever you kids use spare boxes for these days)…

What best practices are available to you in the meantime? Glad you asked!

Here's the short list; the detail follows.

Proactively Protect The Box

  • Install the latest Windows Updates
  • Install the latest TMG Rollup Hotfix (SP2 UR5, potentially + .650 or later)
  • (Install any updates for any other software on the box)

Operating System Protection

  • Firewalling
  • De-Adminning
  • Attack Surface Reduction
  • AV exclusions

TMG Health and Perf

  • Check Tracing isn't enabled
  • Disable/Relax Flood Prevention

And now the details...

Proactively Protect The Box

“It’s a firewall, it doesn’t need patching!” (Just for clarity: that’s not true.)

Install the latest Windows Updates

  • If you’re not installing Windows updates, um, I don’t know what to tell you?

You understand that unpatched vulnerabilities win over security settings, permissions and antivirus, right? Any on-box control is potentially circumvent-able by an unpatched (bad) vuln?

And you’re still thinking it’s optional? Well! That’s nice! I hope you’ve a mitigation strategy in place, and an incident response plan for when that one fails.

TMG defends itself pretty heavily against network attack (a: by default, and b: to an extent; it still leverages OS components for certain chunks of functionality), but lots of people end up creating rules which – paraphrased – allow the Internal network to hit any port on the TMG computer. Because reasons!

This is the same pathology which leads people to not patch their CAs, or not to use firewalling between hosts on their internal network – it’s the opposite of a defence in depth approach!

Anyway, back to updates:

  • When I check the update state of a box, I do so by running MBSACLI (the command-line version of MBSA) using the current WindowsUpdate CAB if the box doesn’t have Internet connectivity.

mbsacli /xmlout /nvc /nd /wi /catalog .\wsusscn2.cab /unicode > %computername%-MBSA.xml

    • I actively avoid using the default customer WSUS catalog, because it’s completely possible to be 100% compliant with the WSUS approval policy and have unapproved updates missing from five years ago, which were skipped for a good reason, but then that decision was never revisited.
  • It is uncommon in my experience to find that servers are up to date. For a security appliance at the edge of the network, used as an ingress or egress point by thousands of clients, this is suboptimal.
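To turn that XML report into a to-do list, a minimal Python sketch along these lines might help. It assumes the report lists each update as an UpdateData element with IsInstalled and KBID attributes and a Title child; that layout is an assumption here, so check it against your own MBSA output first. The function name and the file name in the usage comment are made up for illustration.

```python
import xml.etree.ElementTree as ET


def missing_updates(report_root):
    """Return (KB id, title) pairs for updates the report flags as not installed.

    ASSUMPTION: updates appear as <UpdateData IsInstalled="true|false" KBID="...">
    elements with a <Title> child. Verify against your own MBSA report.
    """
    missing = []
    for update in report_root.iter("UpdateData"):
        if update.get("IsInstalled", "").lower() == "false":
            title = update.findtext("Title", default="(no title)")
            missing.append((update.get("KBID", "?"), title))
    return missing


# Usage (hypothetical file name):
#   root = ET.parse("TMG01-MBSA.xml").getroot()
#   for kb, title in missing_updates(root):
#       print(kb, title)
```

Parsing from the file with ET.parse lets the parser pick up the encoding from the file itself, which matters because the /unicode switch changes how the report is written.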

 

Windows Server 2008 R2 Service Pack 1 is needed for Security Updates

Keep in mind that some updates require the presence of a Service Pack or other major update.

  • So the first thing I’d check is WinVer.
  • If WinVer says you’re on Windows 2008 R2 version 7600 and doesn’t mention a Service Pack, you need to get to 7601 (Service Pack 1) pronto, and then start applying all the updates which have required SP1 - say, the last 4-5 years’ worth, which includes many Critical updates.
  • Windows 2008 should be at SP2. If it’s not at SP2, same thing applies as above.

This, again, is sadly not uncommon.
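If you'd rather script the WinVer check across a few boxes, here's a rough Python sketch. The 7600 and 7601 builds come from the text above; the 6001 and 6002 builds for Windows Server 2008 are added here as the SP1-level and SP2 build numbers. The function name is hypothetical.

```python
# Minimum build per (major, minor) OS version:
#   6.1 = Windows Server 2008 R2: 7600 is RTM, 7601 is Service Pack 1
#   6.0 = Windows Server 2008:    6001 is SP1 level, 6002 is Service Pack 2
REQUIRED_BUILD = {(6, 1): 7601, (6, 0): 6002}


def needs_service_pack(version):
    """Return True if a 'major.minor.build' string is below the required build."""
    major, minor, build = (int(part) for part in version.split(".")[:3])
    required = REQUIRED_BUILD.get((major, minor))
    if required is None:
        raise ValueError("not a Windows Server 2008 / 2008 R2 version: " + version)
    return build < required


# Usage on the box itself (on Windows, platform.version() typically
# returns a string such as "6.1.7601"):
#   import platform
#   print(needs_service_pack(platform.version()))
```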

 

If You Found You Had Something Missing: Why Not Just Use Windows Update?

  • If you find your servers are missing updates because { ¯\_(ツ)_/¯ }, my standard remediation suggestion is: just point them at public WindowsUpdate and specify your schedule. Let them pop out through a proxy, or go direct if they’re edge devices.

Yep. I’m serious. Better a security-sensitive device which is up to date by automatic patching at 3am on a Thursday than one which is out of date at all times by policy.

See also: Least Privilege Rule Set. If an attacker can’t hit the vulnerable port, you don’t have that problem.

 

Install the latest TMG Rollup Hotfix

Now, don’t misunderstand me: TMG isn’t the simplest thing in the universe to update (unlike its predecessor ISA Server, which was a positive dream by comparison). But if you’re reading this, you probably work in IT, so that’s not actually an excuse not to do it! :)

Yes, it’s a pain going from RTM to SP1 to SP1 + U1 to SP2 to SP2 Rollup 5, but… you should do it. You need to do it. If you’re one rollup behind, you’re actually 12-18 months of updates out of date. With hundreds of builds in between. Many issues have been fixed over the years, including hangs, crashes, and possibly a security update or two, if memory serves.

  • The latest rollup version I’m aware of is TMG Service Pack 2 with Update Rollup 5. If Help/About in the TMG MMC shows you a version earlier than 7.0.9193.644, well – that update was from 2014.
  • There’s one post-rollup hotfix I’ve seen (which is for SNI websites with HTTPS inspection enabled, but it provides a version bump to .650 for many core components too) which gets us to April 2015: https://support.microsoft.com/en-us/kb/3058679 .
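If you're collecting Help/About versions from several arrays, a small comparison helper saves squinting. This is only a sketch: the two version numbers are the ones quoted above (with .650 taken as the full build 7.0.9193.650), and the function names and return strings are invented for illustration.

```python
UR5 = (7, 0, 9193, 644)         # SP2 Update Rollup 5, per the text above
UR5_HOTFIX = (7, 0, 9193, 650)  # post-rollup hotfix level, per the text above


def parse_version(text):
    """Turn '7.0.9193.644' into (7, 0, 9193, 644) so it compares numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def patch_level(text):
    """Describe where a TMG version string sits relative to UR5 and .650."""
    version = parse_version(text)
    if version < UR5:
        return "behind SP2 Update Rollup 5"
    if version < UR5_HOTFIX:
        return "SP2 UR5, without the .650 hotfix"
    return "SP2 UR5 plus .650 or later"
```

Comparing tuples of integers rather than raw strings avoids the trap where a build like .1000 would sort before .644 as text.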

 

Operating System Protection

Lifecycle and post-Lifecycle Firewalling

In April 2020, TMG exits Extended Support and is no more.

But by a quirk of the Support Lifecycle, Windows Server 2008 (and R2) actually exits Extended Support in January 2020, so a TMG box running down the clock will potentially be partially unprotected from an OS security updates perspective between January and April. (Unless a Custom Support Agreement is available, but it’s probably more costly than the alternative). So it’s not a terrible assumption that you’ve basically got until Dec 31, 2019 to get everything sorted out.

  • I don’t mind restating the obvious, so I will: You should have migrated away from TMG before the end of 2019. Please!
    • That’s still 3 years from now to plan and execute your migration
    • So if you haven’t already started, please add it to your “To Do: 2017” list now.
  • If you do still have some TMG kicking around at that point, consider hardening the TMG Firewall policies (including the System policies) to limit all nonessential connectivity to the TMG hosts by any other computer.
    • In fact, think about doing that anyway, particularly if you actually had work items pending from the “Install Windows Updates” item above. Because that's an attack surface exposure compounded with known vulnerabilities. That's a poor combination for a security device.

If you’re planning to run beyond the end of support, don’t!

But if you do find yourself there: also think about defence in depth approaches. The sort you’d want to take with a Windows 2000 machine on your network if some business unit decided it needed to be added this year: isolate, put external firewalls in front of and behind it, so you seriously limit the ingress and egress paths available to it in case of compromise. Yes, TMG’s a firewall, but trusting {the actions of an on-box firewall which isn’t receiving security updates any more (in 2020)} on {an operating system which also isn’t receiving security updates any more} seems like it’s a bad bet compared to an external security device which is presumably still getting updates. Yah?

 

De-Adminning

  • Just check the membership of any groups who have Admin permission to the box.
  • Then eliminate any local admins except one (if you don’t fully de-admin boxes), and remove any Domain groups you can.

Then reset the remaining Admin password to a unique value, unless you’re sure (I mean certain, i.e. you've checked, not “I assume it’s quite unlikely”) that a) there's only one local Admin account, and b) the password for that account is already unique and not known to anyone unauthorized. (You can skip the reset if you’re already a LAPS shop, or use other password management tools… but please, check whether TMG’s part of the LAPS group, don’t just assume it is… that’s how SUS patching doesn’t work too!)
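For the "just check the membership" step, one option is to wrap the built-in net localgroup command and pull out the member names. The sketch below assumes the English-language output layout (a row of dashes, one member per line, then "The command completed successfully."); the function names are hypothetical, and CONTOSO in the comment is a placeholder domain.

```python
import subprocess


def parse_localgroup_members(output):
    """Extract member names from 'net localgroup <group>' output.

    ASSUMPTION: English output, where members sit between the dashed
    separator line and the 'The command completed...' line.
    """
    members = []
    in_list = False
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("-----"):
            in_list = True
            continue
        if in_list:
            if not stripped or stripped.startswith("The command completed"):
                break
            members.append(stripped)
    return members


def local_admins():
    """Run 'net localgroup Administrators' on this box and return its members."""
    result = subprocess.run(
        ["net", "localgroup", "Administrators"],
        capture_output=True, text=True, check=True,
    )
    return parse_localgroup_members(result.stdout)


# Usage: print(local_admins())
# e.g. a result like ['Administrator', 'CONTOSO\\Domain Admins'] means there is
# a domain group to review.
```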

 

Basic Attack Surface Reduction

Most TMG boxes seem to have management agents for something or another installed on them. Actually, as a related observation, it's not uncommon for me to find servers with multiple management agents for multiple generations of monitoring systems on them. Often disused ones. These are pure attack surface additions, and often running with privileged access levels. Very often with known vulnerabilities.

In short: Either kill ‘em, or at least make sure they can’t be contacted over the network (using Firewall policy).

  • If you have looked at them in the last 6 months, you can be excused from this item.
  • If not, check to see what the file dates of the EXEs are. If they’re over 3 years old, they’re probably a liability and almost certainly aren’t being updated, and simply represent an increased attack surface, so consider removing them.
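The file-date check is easy to script. Here's a minimal Python sketch that walks a folder and lists any EXE whose last-modified time is more than three years old; the three-year threshold is the rule of thumb above, and the function name and example path are hypothetical.

```python
import os
import time

THREE_YEARS = 3 * 365 * 24 * 3600  # rule-of-thumb threshold, in seconds


def stale_executables(root, max_age_seconds=THREE_YEARS, now=None):
    """Return sorted paths of .exe files under root older than the threshold."""
    now = time.time() if now is None else now
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".exe"):
                path = os.path.join(dirpath, name)
                if now - os.path.getmtime(path) > max_age_seconds:
                    stale.append(path)
    return sorted(stale)


# Usage (example path only):
#   for path in stale_executables(r"C:\Program Files"):
#       print(path)
```

Modified time is only a hint (installers sometimes preserve original build dates), so treat the output as a list of things to look at rather than a list of things to delete.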

 

Antivirus

Observe the exclusions needed for Antivirus when running on a TMG host. If you don’t exclude the right stuff, it can get a bit jammed up.

 

TMG Health

Tracing?

This one’s much less common than the above few.

  • Run RESMON for a short while, look at the Disk IO area, and sort by Bytes Total/sec. Note any files which have lots of IO over a 2 minute period.
    • (The idea is to try to minimize IO where possible)
  • If activity to ISALOG.BIN is chewing through a megabyte or more per second, TMG may be tracing something
    • or still tracing something - this has been seen when TMGBPA is used to run a diagnostic trace but for whatever reason it doesn’t terminate cleanly.
    • It might also indicate a diagnostic logging session is in progress (just check the console under Troubleshooting –> Diagnostic logging and hit Disable if it isn’t already disabled).

Note that in my experience, some minimal isalog.bin activity (say under 64K/sec) is normal.

If TMG does turn out to be tracing, run the ISA Data Packager again, open the tracing options, and untick everything.
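As a rough way to apply those two thresholds to whatever you read off RESMON, here's a small Python sketch. The 64K/sec and 1 MB/sec figures come from the text above; the function name, the labels, and the "unclear" middle band are illustrative assumptions.

```python
NORMAL_MAX = 64 * 1024      # below this is normal background activity
TRACING_MIN = 1024 * 1024   # at or above this, suspect a trace is running


def classify_isalog_rate(total_bytes, seconds):
    """Classify ISALOG.BIN write activity observed over a sampling window."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    rate = total_bytes / seconds
    if rate >= TRACING_MIN:
        return "likely tracing"
    if rate < NORMAL_MAX:
        return "normal"
    return "unclear"


# Usage: 300 MB written over a 2 minute window
#   classify_isalog_rate(300 * 1024 * 1024, 120)
```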

 

Flood Mitigation

Going to say something a bit controversial here: You might want to experiment with turning off or massively increasing the defaults for flood prevention, particularly for outbound scenarios.

The defaults for this feature haven’t changed since it was introduced in 2004, but wow, Internet surfing patterns sure have.

So I say:

  • Try a 10X increase in the numbers, particularly for HTTP and TCP connections, and see how you go.
    • If it stops the constant alerting about “infected clients”, and you’ve got burnout from chasing them down only to find it was Bruce in Marketing manually opening eighteen instances of Firefox to their brand new multi-pronged CDN-driven site, it might be a welcome change, and reduce grumbling (nothing like a paused connection to cause a user to get grumpy about “the )(@$& Proxy”)…
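To keep a record of what you changed, it can help to write down the before and after numbers. A trivial Python sketch follows; the setting names and starting values are placeholders only, so read the real ones from your own flood mitigation settings rather than trusting these.

```python
def scale_limits(limits, factor=10):
    """Return a copy of the limits with every value multiplied by factor."""
    return {name: value * factor for name, value in limits.items()}


# Placeholder names and numbers, NOT authoritative defaults:
current = {
    "tcp_connect_requests_per_minute_per_ip": 600,
    "concurrent_tcp_connections_per_ip": 160,
    "http_requests_per_minute_per_ip": 600,
}

proposed = scale_limits(current)
for name in current:
    print(name, current[name], "->", proposed[name])
```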

 

And that, believe it or not, covers the most common TMG practices I’d suggest. Minimal TMG, maximal patching and defence in depth.

So there you have it. The most common Stuff I’ve seen over the years with TMG. Now go work out how you’re going to migrate egress to something else… (I assume Azure AD App Proxy will take care of the HTTP stuff, and/or Load Balancer Of The Year for the non-http bits…)