Cluster and Stale Computer Accounts

Hi, Mike here again. Today, I want to write about a common administrative task that can lead to disaster: removing stale computer accounts from Active Directory.

Removing stale computer accounts is simply good hygiene-- it’s the brushing and flossing of Active Directory. Like tartar, computer accounts tend to build up until they become a problem (they grow difficult to identify and remove, and they can lengthen backup times).

Oops… my bad

Many environments separate administrative roles. The Active Directory administrator is not the Cluster Administrator. Each role holder performs their duties in a somewhat isolated manner-- the Cluster admins do their thing and the AD admins do theirs. The AD admin cares about removing stale computer accounts. The cluster admin does not… until the AD admin accidentally deletes a computer account associated with a functioning Failover Cluster because it looks like a stale account.

Unexpected deletion of a Cluster Name Object (CNO) or Virtual Computer Object (VCO) is one of the top issues worked by our engineers that support Clustering and High-Availability. Everyone does their job and boom-- clustered servers stop working because the CNOs or VCOs are missing. What to do?

What's wrong here

I'll paraphrase an article posted on the Clustering and High-Availability TechNet blog that addresses this scenario. Typically, domain admins key on two different attributes to determine if a computer account is stale: pwdLastSet and lastLogonTimeStamp. Domains that are not configured to a Windows Server 2003 Domain Functional Level use the pwdLastSet attribute. However, domains configured to a Windows Server 2003 Domain Functional Level or later should use the lastLogonTimeStamp attribute. What you may not know is that a Failover Cluster (CNO and VCO) does not update lastLogonTimeStamp the same way a real computer does.

Cluster updates lastLogonTimeStamp when it brings a clustered network name resource online. Once online, it caches the authentication token. Therefore, a clustered network name resource that has been working in production for months will not have updated lastLogonTimeStamp in all that time. To the AD administrator, this looks like a stale computer account. Being a good citizen, the AD administrator deletes the stale computer account that has not logged on in months. Oops.
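To make the "looks stale" part concrete, here is a minimal Python sketch of how a cleanup script might turn a raw lastLogonTimeStamp value into an age in days. Active Directory stores the attribute as a large integer counting 100-nanosecond intervals since January 1, 1601 (UTC). The function names are my own for illustration; they are not part of any AD tool or library.

```python
from datetime import datetime, timedelta, timezone

# lastLogonTimeStamp is a Windows FILETIME value: the number of
# 100-nanosecond intervals since 1601-01-01 00:00:00 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a raw lastLogonTimeStamp integer to a UTC datetime."""
    # Ten 100-ns intervals make one microsecond.
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def days_since_logon(filetime: int, now: datetime) -> int:
    """Return the whole number of days between the recorded logon and 'now'."""
    return (now - filetime_to_datetime(filetime)).days
```

A network name resource that came online six months ago and has stayed online ever since reports an age of roughly 180 days here, even though it is serving clients right now.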

The Solution

There are a few things that you can do to avoid this situation.

  • Use the servicePrincipalName attribute in addition to the lastLogonTimeStamp attribute when determining stale computer accounts. If any variation of MSClusterVirtualServer appears in this attribute, then leave the computer account alone and consult with the cluster administrator.
  • Encourage the Cluster administrator to use the -CleanupAD switch (for example, Remove-Cluster -CleanupAD) when they destroy a cluster, so that the computer accounts they no longer use are deleted along with it.
  • If you are using Windows Server 2008 R2, then consider implementing the Active Directory Recycle Bin. The concept is identical to the recycle bin for the file system, but for AD objects. ASKDS posts on the AD Recycle Bin can help you evaluate whether it is a good option for your environment.
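The servicePrincipalName check from the first suggestion can be sketched in a few lines of Python. This is only an illustration under some assumptions: the account records are plain dictionaries that you have already pulled from the directory by whatever means you prefer, the field names (name, last_logon, spns) are hypothetical, and the 90-day threshold is an arbitrary example rather than a recommendation.

```python
from datetime import datetime, timedelta, timezone

# Matched case-insensitively so that any variation of the
# MSClusterVirtualServer service class is caught.
CLUSTER_SPN_MARKER = "msclustervirtualserver"


def has_cluster_spn(spns):
    """Return True if any servicePrincipalName value mentions the marker."""
    return any(CLUSTER_SPN_MARKER in spn.lower() for spn in spns)


def classify(account, now, max_age_days=90):
    """Classify one account record (a hypothetical dict, not an AD object).

    max_age_days is an assumed example threshold.
    """
    age = now - account["last_logon"]
    if age <= timedelta(days=max_age_days):
        return "active"
    if has_cluster_spn(account["spns"]):
        # Old timestamp, but the SPN says a cluster owns it: do not delete.
        return "ask the cluster admin"
    return "stale"


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    now = datetime(2010, 6, 1, tzinfo=timezone.utc)
    long_ago = now - timedelta(days=200)
    accounts = [
        {"name": "OLDPC", "last_logon": long_ago, "spns": ["HOST/OLDPC"]},
        {"name": "CLUSTER1", "last_logon": long_ago,
         "spns": ["MSClusterVirtualServer/CLUSTER1", "HOST/CLUSTER1"]},
    ]
    for account in accounts:
        print(account["name"], "->", classify(account, now))
```

The point of the third outcome is that the script never deletes a cluster-owned account on its own. It hands the decision to a person who can confirm with the cluster administrator.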

Mike "Four out of Five AD admins recommend ASKDS" Stephens