RWPT - BAD TROUBLESHOOTING 101 (part 1 of many) User profile troubleshooting - don't blow them away

One alarming trend I see on customer sites, and saw when I was in PSS, is the common troubleshooting step of blowing away a user profile.  An end user calls the helpdesk because they are having trouble logging on: either they get an error stating that a temporary profile was loaded, or logon just takes too long.  Most helpdesk techs or admins take the nice simple path.  They blow away the user profile.  STOP IT.  This is like having a car that doesn't start, so instead of figuring out the problem you break the glass on the dashboard, move the arrow that is pointing to "E," and glue it to the "F."

There are a few common problems with user profile unloading.  The most common is that the user logged off while an application still had an open handle into the profile (file or registry), and that leftover handle is what keeps the user from logging on normally the next time.  I don't have the case numbers, but I would say this is the problem more than 50% of the time as a conservative estimate (my educated guess would be 99%).

The usual event log messages show up in the Application log with a source of UserEnv and event IDs 1524/1517 (Windows XP/2003) or 1000 (Windows 2000).

https://support.microsoft.com/?kbid=837115
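If you have already exported the Application log into some structured form, picking out these events is a simple filter. Here is a minimal sketch in Python. The `EventRecord` type and the `find_profile_unload_events` function are made up for illustration and are not part of any Microsoft tool; the only facts taken from above are the UserEnv source name and the event IDs. Matching on the source matters, because an ID like 1000 is used by plenty of other sources too.

```python
from dataclasses import dataclass

# Event IDs named in this post: 1517/1524 (Windows XP/2003), 1000 (Windows 2000).
PROFILE_UNLOAD_EVENT_IDS = {1517, 1524, 1000}
PROFILE_EVENT_SOURCE = "userenv"  # compared case-insensitively


@dataclass
class EventRecord:
    """Hypothetical shape of one exported Application log entry."""
    source: str
    event_id: int
    message: str


def find_profile_unload_events(records):
    """Return the records that look like profile unload problems.

    A record qualifies only if BOTH the source and the event ID match,
    since the same numeric ID can belong to an unrelated source.
    """
    return [
        r for r in records
        if r.source.lower() == PROFILE_EVENT_SOURCE
        and r.event_id in PROFILE_UNLOAD_EVENT_IDS
    ]


if __name__ == "__main__":
    sample = [
        EventRecord("Userenv", 1517, "registry not unloaded at logoff"),
        EventRecord("Application Error", 1000, "unrelated application crash"),
        EventRecord("Userenv", 1524, "could not unload classes registry file"),
    ]
    for hit in find_profile_unload_events(sample):
        print(hit.event_id, hit.message)
```

If a machine keeps turning up hits like these, that is your cue to go hunting for the process holding the handle, not to delete the profile.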

One of the Escalation Engineers in PSS wrote a tool called User Profile Hive Cleanup or UPHClean to help troubleshoot and often resolve these issues. 

https://www.microsoft.com/downloads/details.aspx?familyid=1B286E6D-8912-4E18-B570-42470E2F3582&displaylang=en

Now, let me get to the point.  Blowing away the profile is BAD NEWS.  The user's private keys are stored there.  The user might have data stored in the profile.  Renaming it and renaming it back is OK, but if your user is set up for user autoenrollment and the certificate template says to publish the certificate to the user's account, then they get another certificate on their account.  User certificates are BY FAR the largest contributor to user-account bloat in Active Directory.  Now, you might say, "I checked the checkbox to have the user's certificate not publish to the directory if a duplicate exists."  That doesn't do you any good for the new profile.  So now you have a chicken-and-egg scenario.  You wouldn't have gotten into the mess in the first place.

Also, how many users like to lose their favorites, IE cache, or cookies?  OK, that might seem minor, but now think about the CEO losing EFS-encrypted data.  So you try to rename the profile and you can't, because something has its filthy mitts on ntuser.dat.  So you reboot into safe mode as local admin and blow that puppy away.  Chances are the reboot alone temporarily mitigated the issue.

You can run UPHClean actively, where the service does its best to resolve these issues on the fly, OR you can run UPHClean in reporting mode, where it writes events to the event log that likely point to the culprit behind the issue.

Next time in BAD TROUBLESHOOTING 101 - I will talk about another bad practice I have seen in the field and in PSS...  Which one?  Whichever one is firing me up that day. 

 This posting is provided "AS IS" with no warranties, and confers no rights.