Service Management: Do I need to stop the clock on Incidents?

One question I get asked all the time is: can we stop the time for an SLA if the delay is outside our control?

Situation

Let’s look at this from two common scenarios:

Vendor scenario: An incident has been raised and assigned to a vendor. The vendor is not getting back, and therefore the incident has breached the SLA. This results in the scorecard showing red for this incident and impacts the overall SLA target for incidents.

Customer scenario: A customer has raised an incident. IT needs more information and therefore asks the customer for it. The customer does not get back to IT, and therefore the incident will breach the SLA.

Problem

Why is it a problem that the Service Desk wants to stop the time? In most cases IT wants to stop the time because they want to exclude external factors which they do not control. What they are trying to accomplish is “exception reporting”.

In the Vendor scenario, the business will tell IT to manage their subcontractors through the Underpinning Contract (UC). The time allotted to resolve an incident should encompass the targets that the subcontractor sets; this is a situation where you want to under promise and over deliver. For example, if the target for the subcontractor is to respond to or resolve incidents in 2 hours, the overall SLA should take into account the vendor response time as well as your internal IT’s time for incident resolution. If your SLA is breached due to the Vendor, then it is time to review the UC and set better expectations on what your organization needs. The SLA documents what is expected from the Customer’s point of view, and the UC is about managing “the chain of expectation”.
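The arithmetic behind that advice can be sketched in a few lines of Python. This is only an illustration: the function name `sla_headroom` is made up for this post, the 2-hour vendor target comes from the example above, and the 4-hour SLA and 1-hour internal handling time are assumed figures.

```python
from datetime import timedelta


def sla_headroom(sla_target: timedelta,
                 uc_target: timedelta,
                 internal_time: timedelta) -> timedelta:
    """Slack left in the customer-facing SLA once the vendor's UC target
    and internal IT handling time are both accounted for.

    A negative result means the SLA promises more than the chain of
    expectation can deliver.
    """
    return sla_target - (uc_target + internal_time)


# Vendor target of 2 hours is from the example; the other figures are assumed.
headroom = sla_headroom(
    sla_target=timedelta(hours=4),
    uc_target=timedelta(hours=2),
    internal_time=timedelta(hours=1),
)
print(headroom)  # 1:00:00
```

If the headroom is zero or negative, the SLA target and the UC target do not fit together, and one of the two needs to be renegotiated.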

Customer Scenario: What is different in this scenario is that there are many internal groups involved with the delivery of IT Services. The Service Desk owns the incident in the eyes of the Customer. When the Service Desk refers an incident to another IT Support group, they are often caught in a hard place when the expected resolution of the service is not met. The SLA is owned by all of IT, but in many cases when incidents are transferred out of the Service Desk and not resolved in time, the Service Desk wears the blame. SLAs in an all-internal IT organization still need to set expectations on both sides and within all of IT. SLAs should take into account that incidents will have to leave the Service Desk for resolution and that all of IT owns the SLA target with the Customer. You must ensure that the SLA discussions are realistic and make both IT and the Business happy.

Solutions

Customer Scenario: Keep in mind that delivering IT Services to the business is not an ice hockey game played on effective time, where the clock only counts when the game is on. There is an associated business cost in having people wait for Customers to get back to them to resolve their incidents. There should be some IT policies defined to aid IT in getting timely feedback from Customers, so that incidents can’t be held indefinitely. In one organization I worked at, we would call the Customer back and reduce the priority of their incident if they were unable to get back within the day. Also, if an incident is on hold for more than XX days, you can resolve the incident and notify the Customer that, if the issue is still outstanding upon their return, they can reactivate it. Set expectations for both the business users of IT and IT itself on what constitutes a critical incident and what is expected from each party in resolving it. If Customers not getting back regarding Incidents is prevalent in your organization, you need to address the policies for Incidents and level set the expectations for the Customer and IT.
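A policy like this is simple enough to write down as a rule. The sketch below is a hypothetical illustration, not taken from any service management tool: the `Incident` fields, the 1 to 4 priority scale and the status names are all assumptions, and `max_hold_days` stands in for the XX days your organization agrees on. Note that the SLA clock is never paused; the policy only changes priority and status.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta

LOWEST_PRIORITY = 4  # assumption: priorities run from 1 (highest) to 4 (lowest)


@dataclass(frozen=True)
class Incident:
    priority: int
    on_hold_since: datetime        # when IT asked the Customer for information
    status: str = "on_hold"
    priority_lowered: bool = False


def apply_hold_policy(incident: Incident, now: datetime,
                      max_hold_days: int) -> Incident:
    """Return the incident as it should look after applying the hold policy."""
    if incident.status != "on_hold":
        return incident
    waited = now - incident.on_hold_since
    if waited > timedelta(days=max_hold_days):
        # Resolve it; the Customer can reactivate the incident on their return.
        return replace(incident, status="resolved_no_response")
    if waited > timedelta(days=1) and not incident.priority_lowered:
        # No answer within the day: lower the priority one step, once.
        lowered = min(incident.priority + 1, LOWEST_PRIORITY)
        return replace(incident, priority=lowered, priority_lowered=True)
    return incident


# Illustrative run; 10 days is a made-up value for max_hold_days.
asked = datetime(2024, 1, 1, 9, 0)
incident = Incident(priority=2, on_hold_since=asked)
after_two_days = apply_hold_policy(incident, asked + timedelta(days=2), 10)
after_eleven_days = apply_hold_policy(incident, asked + timedelta(days=11), 10)
```

Here `after_two_days` carries the lowered priority and `after_eleven_days` is resolved pending reactivation by the Customer.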

Vendor Scenario: The Business Customer’s perception does not change because the clock was stopped on an incident. The idea of stopping the clock is hurtful to IT, as it masks the real issue: the behaviors between IT and the Business. Incidents are critical, service-impacting (i.e. money-costing) issues. If IT services are down and the appropriate people are not available to come to a timely resolution, this is an issue with the priority of the reported incident. Down IT services impact the delivery of service to the business and their Customers. Long-running incidents impact a Business in serving its Customers; therefore the focus should be on all parties working together to resolve the Incident.

Being unable to get the reporting Customer to work with a Vendor to solve an incident is a business problem. It’s about establishing communication and ensuring that the right KPIs are captured and measured. For example, a report showing the number of resolved tickets where the closure code or status was on hold because the Customer needed to contact IT is a better way to demonstrate the actual cause of Incidents exceeding SLAs.
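Such a report does not need much. As a rough sketch, assuming tickets can be exported as records with `status`, `breached_sla` and `closure_code` fields (these field names and the code values are invented for the example), the breached tickets can be counted per closure code:

```python
from collections import Counter


def breach_causes(tickets):
    """Count resolved tickets that exceeded their SLA, grouped by closure code."""
    return Counter(
        t["closure_code"]
        for t in tickets
        if t["status"] == "resolved" and t["breached_sla"]
    )


# Made-up sample data; field names and closure codes are assumptions.
tickets = [
    {"status": "resolved", "breached_sla": True, "closure_code": "on_hold_customer"},
    {"status": "resolved", "breached_sla": True, "closure_code": "on_hold_customer"},
    {"status": "resolved", "breached_sla": True, "closure_code": "vendor_delay"},
    {"status": "resolved", "breached_sla": False, "closure_code": "fixed"},
    {"status": "open", "breached_sla": True, "closure_code": None},
]

for code, count in breach_causes(tickets).most_common():
    print(f"{code}: {count}")
```

The breach still shows up in the SLA numbers, but the report says why it happened, which is the conversation you want to have with the Business.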

Trying to manage and control the behavior of your Customers is very difficult to do. Stopping the clock enables a lack of responsibility and accountability between the Customer reporting the Incident and the Analyst assigned to resolve it. If you can’t manage and control Customers getting back to IT, then you need to manage the behavior by reporting on the real issue. That is the main argument against stopping the clock on Incidents: you are only enabling the Business to not respond to IT in a timely manner. A Service Level Agreement is an agreement between IT and the Business, and building one is about setting and managing the rules of engagement and the expectations of both parties.

Conclusion

Instead of focusing on stopping the clock, go out and set expectations with the business, and focus on how you can better identify and drive “real” improvements and KPIs.

Sharing false reports on what is really happening between IT and the Business leaves no opportunity to improve the relationship. Stopping the clock only enables the Business not to respond to IT in a timely manner, and it does not produce a report on the actual state of the relationship, so the effort of stopping the clock does not provide “Business” value.

As a last option, in case you are still not convinced, or the IT organization is not ready to have the discussion with the business about why it’s a bad idea to stop the time, I can reference this solution.

 

Hope this gives you something to consider, whether or not you agree with my arguments against stopping the time.

 

Thanks to my ITIL friends Lasse Wilen, Peter Ravnholt and Kathleen Wilson for listening in and for giving feedback on this topic.

