This blog post covers how you can set up incident resolution service level agreements (SLAs) in Service Manager 2010. There are some things that we still need to add support for in this area, but this post will at least explain what you can do with SCSM today. I’ll also point out the gaps we need to fill, and later we’ll announce when and how we are going to fill them.
To begin with, let’s talk about what you can do in SCSM 2010. Service Manager out of the box has support for SLAs based on the target resolution time. Another common SLA metric is target response (“acknowledgement”) time. We don’t have support for that out of the box right now.
Target resolution time is determined by the incident priority. If you go to Administration\Settings\Incident Settings you will find this dialog:
For each priority level (1-9) you can define a different Target Resolution Time. An incident’s resolve-by time is then calculated as Time Created + the Target Resolution Time configured for the incident’s current priority. For example, if priority 3 is configured with a 4 hour target and I create a priority 3 incident at 8:00 in the morning, I have until 12:00 noon to resolve that incident. If the incident status changes to Resolved prior to 12:00, then I have met the SLA.
The Target Resolution Time is always displayed at the top of the incident form as the “Resolve By:” field.
This incident is Priority = 4 and per the matrix above has a target resolution time of Time Created (9/21/2010 12:59 PM) + 4 hours = 9/21/2010 4:59 PM.
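The arithmetic behind the “Resolve By:” value can be sketched in a few lines of Python. This is only an illustration, not SCSM code: the function names and the `TARGET_RESOLUTION` table are hypothetical, and the only entry filled in is the 4 hour target for priority 4 from the example above.

```python
from datetime import datetime, timedelta

# Hypothetical stand-in for the Incident Settings dialog: priority -> target duration.
# Only the priority 4 value is taken from the example above.
TARGET_RESOLUTION = {4: timedelta(hours=4)}


def resolve_by(time_created: datetime, priority: int) -> datetime:
    """Return Time Created plus the target duration configured for the priority."""
    return time_created + TARGET_RESOLUTION[priority]


def sla_met(time_created: datetime, priority: int, resolved_at: datetime) -> bool:
    """The SLA is met when the incident is resolved prior to the resolve-by time."""
    return resolved_at < resolve_by(time_created, priority)


created = datetime(2010, 9, 21, 12, 59)
deadline = resolve_by(created, 4)
print(deadline)  # 2010-09-21 16:59:00
```

The clock in this sketch runs around the clock, so a deadline can land outside working hours.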
The priority value is not directly settable in the UI because it is a function of the Impact and Urgency values. In the example above, Impact = Low and Urgency = Medium, and that combination is configured to have a priority of 4.
You can also add additional Impact and Urgency items in the Library\Lists view and then you can work with a larger matrix:
Note: changes to either these settings or the target resolution time settings above will not take effect until after you close and restart the console, and they will not be applied retroactively to existing incidents. If an incident’s urgency or impact value changes later, the workflow that evaluates its target resolution time will run again and use the updated settings.
Assuming we go with the default configuration of Urgency: High, Medium, Low and Impact: High, Medium, Low at this point we have established the following pattern:
| Urgency | Impact | Priority | Target Resolution Time |
That alone might be good enough for some customers, but a lot of people want to map different SLAs for different customers, different classifications of incidents, different services, different affected configuration items, etc. First, let’s work this out on “paper” for a situation where we want to have different SLAs depending on how important a user is in an organization (from an IT guy’s perspective 🙂 ).
| Scenario | Urgency | Impact | Priority | Target Resolution Time |
|---|---|---|---|---|
| Affected User’s title is ‘CEO’ | High | High | 1 | 30 minutes |
| Affected User’s title contains ‘IT’ | Medium | High | 2 | 2 hours |
| Affected User’s title contains ‘Manager’ | High | Medium | 3 | 4 hours |
| Affected User’s title contains ‘HR’ | Low | High | 4 | 1 day |
| Affected User’s title contains ‘Engineer’ | Medium | Medium | 5 | 2 days |
| Affected User’s title contains ‘Senior’ | High | Low | 6 | 7 days |
| Affected User’s title is ‘Janitor’ | Low | Medium | 7 | 2 weeks |
| Affected User’s title contains ‘Marketing’ | Medium | Low | 8 | 4 weeks |
| Affected User’s title is ‘the guy with the stapler in the basement’ 🙂 | Low | Low | 9 | 52 weeks |
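To sanity check a map like this before configuring anything, it can help to write it down as data. The sketch below is plain Python, not an SCSM API. The values come from the table above, while the names (`TITLE_RULES`, `classify`, `sla_for_title`), the first-match-wins ordering, and the case-sensitive matching are all assumptions made for illustration.

```python
from datetime import timedelta

# (urgency, impact) -> priority, copied from the table above.
PRIORITY = {
    ("High", "High"): 1, ("Medium", "High"): 2, ("High", "Medium"): 3,
    ("Low", "High"): 4, ("Medium", "Medium"): 5, ("High", "Low"): 6,
    ("Low", "Medium"): 7, ("Medium", "Low"): 8, ("Low", "Low"): 9,
}

# priority -> target resolution time, copied from the table above.
TARGET = {
    1: timedelta(minutes=30), 2: timedelta(hours=2), 3: timedelta(hours=4),
    4: timedelta(days=1), 5: timedelta(days=2), 6: timedelta(days=7),
    7: timedelta(weeks=2), 8: timedelta(weeks=4), 9: timedelta(weeks=52),
}

# (match kind, text, urgency, impact). Assumption: rules are checked top to
# bottom and the first match wins; matching is case-sensitive.
TITLE_RULES = [
    ("is", "CEO", "High", "High"),
    ("contains", "IT", "Medium", "High"),
    ("contains", "Manager", "High", "Medium"),
    ("contains", "HR", "Low", "High"),
    ("contains", "Engineer", "Medium", "Medium"),
    ("contains", "Senior", "High", "Low"),
    ("is", "Janitor", "Low", "Medium"),
    ("contains", "Marketing", "Medium", "Low"),
    ("is", "the guy with the stapler in the basement", "Low", "Low"),
]


def classify(title):
    """Return (urgency, impact) for the first rule matching the title, else None."""
    for kind, text, urgency, impact in TITLE_RULES:
        matched = (title == text) if kind == "is" else (text in title)
        if matched:
            return urgency, impact
    return None


def sla_for_title(title):
    """Return (priority, target resolution time) for a title, or None if no rule matches."""
    pair = classify(title)
    if pair is None:
        return None
    priority = PRIORITY[pair]
    return priority, TARGET[priority]
```

Writing the rules out this way exposes overlaps early. An ‘IT Manager’ matches two rows, for instance, so you have to decide which one should win.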
You can do the same kind of thing for other types of schemes including mixing and matching criteria using OR/AND statements:
| Scenario | Urgency | Impact | Priority | Target Resolution Time |
|---|---|---|---|---|
| Incident Classification = ‘Network Outage’ | High | High | 1 | 30 minutes |
| Incident Classification = ‘HR App Down’ AND Affected User Title contains ‘Manager’ | Medium | High | 2 | 2 hours |
| Incident Classification = ‘Finance App Down’ | High | Medium | 3 | 4 hours |
| Incident Classification = ‘Printer Down’ OR ‘Printer Out of Paper’ OR ‘Network Slow’ | Low | High | 4 | 1 day |
| Incident Classification = ‘Disk Space Low’ | Medium | Medium | 5 | 2 days |
| Incident Classification = ‘Disk Space Low’ AND Support Group = ‘Test Environment Support Team’ | High | Low | 6 | 7 days |
| Incident Classification = ‘Other’ | Low | Medium | 7 | 2 weeks |
| Incident Classification = ‘Maintenance’ | Medium | Low | 8 | 4 weeks |
| Incident Classification = ‘Games’ | Low | Low | 9 | 52 weeks |
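Mixed AND/OR criteria like these can be modeled as small predicates over an incident. The following is a minimal sketch covering a few rows of the table. The `Incident` class, its field names, and the `RULES` list are hypothetical, and the first-match-wins evaluation is an assumption rather than a description of how SCSM evaluates criteria.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    # Hypothetical, simplified view of an incident for illustration only.
    classification: str
    affected_user_title: str = ""
    support_group: str = ""


# Each rule pairs a predicate with the (urgency, impact) it should produce.
# Assumption: rules are evaluated in order and the first match wins, so the
# more specific 'Disk Space Low' rule is listed before the general one.
RULES = [
    (lambda i: i.classification == "Network Outage", ("High", "High")),
    (lambda i: i.classification == "HR App Down"
        and "Manager" in i.affected_user_title, ("Medium", "High")),
    (lambda i: i.classification in {"Printer Down", "Printer Out of Paper",
                                    "Network Slow"}, ("Low", "High")),
    (lambda i: i.classification == "Disk Space Low"
        and i.support_group == "Test Environment Support Team", ("High", "Low")),
    (lambda i: i.classification == "Disk Space Low", ("Medium", "Medium")),
]


def urgency_and_impact(incident):
    """Return the (urgency, impact) pair of the first matching rule, else None."""
    for matches, pair in RULES:
        if matches(incident):
            return pair
    return None
```

Note the ordering of the two ‘Disk Space Low’ rules. With first-match-wins, the general rule would shadow the one that also checks the support group if it came first.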
It’s really up to you how you want to classify these things, but in the end (at least for SCSM 2010) you have to map these all down to a certain pair of Urgency and Impact values which in turn drives Priority which in turn drives the Target Resolution Time.
Now the question is “How do you implement this map that you have created on paper?” There are a few different ways:
- When an incident is created by one of the connectors, apply a template that is appropriate for the type of issue. The template should set the Impact and Urgency, along with any other properties and relationships needed to route and classify the incident. Examples where you can do this are:
  - SCOM Alert –> Incident scenario
  - SCCM Desired Configuration Management –> Incident scenario
  - Upcoming Exchange connector
- Use the Incident Event Workflow to apply a template that updates the Urgency and Impact as incidents are created or updated. For example, if an incident changes from Classification = ‘Network Slow’ to ‘Network Down’, you would want to apply a template which sets the Urgency = High and the Impact = High. Another example: whenever a new incident is created, regardless of source and regardless of what the initial Urgency and Impact values are, if the Affected User is the CEO then apply a template which changes the Impact and Urgency to High.
- When analysts create a new incident in the console, have them apply a template that populates the incident, setting multiple properties and relationships such as the affected service, classification, and Impact/Urgency values at the same time.
A couple of important notes:
- There is a workflow in SCSM out of the box that changes the Priority and Target Resolution Time property values each time there is a change in either Urgency or Impact. So even if the incident is updated by a template applied in a workflow, the Priority and Target Resolution Time will be updated within a few seconds to match the new Urgency and Impact values.
- The data is sent to the data warehouse on a schedule. Changes to Urgency, Impact, Priority, and Target Resolution Time will not be reflected in reports until the data has had a chance to go through the Extract, Transform, and Load process, which takes about an hour or so on average.
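The effect of that recalculation workflow can be sketched as a small function. This is a conceptual model only, not the workflow’s actual implementation: `recalculate` and the dictionary keys are hypothetical, and the two sample entries in `PRIORITY` and `TARGET` are taken from the paper map earlier in this post.

```python
from datetime import datetime, timedelta

# Sample values from the paper map in this post, not SCSM defaults.
PRIORITY = {("High", "High"): 1, ("Low", "Low"): 9}
TARGET = {1: timedelta(minutes=30), 9: timedelta(weeks=52)}


def recalculate(incident: dict) -> dict:
    """Return a copy of the incident with priority and resolve_by recomputed.

    The deadline is measured from Time Created, not from the time of the change.
    """
    priority = PRIORITY[(incident["urgency"], incident["impact"])]
    return {
        **incident,
        "priority": priority,
        "resolve_by": incident["time_created"] + TARGET[priority],
    }


incident = recalculate(
    {"time_created": datetime(2010, 9, 21, 8, 0), "urgency": "Low", "impact": "Low"}
)
# A template later raises Urgency and Impact; the workflow recalculates.
escalated = recalculate({**incident, "urgency": "High", "impact": "High"})
print(escalated["resolve_by"])  # 2010-09-21 08:30:00
```

Because the target is always added to Time Created, escalating an old incident can leave it past its resolve-by time immediately. Keep that in mind when templates change Urgency and Impact late in an incident’s life.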
Now, let’s point out some of the limitations currently in SCSM 2010:
- Target resolution times assume a service desk that is operating 24×7. There is no way to override this out of the box, but Patrik and I may provide a customized solution to this at some point.
- There is no way to add additional priorities beyond 1-9.
- There is currently no way to subscribe to incidents in the Incident Event Workflow with criteria that traverse relationships where the max cardinality is > 1. This includes relationship types such as affected Configuration Items and affected Services.
- There is no out of the box way to send a notification or apply a template when an incident has exceeded its target resolution time or is about to exceed it. You can use the Incident SLA Management CodePlex solution to do this, though.
- There isn’t a Response Time SLA capability out of the box, although it is possible to add support for that as described in this blog post.
- There is no way to capture the SLA document itself in SCSM. It would be really easy to create a new class called ‘SLA Agreement’ though. You could define some simple properties and relationships on this class and attach documents to it through the Related Items tab.
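To make the “exceeded or about to exceed” gap concrete, here is a minimal sketch of the check such a solution would need to perform. It is not the CodePlex solution’s code. The function name, the status labels, and the one hour warning window are all hypothetical.

```python
from datetime import datetime, timedelta


def sla_status(resolve_by: datetime, now: datetime,
               warning_window: timedelta = timedelta(hours=1)) -> str:
    """Classify an unresolved incident against its resolve-by time.

    Returns "breached" once the resolve-by time has been reached,
    "warning" when it falls within the warning window, and "ok" otherwise.
    """
    if now >= resolve_by:
        return "breached"
    if resolve_by - now <= warning_window:
        return "warning"
    return "ok"


deadline = datetime(2010, 9, 21, 16, 59)
print(sla_status(deadline, datetime(2010, 9, 21, 16, 30)))  # warning
```

A scheduled job could run a check like this over open incidents and then send a notification or apply a template based on the result.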
We are working on addressing these issues as soon as possible, but in the meantime you can start to map your SLAs to SCSM using the approach described above for target resolution times.
Another thing you can look into is a solution provided by our partner Cased Dimensions that provides Service Level Management. Check out the Cased Dimensions demo video.
Hope that helps clear things up!
Please leave any helpful comments or suggestions in the comments below so we can factor those in for future improvements.