Getting to Grips with Problems

First of all, thanks to my colleague John Allder for prompting me on the topic of root cause analysis or, more simply put, getting to grips with problems.

The phrase ‘root cause analysis’ is often used in a general sense to describe the activity of identifying the underlying cause of an incident.  However, the phrase Root Cause Analysis (RCA) is also given to a specific technique that is intended for use in investigating a series of actions or occurrences that lead to an undesired outcome.

Every major problem should be reviewed to learn lessons for the future:
• What was done correctly
• What was done wrong
• What could be done better in future
• How to prevent recurrence
• Whether there has been any third-party responsibility and whether follow-up actions are required

RCA helps to identify not only what happened and how it happened but also why. Only by understanding why will we be able to devise workable corrective measures. For instance, suppose a network technician disconnects a working router rather than a broken one. A typical investigation might conclude that human error was the cause and recommend better training, or that technicians should take more care, but neither of these is likely to prevent future occurrences. RCA assumes that mistakes do not just happen but that they have specific causes, and would ask ‘why?’ In the case of the poor network technician, the RCA analyst might ask ‘was the router properly labelled?’, ‘was the technician told which router was faulty?’, ‘is there a recognised procedure for deciding whether a router is working or not?’, ‘did the technician know what it was?’

Root causes have four characteristics:
1. They are specific causes: ‘human error’, for example, is too general.
2. They are causes that can reasonably be identified: RCA must be cost beneficial so the analyst must know when to stop the investigation.
3. They are within the control of the management of the organisation. The analyst is looking for causes that can be addressed by the organisation. Although adverse weather conditions might very well have triggered the incident, we cannot do anything to affect the weather and so that is not an appropriate root cause. We can of course do something about how we are impacted by adverse weather and perhaps our root cause / resolution might lie there.
4. They can be addressed by specific solutions. A vague recommendation such as ‘ensure that technicians follow defined procedures’ is wholly inadequate and probably means that more thought needs to be given to identifying the specific cause.

RCA is a specific discipline. It follows four distinct phases:

• Data Collection
• Charting
• Root Cause Identification
• The Development of Recommendations

Carried out properly, Root Cause Analysis will ensure that an organisation learns all of the lessons from a major disruption to service and reduces the risk of future failures. It will help staff to identify ways not only of reducing the likelihood of future disruption, but also of limiting the impact of any disruption that does occur.

http://www.sysop.co.uk

Five Steps that can help you to achieve success with ITIL adoption

I wrote a couple of weeks ago about the panel debate hosted by Michelle Major-Goldsmith at the Service Desk Show at Earls Court. The panel, made up of Paul Wilkinson (Gaming Works), Kevin Holland (UK Public Sector) and Stephen Mann (Forrester), debated the success (or otherwise) of ITIL adoption in IT organisations.

Stephen summarised his key messages as “Five Steps that can help you to achieve success with ITIL adoption”. They are very pertinent and worthy of being repeated here.

Step No. 1: Be clear on what ITIL is all about, especially the importance of people. Ensure that as well as thinking about process and tools you plan how you will manage the cultural and organisational change issues. Ignore these at your peril! Everybody needs to know about VOCR (Value, Outcomes, Costs & Risks).

Step No. 2: Be realistic about existing ITSM process maturity and improve it gradually. Establish a baseline and use the CSI model to help you keep your thinking on track.
• What do we want to achieve? (Our Vision)
• Where are we today? (Our Baseline)
• What does success look like? (CSFs and KPIs)
• How will we get there? (Our Project Plan)
• Did we get there? (Our measurements against the baseline)
Trying to implement too many processes at once is like doing two jobs badly rather than one well. Remember the quick wins and look at the user-facing processes too. If you can achieve success there it is very visible and it promotes a good vibe.

Step No. 3: Evaluate technology only after you’ve addressed goals, people, and processes. Remember ‘a fool with a tool is still a fool’. The fanciest looking service management tool in the world won’t help you if you don’t have people on side and process, roles and responsibilities mapped out. Ensure a holistic approach. Use the 5 P’s: People, Process, Product, and Partners, aligned to achieving the 5th P, ‘Performance’ (VOCR).

Step No. 4: Consider the overall vision including short, medium, and long term goals. You need to be in it for the long haul. Remember service improvement should never stop! Continual Service Improvement starts at the beginning of your endeavours and not at the end, despite what it might look like in the ITIL Lifecycle diagram.

Step No. 5: Regularly communicate the value of ITIL and involve the IT and non-IT stakeholders. Measure your success and compare with your baseline. Reward staff and keep on reminding your customer about how success in IT is translated to success in terms of business productivity. Keep talking to them and think about OUTCOMES!
Finally – Turn knowledge into results:
The panel concluded that the delegates at the session probably had the knowledge to make ITIL adoption work, but that they often said they were short of time.

This is an excuse every IT organisation uses at some time or other. It isn’t a question of time; it is a question of priority. Think about VOCR and set your priorities accordingly. If you don’t have time to do justice to an ITIL project … don’t start it.

If time, focus and priority are the issues for your organisation, then of course help exists through the various service management consultancies; I shamelessly plug my own! http://www.sysop.co.uk/professional-services

Don’t rush into sheep-dipping staff through ITIL certification. There are other ways. Better to plan what you want to achieve and the journey that will take you there. ITIL certification may well be part of this journey but it isn’t the entirety of it!

The Real Value of SLAs

It’s really quite remarkable just how many organisations struggle to implement Service Level Agreements (SLAs). They’re documents that lie at the very heart of service management. They are absolutely fundamental. So why is it so difficult? Why are they not given the priority they deserve?

Well, first of all, they are an agreement and that means there has to be a dialogue!

To draw up an agreement between the internal customer and the internal service provider, one has to understand and agree what the service is intended to achieve and what business objectives it helps deliver, and to ensure that specific and measurable service targets are agreed, measured and reported on.

A good starting point would be to begin the dialogue with a review of the current de facto service levels and to gather the information about the business drivers, so that the customer is reassured that these are fully taken on board as discussions continue and the SLA takes shape.

Second of all, the SLAs have to add value to all parties and in particular to the IT service provider. I have seen many instances where SLAs have been written and now gather dust in a filing cabinet, unlikely ever to see the light of day again.

Examples of “adding value” might include the ability to respond to interrogations such as: which SLAs are scheduled for review? Which customers are dependent on this SLA? If I have to change an operational level agreement (OLA) or an underpinning contract, which SLAs will be affected?
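To make those three interrogations concrete, here is a minimal sketch of how they might be answered if each SLA were held as a simple record. Everything in it is an assumption for illustration only: the `SLA` record, its field names, the helper functions and the sample data are hypothetical and do not describe any particular tool.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class SLA:
    """Hypothetical SLA record; the field names are illustrative."""
    name: str
    customers: list[str]
    review_date: date
    # Names of the OLAs and underpinning contracts this SLA relies on
    depends_on: list[str] = field(default_factory=list)


def due_for_review(slas, on_or_before):
    """Which SLAs are scheduled for review on or before the given date?"""
    return [s.name for s in slas if s.review_date <= on_or_before]


def customers_of(slas, sla_name):
    """Which customers are dependent on the named SLA?"""
    return sorted({c for s in slas if s.name == sla_name for c in s.customers})


def affected_by(slas, agreement):
    """Which SLAs are affected if this OLA or underpinning contract changes?"""
    return [s.name for s in slas if agreement in s.depends_on]


# Made-up sample data, purely to exercise the three questions
slas = [
    SLA("Email", ["Finance", "Sales"], date(2024, 3, 1),
        ["Network OLA", "Hosting contract"]),
    SLA("Payroll", ["Finance"], date(2024, 9, 1), ["Database OLA"]),
    SLA("Intranet", ["All staff"], date(2024, 4, 15), ["Network OLA"]),
]

print(due_for_review(slas, date(2024, 6, 30)))
print(customers_of(slas, "Email"))
print(affected_by(slas, "Network OLA"))
```

The point of the sketch is simply that once the links between SLAs, customers, OLAs and underpinning contracts are recorded rather than left in a filing cabinet, each question becomes a one-line lookup.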

In my experience, the greatest value of all that comes from the SLA dialogue is the increased understanding of the importance of the Service Design stage of the service lifecycle. Service reviews often reveal weaknesses in the current service brought about by poor design. The lessons learned here not only lead to improvements in the particular service under review but can lead to improvements in the service design process to the benefit of all future services.

There are few tools out there to support SLA management. We have such a tool (Smart-SLA) in our bag that helps, and we plan to extend its functionality to include OLAs and underpinning contracts. Details can be found on our website: http://www.sysop.co.uk/smart-sla