An ITIL Process Conundrum

A student brought me up short on a recent course when we were discussing the distinction between incident management and problem management. We had been talking about the need for Incident Management to resolve the user’s issue quickly and the purpose of Problem Management to identify the root cause and provide a workaround or a permanent solution.

The scenario surrounding this particular conundrum goes like this:

• A user contacts the service desk and complains that a particular document he is trying to print fails when sent to his local departmental printer.
• The service desk asks the user to redirect the print to a different printer (different manufacturer) two floors down.
• This is successful albeit very inconvenient. The service desk agrees with the user that this issue has been resolved and the incident can be closed.
• The service desk agent believes that this issue needs to be investigated further and raises a problem record.

• The problem management team eventually identify that there is a deficiency in the printer firmware and ask the manufacturer to provide a fix.
• In the meantime, based on the experience from the incident, a Known Error record is generated containing details of the workaround that was successfully employed by the originating user.
• This is used on several occasions in the following months to resolve further similar incidents albeit with considerable inconvenience to the users concerned.

• Eventually, the printer manufacturer comes up with an updated version of the firmware. This is tested, found to be a valid solution and a change request is raised to roll the new version out to every printer of this make and model.
• No further incidents are raised.

• Sometime later the original user, based on prior experience, is directing his printed output to the printer two floors down. A colleague asks him why he is doing this. “Because our departmental printer can’t cope with this particular type of document” he replies.
• “Well, I don’t have any trouble,” says the colleague, prompting the original user to try the local printer which, of course, works perfectly.

The IT service provider has clearly let down its customers and users. But whose responsibility was it to advise the user base in general, and this user in particular, that the workaround was no longer necessary? What went wrong? How would you change processes to improve the communication flow?
Stuart Sawle
http://www.sysop.co.uk

7 thoughts on “An ITIL Process Conundrum”

  1. Existing processes should be sufficient. Problem Management identifies the root cause and asks the printer manufacturer to supply a fix. This fix is then implemented under the control of Change and Release, which mandates user testing of the fix. Who better to do the testing than the user who reported the incident in the first place, so that he is now aware that the workaround is no longer necessary?
    It would also be wise to implement some communications with all users in this area of the business.

    • I like the idea of the end user testing the fix. “Killing two birds…”. I also agree that a broader communication would cover the eventuality of multiple users being similarly affected.

  2. I came across situations like this in a former role. When we were rolling out a change, we sent out a notification telling users that it was happening and asking them to log a call with the Service Desk if they experienced any problems (notice the small ‘p’!) the next morning. This notification would contain details such as ‘Printer B on floor X will now accept all document types’. The responsibility for communication lay firmly with Change Management.

  3. There is a drive in every service desk I have seen to get tickets, of whatever sort, closed off as quickly as possible. In my view, this is where the opportunity to feed back useful information to end users is lost.

    By definition, there will almost always be more incident records open than problem records at a given moment and so incident records should not stay open. If a problem record results in a Known Error record *and* a permanent fix/change record, all three of these (linked?) records should remain open until the end users (whose details should be stored as rows in the problem record) are contacted. This contact should be done by “blanket” email and/or an update on the company’s intranet. Only then should the problem record and change record be closed. The Known Error record should be placed in an “inactive” or “archive” status.

    As the problem record is central to this, I think the Problem Manager holds the key to making it work.
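
The closure rule described in this comment can be sketched as a simple gate. This is only an illustration under stated assumptions: `ProblemRecord`, its field names, the status strings and `try_close` are all hypothetical and do not come from any particular service management tool.

```python
from dataclasses import dataclass, field


@dataclass
class ProblemRecord:
    # Hypothetical record: users taken from the linked incidents.
    affected_users: set
    # Users who have been told that the fix is in place.
    notified_users: set = field(default_factory=set)
    problem_status: str = "open"
    change_status: str = "open"
    known_error_status: str = "active"


def try_close(record):
    """Close the linked records only once every affected user is notified.

    Returns the set of users still waiting for a notification; an empty
    set means the problem and change records were closed and the Known
    Error record was archived.
    """
    outstanding = record.affected_users - record.notified_users
    if outstanding:
        return outstanding
    record.problem_status = "closed"
    record.change_status = "closed"
    record.known_error_status = "archived"
    return set()
```

The point of the design is that closure becomes a consequence of communication rather than a separate step that can be forgotten.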

  4. Interesting point.

    I believe “you” are supposed to associate every Incident with the Problem record. One could then, when the problem has been resolved, send an automated email to every person who had an Incident associated with the Problem record. The email can have a simple statement, something like:


    Hello (Name Here),

    Our records show that on (Date of 1st Incident Ticket here) you had an issue with (Name of Service here). We are very sorry about that; however, we do have some good news to report! We have (finally!) produced a fix for the problem. You should now be able to resume normal use and operations. Please contact us if you experience any future issues.

    Yours Truly,

    The Service Desk/The IT Department/Name of CIO/Cookie Monster

    ---

    Of course you could be more specific but then your “automated scripting” would get more complicated. You could have it so it is a “free form” text box that is filled out and then that is what is emailed to everyone – but then again…you will have variations in style, competence, and so on…
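
As a rough sketch of the automated email suggested above, the following builds one message per affected user from the incidents linked to a resolved problem. Everything here is an assumption for illustration: the `Incident` fields, the `build_fix_notifications` function and the message wording are hypothetical, and actually sending the mail is left to whatever tool is in use.

```python
from dataclasses import dataclass
from datetime import date
from string import Template


@dataclass
class Incident:
    # Hypothetical incident fields; real tools will name these differently.
    requester_name: str
    requester_email: str
    service: str
    opened_on: date
    problem_id: str


MESSAGE = Template(
    "Hello $name,\n\n"
    "Our records show that on $first_date you had an issue with $service. "
    "A fix for the underlying problem is now in place, so any workaround "
    "you were given is no longer needed. "
    "Please contact us if you experience any further issues.\n\n"
    "The Service Desk"
)


def build_fix_notifications(incidents, problem_id):
    """Return {email: message text}, one entry per user linked to problem_id."""
    earliest = {}
    for inc in incidents:
        if inc.problem_id != problem_id:
            continue
        seen = earliest.get(inc.requester_email)
        # Keep each user's earliest incident so the message quotes that date.
        if seen is None or inc.opened_on < seen.opened_on:
            earliest[inc.requester_email] = inc
    return {
        email: MESSAGE.substitute(
            name=inc.requester_name,
            first_date=inc.opened_on.isoformat(),
            service=inc.service,
        )
        for email, inc in earliest.items()
    }
```

Grouping by email address means a user who hit the same problem several times receives a single message, which refers to the first incident as in the template above.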

    On a side note, your scenario does not strike me as realistic. If the problem was with a particular vendor, one would assume that vendor’s printer is in more locations than just that one. Assuming it is in “most” places in the company, you have a potentially critical problem for the company. I doubt any company would be satisfied waiting months for that fix. Now, let’s assume it was not in most places, but only in a few. Again, the loss of work productivity may mean that the better business choice (cost vs benefit) is to replace those printers with the ‘other’ vendor’s printers, then seek compensation from the vendor that caused the problem: reduced future prices, deals on other products, etc. I highly doubt anyone would truly wait around months for a fix to a printer issue.

    • That’s roughly how we do it, although we allow our service management tool to do the hard work.

      When a problem is confirmed as removed, the solution is pushed to the Known Error record and to all the associated incidents (open or closed), which in turn sends an email to each requestor with this update.

      • Thanks Andy
        That confirms my view. The key to this is the Known Error record. This will contain a count of the incidents resolved using the KE information. Searching the incident records for the cross-reference to the KE will provide a list of users affected. Service Transition (Release & Deployment Management) should take responsibility for communicating with those users.
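
The search described here might look something like the sketch below. It assumes incident records are available as plain dictionaries with hypothetical `user` and `known_error_id` keys; a real tool would expose this through its own query or reporting interface.

```python
from collections import Counter


def users_to_notify(incident_records, known_error_id):
    """Return (incident count, sorted user list) for one Known Error.

    Each record is assumed to be a dict with a "user" key and an optional
    "known_error_id" cross-reference; both names are hypothetical.
    """
    per_user = Counter(
        rec["user"]
        for rec in incident_records
        if rec.get("known_error_id") == known_error_id
    )
    return sum(per_user.values()), sorted(per_user)
```

The count corresponds to the number of incidents resolved using the KE information, and the user list is what Release & Deployment Management would use for its communication.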
