Incident management, from a system administrator's perspective

Before I learned about ITIL, I was a system administrator trying to figure out how we could "do things differently." It seemed like we were doing the same work that a bunch of IT shops were doing, and maybe there was something already written that we could learn from.

One of the books I found was The Practice of System and Network Administration, by Tom Limoncelli and Christine Hogan. (Since then, a second edition has come out.) It's an extremely thorough introduction to the "soft topics" of systems administration (and it gets points for being written in LaTeX).

I refer to one of the chapters of this book whenever I think about ITIL's incident management. Chapter 16, "Customer Care," describes how to handle incidents:

  • Phase A: The Greeting
    • Step 1: The Greeting
  • Phase B: Problem [read "Incident"] Identification
    • Step 2: Problem [read "Incident"] Classification
    • Step 3: Problem [read "Incident"] Statement
    • Step 4: Problem [read "Incident"] Verification
  • Phase C: Planning and Execution
    • Step 5: Solution Proposals
    • Step 6: Solution Selection
    • Step 7: Execution
  • Phase D: Verification
    • Step 8: Craft Verification
    • Step 9: User Verification

The book goes on to describe the most appropriate internal roles to perform each step, and gives some entertaining examples of what happens when a step is skipped.
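To make the "skipped step" idea concrete, the outline above can be written down as an ordered checklist with a helper that reports which steps a given ticket never went through. This is only a minimal sketch in Python: the step names come from the list above (reading "Problem" as the book writes it), while `STEPS`, `skipped_steps`, and the sample ticket are names and data I made up for illustration, not anything from the book.

```python
# Hypothetical sketch: the step names follow the outline above; the data
# layout and the helper function are illustrative assumptions.
STEPS = [
    ("A: The Greeting", "The Greeting"),
    ("B: Problem Identification", "Problem Classification"),
    ("B: Problem Identification", "Problem Statement"),
    ("B: Problem Identification", "Problem Verification"),
    ("C: Planning and Execution", "Solution Proposals"),
    ("C: Planning and Execution", "Solution Selection"),
    ("C: Planning and Execution", "Execution"),
    ("D: Verification", "Craft Verification"),
    ("D: Verification", "User Verification"),
]


def skipped_steps(completed):
    """Return the steps that were not performed, in process order."""
    done = set(completed)
    return [step for _phase, step in STEPS if step not in done]


# Made-up ticket: the admin jumped from the problem statement straight to a fix.
ticket = ["The Greeting", "Problem Classification", "Problem Statement", "Execution"]

for step in skipped_steps(ticket):
    print("skipped:", step)
```

For this sample ticket the helper reports the problem verification, both solution steps, and both verification steps as skipped, which is the kind of shortcut the book's examples warn about.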

Compare this to ITIL v3's model (ignoring the decision points):

  • Incident identification
  • Incident logging
  • Incident categorization
  • Incident prioritization
  • Initial diagnosis
  • Investigation and diagnosis
  • Resolution and recovery
  • Incident closure

The two models are quite complementary. The first emphasizes how to arrive at the correct solution by proposing several options and choosing among them; the second emphasizes logging and prioritization.
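One way to see the overlap is to line the two lists up side by side. The sketch below is in Python, and the pairing in `ITIL_TO_BOOK` is purely my own rough reading of where each step fits; neither ITIL nor the book defines such a mapping, and the function names are invented for this example.

```python
# One possible alignment of the two models. The pairing is an illustrative
# assumption, not something stated by ITIL or by the book.
ITIL_TO_BOOK = {
    "Incident identification": ["The Greeting"],
    "Incident logging": [],
    "Incident categorization": ["Problem Classification"],
    "Incident prioritization": [],
    "Initial diagnosis": ["Problem Statement", "Problem Verification"],
    "Investigation and diagnosis": ["Solution Proposals", "Solution Selection"],
    "Resolution and recovery": ["Execution", "Craft Verification"],
    "Incident closure": ["User Verification"],
}


def itil_only_steps(mapping):
    """ITIL steps with no counterpart in the book's model (under this pairing)."""
    return [itil for itil, book in mapping.items() if not book]


def finer_grained_in_book(mapping):
    """ITIL steps that the book's model splits into more than one step."""
    return {itil: book for itil, book in mapping.items() if len(book) > 1}


print("Only in ITIL:", itil_only_steps(ITIL_TO_BOOK))
for itil, book in finer_grained_in_book(ITIL_TO_BOOK).items():
    print(itil, "->", ", ".join(book))
```

Under this (debatable) pairing, logging and prioritization are the steps that appear only on the ITIL side, while the single "Investigation and diagnosis" step corresponds to the book's separate proposal and selection steps.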
