Incident Management | IT Process Wiki (2024)



Objective: Incident Management aims to manage the lifecycle of all Incidents (unplanned interruptions or reductions in quality of IT services). The primary objective of this ITIL process is to return the IT service to users as quickly as possible.

Part of: Service Operation

Process Owner: Incident Manager

Contents

  • 1 ITIL 4 Incident Management
  • 2 Process Description
  • 3 Sub-Processes
  • 4 Definitions
  • 5 Templates | KPIs
  • 6 Roles | Responsibilities
  • 7 Example
  • 8 Notes

ITIL 4 Incident Management

The Incident Management process described here (fig. 1) follows the specifications of ITIL V3, where Incident Management is a process in the service lifecycle stage of Service Operation.

ITIL V4 is no longer prescriptive about processes but shifts the focus to 34 'practices', giving organizations more freedom to define tailor-made processes.

ITIL 4 therefore refers to Incident Management as a service management practice, describing the key activities, inputs, outputs and roles. Based on this guidance, organizations are advised to design a process for managing Incidents in line with their specific requirements.

Since the processes defined in ITIL V3 have not been invalidated with the introduction of ITIL V4, organizations can still use the ITIL V3 process of Incident Management as a template.

Note:
In our YaSM Service Management Wiki we describe a leaner set of 19 service management processes that are more in tune with ITIL 4 and its focus on simplicity and "just enough process".

The YaSM service management model includes a process for managing incidents that is a good starting point for organizations that wish to adopt ITIL 4.

Process Description

ITIL distinguishes between Incidents (service interruptions) and Service Requests (customer or user requests that do not represent a service disruption, such as a password reset). Service interruptions are handled through Incident Management, and Service Requests through Request Fulfilment.

The Incident Management process can be triggered in various ways: A user, customer or supplier may report an issue, technical staff may notice a (potential or actual) failure, or an Incident may be raised automatically by an event monitoring system.

All Incidents should be logged as Incident Records, where their status can be tracked and a complete historical record maintained. Initial categorization and prioritization of Incidents is a critical step for determining how the Incident will be handled and how much time is available for its resolution (see checklist Incident Prioritization Guideline).
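
As a minimal sketch of the logging step described above, an Incident Record can be modelled as a data object that keeps its own status history. The class name, field names, status values and the priority scale are all assumptions made for illustration; they are not taken from an ITIL template.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class IncidentRecord:
    """Hypothetical Incident Record; all field names are illustrative."""
    incident_id: str
    description: str
    category: str
    priority: int  # assumed scale: 1 = highest
    status: str = "logged"
    history: list = field(default_factory=list)

    def __post_init__(self):
        # Record the initial status so the history starts at registration.
        self._log(self.status, "Incident registered")

    def _log(self, status, note):
        self.history.append((datetime.now(timezone.utc), status, note))

    def set_status(self, status, note=""):
        """Change the status and append the change to the history."""
        self.status = status
        self._log(status, note)


record = IncidentRecord("INC-0001", "E-mail service unavailable", "e-mail", priority=2)
record.set_status("in progress", "Assigned to 1st Level Support")
record.set_status("resolved", "Mail server restarted")
for timestamp, status, note in record.history:
    print(timestamp.isoformat(), status, note)
```

Keeping the history inside the record means the complete lifecycle, from registration to closure, can be reviewed later without consulting a separate log.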

If possible, Incidents should be matched to other Incidents, Problems and Known Errors.
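
One simple way to picture this matching is a keyword comparison between a new Incident's description and existing Known Errors. The sketch below assumes a hypothetical store of Known Errors as keyword sets; real tools typically use richer search, so this only illustrates the idea.

```python
def match_known_errors(description, known_errors):
    """Return IDs of Known Errors whose keywords all occur in the description."""
    words = set(description.lower().split())
    return [
        error_id
        for error_id, keywords in known_errors.items()
        if keywords <= words  # every keyword must appear as a word
    ]


# Hypothetical Known Error entries: ID -> set of lowercase keywords.
KNOWN_ERRORS = {
    "KE-12": {"vpn", "timeout"},
    "KE-31": {"printer", "offline"},
}

matches = match_known_errors("VPN connection drops with timeout after login", KNOWN_ERRORS)
print(matches)
```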

Organizations should use automated resolution tools and provide support portals with self-help information so users can resolve simple Incidents themselves. For other Incidents, 1st Level Support will try to diagnose and resolve the issue, typically using information from a knowledge base or pre-defined Incident Models.

If 1st Level Support is unable to resolve an Incident, it must be escalated to an appropriate specialist support group in 2nd Level Support ("functional escalation"). If required, 2nd Level Support may in turn involve external parties such as suppliers and vendors (in ITIL referred to as "3rd Level Support").
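
Functional escalation can be sketched as passing an Incident up an ordered chain of support levels until one of them resolves it. The function names and the `can_resolve` callback are hypothetical and stand in for whatever diagnosis a real support group performs.

```python
SUPPORT_LEVELS = ["1st Level Support", "2nd Level Support", "3rd Level Support"]


def escalate(current_level):
    """Return the next support level, or raise if there is none left."""
    index = SUPPORT_LEVELS.index(current_level)
    if index + 1 >= len(SUPPORT_LEVELS):
        raise ValueError(f"No support level above {current_level}")
    return SUPPORT_LEVELS[index + 1]


def handle(incident, can_resolve):
    """Pass the incident up the chain until a level can resolve it.

    `can_resolve` is a hypothetical callback: (level, incident) -> bool.
    Returns the resolving level, or None if no level can resolve it.
    """
    level = SUPPORT_LEVELS[0]
    while True:
        if can_resolve(level, incident):
            return level
        if level == SUPPORT_LEVELS[-1]:
            return None
        level = escalate(level)


resolved_by = handle(
    "database corruption",
    lambda level, incident: level == "2nd Level Support",
)
print(resolved_by)
```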

ITIL defines a special process for dealing with Major Incidents (emergencies that affect business-critical services and require immediate attention). Major Incidents typically require a temporary Major Incident Team to identify and implement the resolution.

Once Incidents are resolved, 1st Level Support will formally close them. This includes verifying that the users are satisfied and ensuring that the Incident Record is fully documented (see Incident Closure and Evaluation). Any new Problems, Workarounds or Known Errors identified during Incident resolution should be forwarded to the Problem Management process.

Incident Management interfaces with a number of other ITIL processes:

  • Event Management may raise an Incident Record if monitoring systems identify a condition that requires a response.
  • Problem Management provides information to the Incident Management process, such as Workarounds and Known Errors. Problem Management uses data collected during Incident resolution for Problem identification.
  • Change Management may be invoked from Incident Management if a Change is needed to resolve an Incident.
  • Configuration Management provides data used to identify Incidents and link them to particular Configuration Items.

The overview diagram of 'ITIL Incident Management' (fig. 1) shows the key information flows and interfaces of the process.

ITIL 4 refers to "Incident management" as a service management practice (see above). The service desk activities are described in the ITIL 4 practice of "Service desk".

Sub-Processes

These are the ITIL Incident Management sub-processes and their process objectives:

Incident Management Support

  • Process Objective: ITIL Incident Management Support aims to provide and maintain the tools, processes, skills and rules for an effective and efficient handling of Incidents.

Incident Logging and Categorization

  • Process Objective: To record and prioritize the Incident with appropriate diligence, in order to facilitate a swift and effective resolution.

Immediate Incident Resolution by 1st Level Support

  • Process Objective: To solve an Incident (service interruption) within the agreed time schedule. The aim is the fast recovery of the IT service, where necessary with the aid of a Workaround. As soon as it becomes clear that 1st Level Support is not able to resolve the Incident itself or when target times for 1st level resolution are exceeded, the Incident is transferred to a suitable group within 2nd Level Support.

Incident Resolution by 2nd Level Support

  • Process Objective: To solve an Incident (service interruption) within the agreed time schedule. The aim is the fast recovery of the service, where necessary by means of a Workaround. If required, specialist support groups or third-party suppliers (3rd Level Support) are involved. If the correction of the root cause is not possible, a Problem Record is created and the error-correction transferred to Problem Management.

Handling of Major Incidents

  • Process Objective: To resolve a Major Incident. Major Incidents cause serious interruptions of business activities and must be resolved with greater urgency. The aim is the fast recovery of the service, where necessary by means of a Workaround. If required, specialist support groups or third-party suppliers (3rd Level Support) are involved. If the correction of the root cause is not possible, a Problem Record is created and the error-correction transferred to Problem Management.

Incident Monitoring and Escalation

  • Process Objective: To continuously monitor the processing status of outstanding Incidents, so that counter-measures may be introduced as soon as possible if service levels are likely to be breached.
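
A minimal sketch of such monitoring, assuming each outstanding Incident carries an opening time and a target resolution time: Incidents that have used up most of their target time are flagged so that counter-measures can start before the service level is breached. The 80% warning threshold and the dictionary keys are assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def at_risk(incidents, now, warning_fraction=0.8):
    """Return IDs of open incidents that have used up at least
    `warning_fraction` of their target resolution time."""
    flagged = []
    for incident in incidents:
        if incident["status"] == "closed":
            continue
        elapsed = now - incident["opened"]
        if elapsed >= incident["target"] * warning_fraction:
            flagged.append(incident["id"])
    return flagged


now = datetime(2024, 1, 1, 12, 0)
open_incidents = [
    {"id": "INC-1", "status": "in progress",
     "opened": datetime(2024, 1, 1, 8, 30), "target": timedelta(hours=4)},
    {"id": "INC-2", "status": "in progress",
     "opened": datetime(2024, 1, 1, 11, 0), "target": timedelta(hours=8)},
    {"id": "INC-3", "status": "closed",
     "opened": datetime(2024, 1, 1, 6, 0), "target": timedelta(hours=1)},
]
flagged = at_risk(open_incidents, now)
print(flagged)
```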

Incident Closure and Evaluation

  • Process Objective: To submit the Incident Record to a final quality control before it is closed. The aim is to make sure that the Incident is actually resolved and that all information required to describe the Incident's life-cycle is supplied in sufficient detail. In addition to this, findings from the resolution of the Incident are to be recorded for future use.
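
The final quality control can be pictured as a check that runs before a record is closed. The sketch below assumes a record stored as a dictionary with hypothetical field names; which fields are mandatory is an organizational decision, not something this example prescribes.

```python
REQUIRED_TEXT_FIELDS = ("description", "category", "resolution")


def closure_issues(record):
    """Return a list of reasons why the record cannot be closed yet."""
    issues = [
        f"missing {name}"
        for name in REQUIRED_TEXT_FIELDS
        if not (record.get(name) or "").strip()
    ]
    if not record.get("user_confirmed", False):
        issues.append("user has not confirmed the resolution")
    return issues


incomplete = {
    "description": "E-mail service unavailable",
    "category": "e-mail",
    "resolution": "",
    "user_confirmed": True,
}
print(closure_issues(incomplete))
```

A record would only be closed when the returned list is empty.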

Pro-Active User Information

  • Process Objective: To inform users of service failures as soon as these are known to the Service Desk, so that users are in a position to adjust themselves to interruptions. Proactive user information also aims to reduce the number of inquiries by users. This process is also responsible for distributing other information to users, e.g. security alerts.

Incident Management Reporting

  • Process Objective: ITIL Incident Management Reporting aims to supply Incident-related information to the other Service Management processes, and to ensure that potential improvements are derived from past Incidents.

Definitions

The following ITIL terms and acronyms (information objects) are used in the ITIL Incident Management process to represent process outputs and inputs:

Incident

  • An Incident is defined as an unplanned interruption or reduction in quality of an IT service (a Service Interruption).

Incident Escalation Rules

  • A set of rules defining a hierarchy for escalating Incidents, and triggers which lead to escalations. Triggers are usually based on Incident severity and resolution times. See also: Checklist Incident Priority

Incident Management Report

  • A report supplying Incident-related information to the other Service Management processes.

Incident Model

  • An Incident Model contains the pre-defined steps that should be taken for dealing with a particular type of Incident. This is a way to ensure that routinely occurring Incidents are handled efficiently and effectively.
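
As a rough illustration, an Incident Model can be stored as an ordered list of steps keyed by Incident type. The Incident type and the steps below are invented for the example.

```python
# Hypothetical Incident Models: incident type -> ordered, pre-defined steps.
INCIDENT_MODELS = {
    "printer offline": [
        "Check that the printer is powered on and connected",
        "Restart the print spooler",
        "Send a test page and confirm with the user",
    ],
}


def steps_for(incident_type):
    """Return the pre-defined steps for a type, or an empty list if no model exists."""
    return INCIDENT_MODELS.get(incident_type, [])


for number, step in enumerate(steps_for("printer offline"), start=1):
    print(number, step)
```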

Incident Prioritization Guideline

  • The Incident Prioritization Guideline describes the rules for assigning priorities to Incidents, including the definition of what constitutes a Major Incident. Since Incident Management escalation rules are usually based on priorities, assigning the correct priority to an Incident is essential for triggering appropriate escalations. See also: Checklist Incident Prioritization Guideline
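
A common way to express such rules is to derive the priority from impact and urgency. The sketch below assumes three levels for each and a five-step priority scale, and it treats a priority 1 Incident on a business-critical service as a Major Incident. These rules are assumptions for illustration and are not taken from the guideline itself.

```python
LEVELS = {"high": 1, "medium": 2, "low": 3}


def priority(impact, urgency):
    """Derive a priority from 1 (highest) to 5 (lowest) as impact + urgency - 1."""
    return LEVELS[impact] + LEVELS[urgency] - 1


def is_major(impact, urgency, business_critical):
    """Assumed rule: priority 1 on a business-critical service is a Major Incident."""
    return business_critical and priority(impact, urgency) == 1


print(priority("high", "low"))
print(is_major("high", "high", business_critical=True))
```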

Incident Record

  • A set of data with all details of an Incident, documenting the history of the Incident from registration to closure. An Incident is defined as an unplanned interruption or reduction in quality of an IT service. Every event that could potentially impair an IT service in the future is also an Incident (e.g. the failure of one hard-drive of a set of mirrored drives). See also: ITIL Checklist Incident Record

Incident Status Information

  • A message containing the present status of an Incident sent to a user who earlier reported a service interruption. Status information is typically provided to users at various points during an Incident's lifecycle.

Major Incident

  • Major Incidents cause serious interruptions of business activities and must be solved with greater urgency. See also: Checklist Incident Priority: Major Incidents.

Major Incident Review

  • A Major Incident Review takes place after a Major Incident has occurred. The review documents the Incident's underlying causes (if known) and the complete resolution history, and identifies opportunities for improving the handling of future Major Incidents.

Notification of Service Failure

  • The reporting of a service failure to the Service Desk, for example by a user via telephone or e-mail, or by a system monitoring tool.

Pro-Active User Information

  • A notification to users of existing or imminent service failures even if the users are not yet aware of the interruptions, so that users are in a position to prepare themselves for a period of service unavailability.

Status Inquiry

  • An inquiry regarding the present status of an Incident or Service Request, usually from a user who earlier reported an Incident or submitted a request.

Support Request

  • A request to support the resolution of an Incident or Problem, usually issued from the Incident or Problem Management processes when further assistance is needed from technical experts.

User Escalation

  • Escalation regarding the processing of an Incident or Service Request, initiated by a user experiencing delays or a failure to restore their services.

User FAQs

  • Self-help information for users supplied by the Service Desk, usually as part of the Support Pages on the intranet.

Templates | KPIs

  • Key Performance Indicators (KPIs) Incident Management
  • Incident Management templates and checklists:
    • Incident Record template
    • Checklist Incident Priority
    • Checklist Initial Analysis of an Incident
    • Checklist Incident Escalation
    • Checklist Closure of an Incident
    • Incident Report template

Roles | Responsibilities

Incident Manager - Process Owner

  • The Incident Manager is responsible for the effective implementation of the Incident Management process and carries out the corresponding reporting. The Incident Manager also represents the first stage of escalation for Incidents that cannot be resolved within the agreed Service Levels.

1st Level Support

  • The responsibility of 1st Level Support is to register and classify received Incidents and to undertake an immediate effort in order to restore a failed IT service as quickly as possible. If no ad-hoc solution can be achieved, 1st Level Support will transfer the Incident to expert technical support groups (2nd Level Support). 1st Level Support also keeps users informed about their Incidents' status at agreed intervals.

2nd Level Support

  • 2nd Level Support takes over Incidents which cannot be solved immediately with the means of 1st Level Support. If necessary, it will request external support, e.g. from software or hardware manufacturers. The aim is to restore a failed IT service as quickly as possible. If no solution can be found, the 2nd Level Support passes on the Incident to Problem Management.

3rd Level Support

  • 3rd Level Support is typically located at hardware or software manufacturers (third-party suppliers). Its services are requested by 2nd Level Support if required for solving an Incident. The aim is to restore a failed IT Service as quickly as possible.

Major Incident Team

  • A dynamically established team of IT managers and technical experts, usually under the leadership of the Incident Manager, formed to concentrate on the resolution of a Major Incident.

Responsibility Matrix: ITIL Incident Management

| ITIL Role / Sub-Process | Incident Manager | 1st Level Support | 2nd Level Support | Major Incident Team | Applications Analyst[3] | Technical Analyst[3] | IT Operator[3] |
|---|---|---|---|---|---|---|---|
| Incident Management Support | A[1] R[2] | - | - | - | - | - | - |
| Incident Logging and Categorization | A | R | - | - | - | - | - |
| Immediate Incident Resolution by 1st Level Support | A | R | - | - | - | - | - |
| Incident Resolution by 2nd Level Support | A | - | R | - | R[4] | R[4] | R[4] |
| Handling of Major Incidents | A R | R | - | R | - | - | R |
| Incident Monitoring and Escalation | A R | R | - | - | - | - | - |
| Incident Closure and Evaluation | A | R | - | - | - | - | - |
| Pro-Active User Information | A | R | - | - | - | - | - |
| Incident Management Reporting | A R | - | - | - | - | - | - |

Remarks

[1] A: Accountable according to the RACI Model: Those who are ultimately accountable for the correct and thorough completion of the Incident Management process.

[2] R: Responsible according to the RACI Model: Those who do the work to achieve a task within Incident Management.

[3] see → Role descriptions...

[4] In cooperation, as required. 2nd Level Support Groups often include Applications Analysts and/ or Technical Analysts.
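
A responsibility matrix of this kind can also be kept as data and queried. The sketch below encodes two rows of the matrix as a nested dictionary, with the footnote markers dropped. The structure and the function name are assumptions made for illustration.

```python
# Two rows of the responsibility matrix: sub-process -> role -> RACI code.
RACI = {
    "Incident Logging and Categorization": {
        "Incident Manager": "A",
        "1st Level Support": "R",
    },
    "Incident Resolution by 2nd Level Support": {
        "Incident Manager": "A",
        "2nd Level Support": "R",
        "Applications Analyst": "R",
        "Technical Analyst": "R",
        "IT Operator": "R",
    },
}


def roles_with(sub_process, code):
    """Return the roles holding the given RACI code for a sub-process, sorted by name."""
    return sorted(
        role for role, codes in RACI[sub_process].items() if code in codes
    )


print(roles_with("Incident Logging and Categorization", "R"))
print(roles_with("Incident Resolution by 2nd Level Support", "A"))
```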

Example

The introductory ITIL Process Map video shows samples of the ITIL process templates with contents from Service Operation and Incident Management processes, including the

  • high-level view of the ITIL Service Lifecycle (Level 0)
  • overview of the Service Operation process (Level 1)
  • overview of ITIL Incident Management (Level 2)
  • detailed process flow for the process "Incident Management: Incident Resolution by 1st Level Support" (Level 3)

Watch the video: "The ITIL Process Map - Introduction" (10:58 min.)

Notes

By: Stefan Kempter, IT Process Maps.


FAQs

What is incident management for IT operations?

An incident management process helps IT teams investigate, record, and resolve service interruptions or outages. The ITIL incident management workflow aims to reduce downtime and minimize the impact of incidents on employee productivity.

What are the 5 stages of the incident management process?

There are five steps in an incident management plan:
  • Incident identification.
  • Incident categorization.
  • Incident prioritization.
  • Incident response.
  • Incident closure.

What is the IT incident management lifecycle?

The NIST incident response lifecycle breaks incident response down into four main phases: Preparation; Detection and Analysis; Containment, Eradication, and Recovery; and Post-Incident Activity.

What is incident management in ITIL?

ITIL incident management (IM) is the practice of restoring services as quickly as possible after an incident, and it is a main component of ITIL service support. It is a reactive process rather than a proactive measure: IM is used to diagnose incidents and escalate them as needed to restore service.

What is MIM in ITIL?

If your organization has a major incident management (MIM) process in place, you can swiftly respond to and resolve major incidents. If you don't have such a process in place, it's time to draw up an emergency response plan, also known as a major incident response process.

What is an example of an IT incident?

Hardware incidents typically include downed or limited resources, network issues or other system outages. Security incidents encompass attempted and active threats intended to compromise or breach data. Unauthorized access to personally identifiable records is a security issue, for example.

What is an incident in IT service?

An IT incident is an unexpected event that disrupts business operations or reduces the quality of a service.

What is the difference between IT service management and incident management?

An IT service desk is responsible for handling service requests. A service request can be a straightforward ticket for information, access, or approval. There can be a predefined set of neatly categorized offerings for the users. As the name suggests, an incident management team deals with incidents.

What are the 5 C's of incident management?

The 5C Model Explained
  • Comprehend. In the first stage of the 5C model, it is essential to comprehend the nature and scope of the crisis. ...
  • Coordinate. Coordination is crucial during a crisis, as it ensures a unified and consistent approach to communication. ...
  • Collaborate. ...
  • Communicate. ...
  • Confirm.

How can I be a good incident manager?

Managing and tracking various aspects of emergency response is helpful for an incident manager. This skill includes knowing each team member's responsibilities and how they contribute to the success of the response. It also helps you understand the resources you and your team require to fulfill those responsibilities.

What is the IT service lifecycle?

The ITIL Service Lifecycle is a structured system defined in ITIL v3 for managing a product or service throughout its lifecycle. It is divided into five phases, each with its own specific processes: strategy, design, transition, operation, and continual improvement.

What are the phases of ITIL incident management?

ITIL incident management process flow
  • Incident logging. ...
  • Incident categorization and prioritization. ...
  • Initial diagnosis. ...
  • Functional and hierarchical escalation. ...
  • Investigation and diagnosis. ...
  • Resolution and recovery. ...
  • Incident closure.

What is the incident response plan for an IT company?

An incident response plan is a set of written instructions that outline your organization's response to data breaches, data leaks, cyber attacks and security incidents.

What are the 4 stages of major incident management?

Most major incidents can be considered to have four stages:
  • the initial response;
  • the consolidation phase;
  • the recovery phase; and
  • the restoration of normality.

What are the 5 key areas of incident management?

According to the NIST Cybersecurity Framework, the cybersecurity lifecycle includes five areas: identification, protection, detection, response and recovery.

What is the incident management process in cyber security?

The security incident management process typically starts with an alert that an incident has occurred and engagement of the incident response team. From there, incident responders will investigate and analyze the incident to determine its scope, assess damages, and develop a plan for mitigation.

What are the 4 stages of incident management?

The following are the phases of incident response for a security incident, according to the National Institute of Standards and Technology (NIST):
  • Preparation. ...
  • Detection and analysis. ...
  • Containment, eradication, and recovery. ...
  • Review (post-incident analysis)
