What is problem management? A guide | Atlassian (2024)

Problem management is the process of identifying and managing the causes of incidents on an IT service. It is a core component of ITSM frameworks.

The closer you get to real incident experts, the less you actually hear the question: “What caused the incident?” Sure, you’ll hear it plenty from executives, and customers, and the press. But the experts know better.

Because the answer to “what caused the incident” is often dry and unhelpful: a rewritten config file, a corrupted database entry.

But what were the contributing causes behind the thing that caused the incident? What were the factors that led up to the incident? How is it possible that a config file could be rewritten? What conditions create a corrupted database entry? These are the questions you hear experts ask. And they’re at the heart of problem management.

Problem management isn’t just about finding and fixing incidents. It is about identifying and understanding the underlying causes of an incident, as well as identifying the best method to eliminate those causes. Moreover, pinpointing the cause has little value to an organization if it’s a cut-off process completed by a siloed team, so problem management should be constant and widely practiced across multiple teams, including IT, security, and software developers. An incident may be over once the service is up and running again, but until the underlying causes and contributing factors are addressed, the problem remains.

The relationship between problem management and other key ITIL processes

Problem management works alongside incident management and other ITIL practices to form an overall ITSM strategy.

Problem management vs. incident management

ITIL defines a problem as a cause, or potential cause, of one or more incidents. The behaviors behind effective incident management and effective problem management are often similar and overlapping, but there are still key differences. For example, rolling back a recent deployment may get the service operating again and end the incident, but the underlying problem remains.

That said, we believe that problem management and incident management practices are becoming increasingly intertwined. During the times between incidents, IT teams can focus their efforts on problem investigations that lead to improvements and better service quality. This is how problem management becomes the most valuable to the organization.

Problem management and change management

Change management is the process of planning, tracking, and releasing changes without service disruption or downtime.

When a change does cause disruption or downtime, that change is analyzed during incident and problem management processes.

Problem management and knowledge management

Knowledge management creates a repository of solutions and documentation for common procedures and even incident workarounds. When used together, a healthy knowledge management practice can enable faster incident resolution and fewer incidents altogether.

Problem management and service request management

Service request management is the practice of processing a request from a user for something to be provided, such as access to applications, software enhancements, and information. It can sometimes be difficult to distinguish a service request from an incident. In fact, the two were not distinguished, and both were lumped into the category “incidents”, until the release of ITIL V3 in 2007. ITIL now defines an incident as “an unplanned interruption to an IT service or reduction in the quality of an IT service.” It defines a service request as “a formal request from a user for something to be provided – for example, a request for information or advice; to reset a password; or to install a workstation for a new user.”

What are the benefits of problem management?

Done right, problem management unleashes many benefits for the business.

Decrease time to resolution

Teams that unlock the problems behind today’s incident will be better prepared to attack incidents in the future. By codifying best practices around problem analysis, teams will be able to more quickly respond and take action during the next service disruption.

Avoid costly incidents

Avoiding incidents will save time, money, and lots of pain. According to Gartner, many organizations report downtime costing more than $300,000 per hour. For some web-based services, that number can be dramatically higher.
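To make the arithmetic concrete, here is a minimal sketch of how a team might estimate what an outage costs. The function names and the default hourly rate (taken from the $300,000 figure quoted above) are assumptions for illustration; your own rate will differ.

```python
def downtime_cost(minutes_down: float, cost_per_hour: float = 300_000.0) -> float:
    """Estimate the cost of one outage from its duration in minutes.

    cost_per_hour defaults to the figure quoted above; treat it as a placeholder.
    """
    if minutes_down < 0:
        raise ValueError("minutes_down must be non-negative")
    return minutes_down / 60.0 * cost_per_hour


def avoided_cost(incidents_avoided: int, avg_minutes: float,
                 cost_per_hour: float = 300_000.0) -> float:
    """Estimate the cost avoided by preventing a number of similar incidents."""
    return incidents_avoided * downtime_cost(avg_minutes, cost_per_hour)


# Four avoided incidents that would each have lasted 45 minutes.
print(avoided_cost(4, 45))
```

Even a rough estimate like this can help when deciding which problems deserve investigation time first.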

Increase productivity

Stop responding to incidents so frequently and return resources and time to teams who could be shipping new value to customers.

Empower your team to find and learn from underlying causes

When organizations effectively practice problem management, teams continually investigate, learn from incidents, and ship valuable updates. Unfortunately, many enterprises create a siloed problem management team that is too far removed from day-to-day operations to eliminate the most pressing problems.

Promote continuous service improvement

Problem management prevents incidents and also delivers value. For instance, fixing a problem that degrades performance also ships a valuable service quality improvement.

Increase customer satisfaction

Better problem management leads to fewer incidents and happier customers. Conversely, customer patience wears thin when they notice the same incident happening multiple times. Decreasing the occurrence of repeat incidents builds customer trust.

The problem management process

At Atlassian, we advocate bringing the problem and incident management processes closer together.

When problem management is a heavy, siloed, and separate process, companies can end up creating a dumping ground of problems. In some teams, this backlog is where problem issues go to die. It’s best to get problems in front of the teams that can act on them and carry out valuable investigations.

That said, it’s good to understand the main steps that make up a problem management process:

  1. Problem detection - Proactively find problems so they can be fixed, or identify workarounds before future incidents happen.
  2. Categorization and prioritization - Track and assess known problems to keep teams organized and working on the most relevant and high-value problems.
  3. Investigation and diagnosis - Identify the underlying contributing causes of the problem and the best course of action for remediation.
  4. Create a known error record - In ITIL, a known error is “a problem that has a documented root cause and a workaround.” Recording this information leads to less downtime if the problem triggers an incident. This is typically stored in a document called a known error database.
  5. Create a workaround, if necessary - A workaround is a temporary solution for reducing the impact of problems and keeping them from becoming incidents. These aren’t ideal, but they can limit business impact and avoid a customer-facing incident if the problem can’t be easily identified and eliminated.
  6. Resolve and close the problem - A closed problem is one that has been eliminated and can no longer cause another incident.
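The steps above can be sketched as a small state machine for a problem record. This is an illustrative model only: the `Status` names, the `ALLOWED` transition table, and the `Problem` fields are assumptions made for the example, not an ITIL-mandated schema or the data model of any ITSM tool.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    DETECTED = "detected"
    PRIORITIZED = "prioritized"
    INVESTIGATING = "investigating"
    KNOWN_ERROR = "known_error"
    CLOSED = "closed"


# Allowed forward transitions (an assumption for this sketch, not an ITIL rule).
ALLOWED = {
    Status.DETECTED: {Status.PRIORITIZED},
    Status.PRIORITIZED: {Status.INVESTIGATING},
    Status.INVESTIGATING: {Status.KNOWN_ERROR, Status.CLOSED},
    Status.KNOWN_ERROR: {Status.CLOSED},
    Status.CLOSED: set(),
}


@dataclass
class Problem:
    title: str
    priority: int = 3  # hypothetical scale: 1 is most urgent
    status: Status = Status.DETECTED
    root_cause: Optional[str] = None
    workaround: Optional[str] = None
    history: List[Status] = field(default_factory=list)

    def advance(self, new_status: Status) -> None:
        """Move the problem to new_status if the transition is allowed."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(
                f"cannot move from {self.status.value} to {new_status.value}"
            )
        # Per the known error definition above: documented root cause plus workaround.
        if new_status is Status.KNOWN_ERROR and not (self.root_cause and self.workaround):
            raise ValueError("a known error needs a root cause and a workaround")
        self.history.append(self.status)
        self.status = new_status


problem = Problem("Config file rewritten during deploy", priority=1)
problem.advance(Status.PRIORITIZED)
problem.advance(Status.INVESTIGATING)
problem.root_cause = "Deploy script overwrites local config"  # hypothetical finding
problem.workaround = "Restore config from backup after each deploy"
problem.advance(Status.KNOWN_ERROR)
problem.advance(Status.CLOSED)
print(problem.status.value, [s.value for s in problem.history])
```

Making the known error check part of the transition keeps records without a workaround from entering the known error database.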

Problem management best practices and tips

As we mentioned earlier, the most effective problem management teams we’ve seen blend problem management and incident management.

Setting problem management as a separate practice creates a challenge where the problem team becomes a bottleneck or focuses on the wrong things, like problems from external vendors that they have no control over. Root causes are often not investigated until long after the incident has happened.

In many cases, your team may benefit from integrating incident management and problem management practices. This is a proactive approach that allows you to understand what led to the incident at the same time you work to resolve it. For example, resolving an incident in software may require identifying faulty code (the cause), and then developing replacement code to avoid further incidents (the fix).

Weaving problem and incident management together means that when teams aren’t in response mode, they can look at the problems that most affect service and performance quality and get ahead of them to prevent future incidents.

Problem management tips

Avoid relying on reactive, root-cause analysis

There is rarely just one root cause behind an incident or problem. The best teams holistically consider all potential contributing factors and practice blameless analysis.

Encourage an open environment where problems are shared

Problem and incident analysis should be an open conversation where team members are encouraged to share the facts without fear of punishment or retribution.

Focus on critical services

Prioritize addressing the problems affecting the services that deliver the most value to the organization.

Ask questions and use the ‘5 whys’

Many teams find success using the “5 Whys” technique popularized by Taiichi Ohno, the architect of the Toyota Production System. Check out the Atlassian Team Playbook play to learn more.
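As a rough illustration of how a “5 Whys” session can be recorded, the sketch below chains each answer into the next question. The `five_whys` helper and the sample answers are hypothetical and made up for this example; only the starting symptom (a rewritten config file) comes from earlier in this guide.

```python
from typing import List, Tuple


def five_whys(symptom: str, answers: List[str]) -> List[Tuple[str, str]]:
    """Pair each "why" question with its answer.

    Each answer becomes the subject of the next question.
    """
    chain: List[Tuple[str, str]] = []
    current = symptom
    for answer in answers:
        chain.append((f"Why did this happen: {current}?", answer))
        current = answer
    return chain


# Hypothetical answers for illustration only.
chain = five_whys(
    "a config file was rewritten",
    [
        "a deploy script overwrote it",
        "the script had no check for local changes",
        "the check was never part of the review checklist",
    ],
)
for question, answer in chain:
    print(question, "->", answer)
```

The last answer in the chain is a contributing factor worth acting on, not necessarily the only cause, so teams often run several chains from the same symptom.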

Spread the knowledge

Open teams share knowledge and insights that their colleagues and adjacent teams can learn from.

Become a learning organization

Effective problem management isn’t something with an end date. Even the best-performing organizations have incidents. The truly world-class teams are the ones who constantly iterate on their process, improve it, and lessen the impact of problems on their colleagues and customers.

Track follow-up

It’s important to develop a clear and standardized way to stay on top of follow-up actions. Since you should always be practicing problem management, it’s important to use ITSM software that will enable your team to prioritize tasks, track progress, and help associate incident issues with problems.
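As a minimal sketch of what associating incidents with problems makes possible, the snippet below counts linked incidents per problem and ranks problems by that count. The `INC-` and `PRB-` keys and the `rank_problems` helper are hypothetical and do not reflect the API of any particular ITSM tool.

```python
from collections import Counter
from typing import Dict, List, Tuple


def rank_problems(incident_to_problem: Dict[str, str]) -> List[Tuple[str, int]]:
    """Return (problem, linked incident count) pairs, most incidents first.

    Ties are broken alphabetically by problem key so the order is stable.
    """
    counts = Counter(incident_to_problem.values())
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical links between incident keys and problem keys.
links = {
    "INC-101": "PRB-7",
    "INC-102": "PRB-3",
    "INC-103": "PRB-7",
    "INC-104": "PRB-7",
    "INC-105": "PRB-3",
    "INC-106": "PRB-9",
}
for problem_key, incident_count in rank_problems(links):
    print(problem_key, incident_count)
```

Incident count is only one signal; a team might also weight it by the criticality of the affected service.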

Summary

Incidents are often described as an unplanned investment in the future reliability of your service. An effective problem management practice delivers valuable service improvements, while identifying and eliminating the driving forces behind incidents.

FAQs

What is the primary purpose of problem management?

The primary goal of Problem Management is to minimize the impact of Problems on the business and prevent recurrence. When successful, downtime and disruptions are reduced.

What is the key step in problem management?

The first step in problem management is to identify the problems that need to be addressed. You can use various sources of information, such as incident reports, feedback, audits, or monitoring tools, to find out what are the common or critical issues that affect your stakeholders or customers.

What is the meaning of managing problems?

Problem Management has the task of systematically identifying and analyzing problems in a company and developing sustainable solutions. It aims to identify the underlying causes of problems and take action to minimize recurring faults and prevent service outages.

What is the objective of problem management?

The objectives of the Problem Management process are to:
  • Prevent problems and resulting incidents from happening.
  • Eliminate recurring incidents.
  • Minimize the impact of incidents that cannot be prevented.

What is problem management, with an example?

Problem Management is the process of identifying, prioritizing, and systematically resolving the underlying issues behind incidents. It provides the end-to-end management of problems from identification to elimination. A simple example is a flat tire: everyone wants their tire fixed quickly so they can get back on the road.

What are the 2 types of problem management?

Reactive problem management is concerned with solving problems in response to one or more incidents. Proactive problem management is concerned with identifying and solving problems and known errors before further incidents related to them can occur again.

What are the 4 P's of problem management?

One framing, “The Four P's to Problem Solving” by Mat Helme on Medium, lists them as Prep, Plan, Perform, and Perfect.

Which task is a problem management responsibility?

Problem Managers research the root causes of incidents, make temporary solutions (workarounds) available, and develop final solutions for known errors. Problem Managers engage in proactive problem management by analyzing trends or historical data of incidents and services.

What are the three phases of problem management?

Problem management involves three phases: problem identification, problem control, and error control.

How do you handle a management problem?

Here are some steps you can follow to solve management problems when you're not sure what to do.
  1. Define the problem. The first step is to define the problem clearly and accurately. ...
  2. Generate possible solutions. ...
  3. Evaluate and select the best solution. ...
  4. Implement and monitor the solution. ...

How do you manage and solve problems?

8 steps to problem solving
  1. Define the problem. What exactly is going on? ...
  2. Set some goals. ...
  3. Brainstorm possible solutions. ...
  4. Rule out any obvious poor options. ...
  5. Examine the consequences. ...
  6. Identify the best solutions. ...
  7. Put your solutions into practice. ...
  8. How did it go?

What is not a goal of problem management?

In IT Service Management, the aim of problem management is not to restore service to a user. This is typically handled by Incident Management. Problem Management's objective is to prevent, manage, and eliminate problems and their resulting incidents.

What is the problem management process workflow?

The Problem Management process includes the activities required to identify and classify problems, to diagnose the root cause of incidents, and to determine the resolution to related problems.

What are the 5 basic steps in problem-solving?

Identify, analyze, resolve, execute, evaluate
  • Step 1: Identify. Identifying the problem may be simple, or it could be a detailed cognitive process that breaks the issue into manageable components. ...
  • Step 2: Analyze. Consider underlying factors and devise strategies. ...
  • Step 3: Resolve. ...
  • Step 4: Execute. ...
  • Step 5: Evaluate.

What is the first key step in solving any problem?

1. Define the problem. Diagnose the situation so that your focus is on the problem, not just its symptoms. Helpful problem-solving techniques include using flowcharts to identify the expected steps of a process and cause-and-effect diagrams to define and analyze root causes.

What are the 4 important steps in the problem-solving method?

Analyze—Understand the root cause. Plan—Determine how to resolve the problem. Implement—Put the resolution in place. Evaluate—Determine if the resolution is producing the desired results.

What is the 4-step problem-solving process?

The 4-step Problem Solving Method
  • Develop a Problem Statement.
  • Determine Root Causes.
  • Rank Root Causes in Order of Importance.
  • Create an Action Plan.
