Category: Root Cause Analysis Tips
Photo of meteor from Chelyabinsk, Russia in 2013
If confirmed, here is a link to the first recorded fatality due to a meteorite strike in modern history. This would be one of the few appropriate uses of the Natural Disaster category on the Root Cause Tree®.
When doing a root cause analysis using TapRooT®, one of the top-level paths you can follow can lead you to Natural Disaster as a possibility. We note that this doesn’t come up very often. When you go down this path, TapRooT® makes you verify that the problem was caused by a natural event that was outside of your control.
I have seen people try to select Natural Disaster because there was a rainstorm, and a leak in the roof caused damage to equipment inside the building. Using TapRooT®, this would most likely NOT meet the TapRooT® Dictionary® definition of Natural Disaster. In this case, we would want to look at why the roof leaked. There should have been multiple safeguards in place to prevent this. We might find that:
The roofing material was improperly installed.
We do not do any inspections of our roof.
We have noted minor water damage before, but did not take action.
We have deferred maintenance on the roof due to budget, etc.
Therefore, the leaky roof would not be Natural Disaster, but a Human Performance issue.
The case of the meteorite strike, however, is a different issue. There are no reasonable mitigations that an organization can put in place that would prevent injury due to a meteorite. This is just one of those times that you verify that your emergency response was appropriate (Did we call the correct people? Did medical aid arrive as expected?). If we find no issues with our response, we can conclude that this was a Natural Disaster, and there are no root causes that could have prevented or mitigated the accident.
Many industries have dropped into a recession or a downright depression.
Oil, coal, iron ore, gas, and many other commodity prices are at near term (or all time) lows.
When the economy goes bad, the natural tendency is for companies to cut costs (and lay people off). Of course, we’ve seen this in many industries and the repercussions have been felt around the world.
Since many of our clients are in the affected industries, we think about how we could help.
If you could use some help … read on!
I think the first way we can help is to remind TapRooT® Users and management at companies that use TapRooT® that in hard times, it is easy for employees to hear the wrong message.
What is the wrong message?
Workers and supervisors think that because of the tough economic times, they need to cut corners to save money. Therefore, they shortcut safety requirements.
- A mechanic might save time by not locking out a piece of equipment while making an adjustment.
- An operator might take shortcuts when using a procedure to save time.
- Pre-job hazard analyses or pre-job brief might be skipped to save time.
- Facility management might cut operating staff or maintenance personnel below the level needed to operate and maintain a facility safely.
- Supervisors may have to use excessive overtime to make up for short staffing after layoffs.
- Maintenance may be delayed way past the point of being safe because funds weren’t available.
These changes might seem OK at first. When shortcuts are taken and no immediate problems are seen, the decision to take the shortcut seems justified. This starts a culture shift. More shortcuts are deemed acceptable.
In facilities that have multiple Safeguards (often true in the oil, mining, and other industries that subscribe to process safety management), the failure of a single Safeguard or even multiple Safeguards may go unnoticed because there is still one Safeguard left that is preventing a disaster. But every Safeguard has weaknesses and when the final Safeguard fails … BOOM!
This phenomenon of shortcuts becoming normal has a PhD term … Normalization of Deviation.
The result of normalization of deviation? Usually a major accident that causes extensive damage, kills multiple people, and ruins a company’s reputation.
So, the first thing that we at System Improvements can do to help you through tough times is to say …
This could be happening to your operators, your mechanics, or your local management and supervision.
When times are bad you MUST double up on safety audits and management walk arounds to make sure that supervisors and workers know that bad times are not the time to take shortcuts. Certain costs can’t be cut. There are requirements that can’t be eliminated because times are tough and the economy is bad.
When times are tough you need the very BEST performance just to get by.
When times are tough, you need to make sure that your incident investigation programs and trending are catching problems and keeping performance at the highest levels to assure that major accidents don’t happen.
Your incident investigation system and your audit programs should produce KPIs (key performance indicators) that help management see if the problems mentioned above are happening (or are being prevented).
If you aren’t positive that your systems are working 100%, give us a call (865-539-2139) and we would be happy to discuss your concerns and provide ideas to get your site back on the right track. For industries that are in tough times, we will even provide a free assessment to help you decide if you need to request additional resources before something bad happens.
Believe me, you don’t want a major accident to be your wake up call that your cost cutting has gone too far.
How would you like to save time and effort and still have effective root cause analysis of small problems (to prevent big problems from happening)?
For years I’ve had users request “TapRooT®-Lite” for less severe incidents and near-misses. I’ve tried to help people by explaining what needed to be done but we didn’t have explicit instructions.
Last summer I started working on a new book about using TapRooT® to find the root causes of low-to-medium risk incidents. And the book is now finished and back from the printers.
- The book is only 50 pages long.
- It makes using TapRooT® easy.
- It provides the tools needed to produce excellent quality investigations with the minimum effort.
- It will become the basis for our 2-Day TapRooT® Root Cause Analysis Course.
When can you get the book? NOW! Our IT guys have a NEW LINK to the new book on our store.
By April, we should have our 2-Day TapRooT® Course modified and everything should be interlinked with our new TapRooT® Version VI Software.
In hard economic times, getting a boost in productivity and effectiveness in a mission critical activity (like root cause analysis) is a great helping hand for our clients.
The new book is the first of eight new books that we will be publishing this year. Watch for our new releases and take advantage of the latest improvements in root cause analysis to help your facility improve safety, quality, and efficiency even when your industry is in tough economic times. For more information on the first of the new books, see:
If you need help, give us a call. (865-539-2139)
Are you having a backlog of investigations because of staff cuts? We can get you someone to help perform investigations on a short term basis.
Need to get people trained to investigate low-to-medium risk incidents effectively (and quickly)? We can quote a new 2-Day TapRooT® Root Cause Analysis Course to be held at your site.
Need a job because of downsizing at your company? Watch the postings at the Root Cause Analysis Blog. We pass along job notices that require TapRooT® Root Cause Analysis skills.
This isn’t the first time that commodity prices have plummeted. Do you remember the bad times in the oil patch back in 1998? We helped our clients then and we stand by to help you today! We can’t afford to stop improvement efforts! Nobody wants to see people die to maintain a profit margin or a stock’s price. Let’s keep things going and avoid major accidents while we wait for the next economic boom.
“You get what you ask for,” ever hear that phrase? Well, it is a good lead into root cause tip #1.
#1 Know why you are doing the root cause analysis but DON’T let the reason drive the root cause process and findings itself.
The quality of a root cause analysis report, or in many cases the amount of information contained in the report, is driven by the requirement for the root cause analysis itself.
- A. Government Agency Requirement
- B. Regulatory Finding Requirement
- C. Internal Company CEO/CFO Requirement
- D. Internal Company Policy Requirement
- E. Supervision Request but no policy requirement
Which one of the requirements above most likely requires a more extensive root cause analysis report, written in a very specific way? Most of us, by experience, would focus on items A-C. Besides the extensive amount of time it takes to produce the regulatory report, how could the report requirement become a driver for poor root cause analysis?
- Report writing drives the actual evidence collection.
- Terminology required in the report forces people to prioritize one problem over another, and in some cases ignore important information because it does not have a place in the report.
- Information is not included or addressed because the report is going to an outside organization.
If A-C root cause analysis requirements could lead to biased or incomplete root cause analyses because of the extensive regulatory requirements, then D-E should be better right? Well, not so fast.
- Less oversight of the root cause analysis report (if there is one) could result in less validated evidence or a list of corrective actions with limited support to substantiate them.
- There is often a higher variability of how the root cause analysis is performed depending on who is performing it and where they are performing it.
So how do you counter the problems of standardization versus non-standardization in root cause analysis? The easiest method is to use a guided investigation process and not drive the process itself. Once the root cause analysis is complete, then and only then focus on writing the report.
Below is a list of 7 points with a link to read more if needed that can help reduce bias and variability. 7 Secrets of Root Cause Analysis
- Your root cause analysis is only as good as the info you collect.
- Your knowledge (or lack of it) can get in the way of a good root cause analysis.
- You have to understand what happened before you can understand why it happened.
- Interviews are NOT about asking questions.
- You can’t solve all human performance problems with discipline, training, and procedures.
- Often, people can’t see effective corrective actions even if they can find the root causes.
- All investigations do NOT need to be created equal (but some investigation steps can’t be skipped).
This is just plain project management advice. If the team and process owner of the issue being analyzed believe that you as the root cause facilitator own the root cause analysis, guess what… You Do! It’s your evidence, your root causes, your corrective actions and your accountability for success or failure. This makes it easier for others to pass the buck, so to speak, and can also hamper the support that the facilitator needs to ensure an effective investigation.
In most cases the root cause analysis facilitator is just that, the facilitator of information. Keep it that way and establish ownership up front.
#3 As a team, define what finished means for the root cause analysis and if there is a turnover of the root cause analysis, ensure that ownership is maintained by the appropriate people.
Often the root cause analysis facilitators in my courses tell me that once the analysis portion is done at their company, the report is handed off to their supervision to make the actual corrective actions. This is not optimal in itself, and it should include a validation step handled by the root cause facilitator to ensure that the corrective actions match up to the original findings. The point, however, is that whatever “finished” is, and wherever a true handoff of information must occur, it needs to be established up front along with the ownership discussed in tip #2.
In TapRooT® Root Cause Analysis, the following would be great investigation steps to focus on with your team and peers when discussing what finished means. Hear more about these steps here.
- After Creating Summer SnapCharT® – Is the SnapCharT® thorough enough or do we need more interviews & data?
- After Defining Causal Factors – Are they at the right end of the cause-and-effect chain? Was a Safeguards Analysis conducted? Were all the failed safeguards identified as causal factors?
- After RCA and Generic Cause Analysis – Did they use their tools (Root Cause Tree®, Root Cause Tree® Dictionary, etc.)? Did they find good root causes? Did they find generic causes? Did they have evidence for each root cause?
- After Developing Corrective Actions – Use corrective action helper to determine effectiveness of corrective actions.
These 3 root cause tips were designed to reduce the barriers to good quality root cause analysis. Comment below if you have additional tips that you would like to pay forward.
Hello and welcome to this week’s root cause analysis tips column.
One of the questions I am asked often is “what should we investigate?”
The answer is it really depends on your company, your numbers, and your resources. I have some ideas, and these apply to anything, but I will use safety as an example.
First of all, your company may have a policy on what has to be investigated; for example, all lost time injuries or all recordable injuries. So you already know you are required to do those. But what if something is not required?
What I say is investigate as much as possible based on your numbers and your resources. If you work at a site that has 10 injuries a year but only 2 are recordable, if you have the resources to do all 10, I certainly would. It is likely the only difference between the 2 and the other 8 is……LUCK.
What if you have more than you can possibly investigate? Then you should do a really good job at categorization, and do investigations on the TRENDS. In other words, I would rather have you do one really good investigation on a trend than dozens of sub-standard investigations. You will use fewer resources but get better results.
How do you do an investigation on a trend? It is really very simple – instead of mapping out an incident with a SnapCharT®, you map out the process. You can leave the circle for the incident off the chart or you can make the circle the trend itself. The events timeline is simply the way the process flows from start to finish, and this is very easy to do if you understand the process. If you need help from the process owner, an SME, or employee, you can do that too. For conditions, you add everything you know about the process, as well as any data (evidence) available from the reports or other sources. You mark significant issues (the equivalent of causal factors) for things that you know have gone wrong in the past. You can take it a step further and also mark as significant issues things that COULD go wrong (think of these as potential causal factors). You then do your root cause analysis and corrective actions. This is not hard, it is just a different way of thinking.
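The categorization step described above can be sketched in a few lines of Python. This is only an illustration, not part of TapRooT®: the incident log, the category names, the `find_trends` function, and the `min_count` threshold are all assumptions made up for the example.

```python
from collections import Counter

# Hypothetical incident log: (incident id, category assigned during review).
# The categories and counts are invented for illustration only.
incidents = [
    (1, "slip/trip"),
    (2, "hand injury"),
    (3, "slip/trip"),
    (4, "vehicle"),
    (5, "slip/trip"),
    (6, "hand injury"),
]


def find_trends(records, min_count=2):
    """Return (category, count) pairs seen at least min_count times, largest first."""
    counts = Counter(category for _, category in records)
    return [(cat, n) for cat, n in counts.most_common() if n >= min_count]


trends = find_trends(incidents)
for category, count in trends:
    print(f"{category}: {count} incidents")
```

Each category that clears the threshold is a candidate for one trend investigation, with the process behind that category mapped out on the SnapCharT® instead of a single incident.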
Just a few more thoughts about what to investigate; basically, anything that is causing you pain. Process delays, customer complaints, downtime, etc. can all be investigated. But by all means, make sure it is worth your time and that there is really something to learn from it. Please don’t investigate paper cuts!
I hope my ideas give you some food for thought. Keep pushing the boulder up the hill and improving your business. Thanks for visiting the blog.
Sign up to receive tips like these in your inbox every Tuesday. Email Barb at firstname.lastname@example.org and ask her to subscribe you to the TapRooT® Friends & Experts eNewsletter – a great resource for refreshing your TapRooT® skills and career development.
While reading Sentinel Event Alert 55 (SEA-55) from TJC issued September 28, 2015 on Fall Prevention, it occurred to me that TapRooT® can be used to aid in finding the root causes of the fall. Even more importantly, TapRooT® can be used to aid in maintaining your fall prevention program to ensure long-term success. The TJC lists the following common contributing factors (in TapRooT® these would be called “Causal Factors“):
- Inadequate assessments
- Communication Failures
- Lack of adherence to protocols and safety practices
- Inadequate staff orientation, supervision, staffing levels and skill mix
- Deficiencies in the physical environment
- Lack of Leadership
While these are good guidelines for what to look for and what data to gather, to us these do not represent root causes. These 6 items almost match up with most of the 7 Basic Categories on the back of our Root Cause Tree®. So as TapRooT® investigators, know that you have to dig a bit deeper to find the true causes and define those at the Root Cause level, not at the causal or contributing level.
All this being said, the more important reason I wanted to write this article is to highlight the use of your TapRooT® tools for Proactive measures: how to examine and improve your fall management program and maintain continued success. Too many times we don’t think about the power of observation and the idea of raising awareness through communication. Each of these can be highlighted through the Proactive Process Flow below:
In SEA-55, two of the actions suggested by TJC were to 1) Lead an effort to raise awareness of the need to prevent falls resulting in injury and 2) Use a standardized, validated tool to identify risk factors for falls. These two items can benefit from the TapRooT® tools directly.
Starting with step 1 above in the Proactive Flow, use the SnapCharT® tool to outline the steps in patient assessment, highlight the steps that can or will affect the fall prevention portion of patient care, then use this flow as the basis for an observation program. By getting out and observing actual performance in the field you can do two things, show your concern for patient safety (and falls in this case) and gather actual performance data. These observations can be performed both in a scheduled and/or random fashion and can be done in any setting (ambulatory, non-ambulatory, clinic et cetera).
During the observation, document findings on the SnapCharT® and identify potential “Significant Issues” as they apply to fall prevention. This data can then be either evaluated using the Root Cause Tree® to define the areas of need for that single observation, or the data can be combined with other fall prevention observation data for use in an aggregate analysis or common cause analysis. With the aggregate analysis, data from multiple observations can be combined, and “Significant Issues” can be identified based on multiple observations before an analysis using the Root Cause Tree® is performed. This could give you an overall bigger picture view of your processes.
Once the RCA is performed (in either situation), Steps 5-7 can be simply followed to produce some recommended actions to be implemented and measured using Corrective Action Helper® and SMARTER. And the beauty of this Proactive process is that you have not waited for a fall to learn. You and your organization are preventing future issues before they manifest thus showing your patients and staff that you truly care about their safety.
If you would like to learn more about using your TapRooT® tools proactively you can contact me at Skompski@taproot.com for more information or you can attend any of our public seminars, 2-day or 5-day to learn more on both the reactive and proactive use of the TapRooT® tools!
A new investigator may believe that if an interviewee is telling the truth, he will be consistent in his recollection of an event every single time. However, not every inconsistent statement made by an interviewee is made to intentionally deceive.
In fact, most interviewees want to be helpful. Further, an inconsistent statement may be as accurate or even more accurate than consistent claims. That is, an account repeated three times with perfect consistency may be more of a red flag to dig deeper.
The two most important things to think about when evaluating inconsistencies are the passage of time between the incident and its recollection, and the significance of the event to the interviewee. Passage of time makes memory a bit foggy, and items stored in memory that become foggy the quickest are things that we don’t deem significant, like what we ate for lunch last Wednesday.
There are three ways to decrease the possibility of innocent inconsistent statements during the interviewing process.
- Encourage the interviewee to report events that come to mind that are not related or are trivial. In this way, you discourage an interviewee trying to please you by forcing the pieces to fit. They do not know about all the evidence that has been collected, and may believe that something is not related when it truly is.
- Tell the interviewee, explicitly, not to try to make up anything he or she is unsure of simply to provide an answer. If they don’t know, simply request they say, “I don’t know.” This will help them relax.
- Do not give feedback after any statement like “good” or “right.” This will only encourage the interviewee to give more statements that you think are “good” or “right”– and may even influence them to believe that some things occurred that really didn’t.
We have plans to go over many more details on how to conduct a good interview at the 2016 Global TapRooT® Summit, August 1 – 5 in San Antonio, Texas. Save the date and look for updates here.
For many years now the TJC and other governing bodies have required root cause analysis (RCA) on Sentinel events as well as analyses on near misses with high potential. To remain accredited, organizations have put together teams to perform analyses to find the causes and to recommend, implement and track corrective actions. Throughout this time of focus and effort there continue to be repeat sentinel events. So the question that arises is, why are these RCAs failing?
This question may appear very complex but the root of the problem is actually very simple. From reading many Event reports and examining how many organizations perform these analyses, two things stand out to me:
- Many analyses stop at too high a level due to a lack of information and do not reach true root causes. They stop at what we define as a Causal or Contributing Factor.
- Many corrective actions don’t address the root cause due to the limited analysis or because the corrective actions created are not specific to changing a particular behavior or system.
What truly makes this even simpler is the fact that these two issues are interrelated. If you don’t thoroughly gather the correct information and identify the true root causes, the corrective actions may not be focused enough to fix the problem. We will then fall into the trap of implementing general or employee-focused corrective actions that don’t address system problems. This can result in wasted time and resources and can have a very negative impact on the people involved in the event.
Here you see an example where the investigator stopped gathering data at a Causal or Contributing Factor.
In this example there was a mistake made by the nurse when retrieving a medication for a particular patient. With no additional information gathered, the investigator is forced to stop at this level. No more analysis can be performed without many assumptions and opinions being used. In this case, when the team moves to corrective actions, how do you fix someone retrieving the wrong medication? Well, without any additional information, we counsel the employee to be more careful, we punish the nurse for making the wrong choice, and/or we retrain everyone to make sure there is an overall understanding of this issue. None of these truly change the system and address the causes of the issue (as you will see below).
If the investigator gathers much more information on the issue there is at least a chance to more thoroughly examine the issue using your RCA tools and dig deeper to a root cause level.
Having this additional data available allows the investigator to dig deeper into the issue to identify the underlying system root causes that contribute to this mistake by the nurse. This changes the focus to the organizational systems and not solely on the individual. Knowing that it has become common practice during high census to not follow the second check rule (or 5 Rights), and that management has not consistently provided negative consequences for this, we would be able to identify system-related causes such as Management System -> SPAC Not Used -> Enforcement NI (from the Root Cause Tree®) and other causes. By getting to this level of analysis and understanding the system cause(s), we can now build corrective actions to address specific system issues. By addressing the specific causes, and in this case changing the rules around times with a higher than normal census, the requirements for following this policy, and the consequences for not following it, we are changing the systems in the organization. By changing the systems, we can enact long-lasting positive change in the organization, build sustained success, and change the behaviors of our employees.
Can You Use One Root Cause Analysis Tool for Quality, Safety, Production, IT, Cost, and Maintenance Issues?
December 2nd, 2015 by Chris Vallee
The disagreement over which root cause analysis tools are used by whom actually starts with the creation of internal company functional silos. Companies that run smoothly, almost transparently as one unit, realize that Quality, Safety, Production, IT, Cost, and Maintenance Departments impact each other, either positively or negatively, and should use similar tools during root cause analysis to enhance root cause analysis communication between departments. Unfortunately, this unity is not common. Let’s break it down a little.
IT (Information Technology) – often focuses on rapid root cause diagnosis and analysis.
Quality – tends to focus on the 8 Basic Quality Tools and Lean Activities with different variations in the sequence of root cause analysis.
Safety – focuses on root cause analysis tied to hazard and risk to reduce Health, Safety and Environment Issues.
Production – is probably the closest tied to quality and cost reduction issues, whereas safety is more often viewed as cost aversion. The problem solving tools utilized here are often tied to the Quality and Cost root cause analysis tools to ensure production is met and the company makes money.
Maintenance – is focused on operational efficiencies and cost to run and maintain the equipment. Often tied to Quality and Production root cause analysis tools but more tied to equipment specifics.
Cost – everybody needs to know where the money goes and if we have enough to keep the business alive. The financial knowers in the company get tasked by many of the departments listed above, some departments more than others. Their root cause analysis tools are more tied to transactional processes.
Now that the different company functions listed above are established, what often happens next is that the department leaders search for root cause analysis tools created just for their types of problems and the silo walls between departments get even bigger. Why? Simple, the specific function tools often only look for issues and causes tied to their specific issues. So what’s wrong with that you ask?
Input – Process – Output across your Company’s Work Processes
What each functional department changes or produces impacts another department either upstream or downstream from that department. Root cause analysis tools that are too functionally specific tend to not explore or encourage multi-department discussions during root cause analysis. If the tools don’t talk your language, they do not apply to you or in some cases, the company does not think you need to be trained in the other tools.
Case in point, as a lean six sigma black belt in a previous company, I spent my last year in manufacturing mentoring our safety department in quality tools. No one from safety had ever attended our quality training that we taught internally, even though we taught classes every month. Operations and Maintenance employees attended the training because they were more tied to the return on investment company costs.
Break the silo department barriers. Look for a root cause analysis process that can tie Quality, Safety, Production, IT, Cost, and Maintenance Department issues together to help solve problems as one.
For over 28 years, System Improvements has prided itself in having a standard root cause analysis process called TapRooT®. No matter what problem is being analyzed, every investigation starts with Defining the Worst Consequence that Occurred, Identifying What Happened and How It Happened and then Why. We also teach and include Corrective Actions that are global industry best practices.
Don’t fret: we don’t recommend that you throw away your other data collection and analysis tools. Instead, we recommend that you use the TapRooT® Root Cause Analysis Method as the standard communication and investigation tool for the root cause process to enhance and consolidate current programs for one company vision. After all, everything has a timeline of events or a sequence of transactions. Start your problem solving with a proven root cause analysis process that starts there first and then helps guide employees through multiple types of problems to help you understand Human Performance
Once you’ve gathered all the information you need for a TapRooT® investigation, you’re ready to start with the actual root cause analysis. However, it would be cumbersome to analyze the whole incident at once (like most systems expect you to do). Therefore, we break our investigation information into logical groups of information, called Causal Factor groups. So the first step here is to find Causal Factors.
Remember, a Causal Factor is nothing more than a mistake or an equipment failure that, if corrected, could have prevented the incident from happening (or at least made it less severe). So we’re looking for these mistakes or failures on our SnapCharT®. They often pop right off the page at you, but sometimes you need to look a little harder. One way to make Causal Factor identification easier is to think of these mistakes as failed or inappropriately applied Safeguards. Therefore, we can use a Safeguard Analysis to identify our Causal Factors.
There are just a few steps required to do this:
First, identify your Hazards, your Targets, and any Safeguards that were there, or should have been there.
Now, look for:
- an error that allowed a Hazard that shouldn’t have been there, or was larger than it should have been;
- an error that allowed a Safeguard to be missing;
- an error that allowed a Safeguard to fail;
- an error that allowed the Target to get too close to a Hazard; or
- an error that allowed the Incident to become worse after it occurred.
These errors are most likely your Causal Factors.
Let’s look at an example. It’s actually not a full Incident, but a VERY near miss. This video is a little scary!
Let’s say we’ve collected all of our evidence, and the following SnapCharT is what we’ve found. NOTE: THIS IS NOT A REAL INVESTIGATION! I’m sure there is a LOT more info that I would normally gather, but let’s use this as an example on how to find Causal Factors. We’ll assume this is all the information we need here.
Now, we can identify the Hazards, Targets, and Safeguards:
|Pedestrians (they could have stayed off the tracks)|
Using the error questions above, we can see that:
- An error allowed the Hazard to be too large (the train was speeding)
- An error allowed the Targets to get too close to the Hazard (the Pedestrians decided to go through the fence, putting them almost in contact with the Hazard)
These 2 errors are our Causal Factors, and would be identified like this:
We can now move on to our root cause analysis to understand the human performance factors that led to this nearly tragic Incident.
Causal Factors are an important tool that allows TapRooT® to quickly and accurately identify the root causes of Incidents. Using Safeguard Analysis can make finding Causal Factors much simpler.
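To make the Safeguard Analysis steps concrete, here is a minimal sketch in Python. It is not part of the TapRooT® software; the `ErrorQuestion`, `ChartItem`, and `find_causal_factors` names are hypothetical, and the first chart entry is filler added for illustration. It simply tags each item from the example with the error question it matches (if any) and keeps the tagged items as likely Causal Factors.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ErrorQuestion(Enum):
    """The five error questions listed in the Safeguard Analysis steps."""
    HAZARD = "allowed a Hazard that shouldn't have been there, or was too large"
    SAFEGUARD_MISSING = "allowed a Safeguard to be missing"
    SAFEGUARD_FAILED = "allowed a Safeguard to fail"
    TARGET_TOO_CLOSE = "allowed the Target to get too close to a Hazard"
    INCIDENT_WORSENED = "allowed the Incident to become worse after it occurred"


@dataclass
class ChartItem:
    """One event or condition from a chart (hypothetical structure)."""
    description: str
    question: Optional[ErrorQuestion] = None  # None: no error question applies


def find_causal_factors(items: List[ChartItem]) -> List[ChartItem]:
    """Keep only the items that match one of the five error questions."""
    return [item for item in items if item.question is not None]


snapchart = [
    ChartItem("Train approached the pedestrians"),  # filler entry, not an error
    ChartItem("Train was speeding", ErrorQuestion.HAZARD),
    ChartItem("Pedestrians went through the fence", ErrorQuestion.TARGET_TOO_CLOSE),
]

causal_factors = find_causal_factors(snapchart)
for cf in causal_factors:
    print(f"Causal Factor: {cf.description} (an error that {cf.question.value})")
```

The investigator still decides which question, if any, applies to each item. The code only records that decision so the resulting list can be reviewed.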
Sign up to receive tips like these in your inbox every Tuesday. Email Barb at email@example.com and ask her to subscribe you to the TapRooT® Friends & Experts eNewsletter – a great resource for refreshing your TapRooT® skills and career development.
Here’s scenario #1:
An incident occurs.
The supervisor performs a 5-Whys analysis, or maybe just does a few interviews with a few employees out on the plant floor. The supervisor collects just enough information to fill out the company report, or to satisfy his manager, because this is a task done in his spare time. Once someone or something is found to pin the cause on, the supervisor thinks of a solution (typically an employee gets disciplined or a piece of equipment gets fixed), and the root cause analysis is complete.
The downside to doing root cause analysis in your spare time like this is you’ll probably see repeat incidents. You’ll miss root causes or not get to the root. So, instead of saving time doing the investigation in your spare time, you have created more work. Plus, you are working within your own knowledge. You may be very experienced, but a bias (and we all have them) can cause you to overlook important information. Also, morale will be affected because employees do not want to live under the fear of punishment if they make a mistake. And let’s not forget when near misses and small problems aren’t solved, chances are a major incident is building on the horizon. Don’t let your facility be the next headline!
Here’s scenario #2:
An incident occurs.
The supervisor performs a TapRooT® investigation in his or her spare time. Her company does not have a blame culture: hooray! She only had time to attend one day of a 2-day TapRooT® course, but the former supervisor showed her the basics. The supervisor uses the Root Cause Tree® as a “pick list” (without using a Root Cause Tree® Dictionary to dig deeper; she is not even aware there is a dictionary) until one root cause and a couple of causal factors are found. Sigh of relief. Corrective actions to the root cause are implemented. Check! This root cause analysis is complete!
The downside to this TapRooT® “spare time” root cause analysis is similar to scenario #1 in that you will probably experience repeat incidents because you’ll miss root causes that won’t be fixed, and there was not sufficient training on the TapRooT® tools. You may progress beyond your own knowledge in identifying root causes using the Root Cause Tree® and that’s a plus, but you may not be casting a wide enough net by using all of the tools in the TapRooT® system. Take shortcuts and don’t use all the tools available to you, and you will lose the power of TapRooT® to effectively guide you in your root cause analysis to find and fix incidents.
Don’t be that supervisor!
To get the full benefit of TapRooT®, join us at a course to receive all of these tools and understand how to use them:
SnapCharT® – a visual technique for collecting and organizing information to understand what happened.
Root Cause Tree® – a way to see beyond your current knowledge (with additional help from the Root Cause Tree® Dictionary)
Corrective Action Helper® – a tool to help you think “outside the box” to develop effective corrective actions.
Safeguard Analysis – identify and confirm causal factors
This is how you find all the root causes and fix them once and for all. Smaller problems are also found before they turn into major disasters. It’s a win for everyone!
Are you doing spare time root cause analysis? There is still time to join us for a course in 2015 and make 2016 a different story.
All Root Cause Analyses started have an initial goal…
Reduce, Mitigate or Eliminate a Problem!
As TapRooT® trained root cause analysis investigators soon learn, there is usually more than one Causal Factor that caused the Incident being investigated, and each Causal Factor has more than one Root Cause. If this sounds foreign to you as an investigator, check out our TapRooT® Root Cause method here. So if problems do not occur in isolation, why should the investigator work in isolation? Thus, the topic of today, “Peer Feedback to Improve Root Cause Analysis.”
Previously we discussed real-time peer review during the TapRooT® investigation process, and reviewing a completed TapRooT® investigation looking for the “Good, Bad and the Ugly,” with a spreadsheet audit included.
Root Cause Analysis Video Tip: Conduct Real-Time Peer Reviews
The Good, The Bad & The Ugly
So what’s next? Are peers created equal? What value can a peer add? What value does the peer get from giving feedback about a root cause analysis? Let’s see….
Peers are not created equal! This is a good thing. Below is a short list of peers to get feedback on your root cause analysis progression and the value that they add.
1. Coach/Mentor: This is a person who is competent and formally trained in the root cause process that you are using. They are not teaching you the process but guiding you through your use of it after you were formally trained. They have been in the trenches and dealt with the big investigations. These process champions can easily get you back on the right track and show unique techniques.
Too many companies get large numbers of employees trained in a process and then let them run free without future guidance or root cause analysis feedback. This is why our TapRooT® Instructors are available for process questions after training is complete. This is also why we encourage key company employees to attend our 5-Day TapRooT® Advanced Root Cause Analysis Team Leader Training to help mentor those that have taken our 2-Day TapRooT® Incident Investigation and Root Cause Analysis Course.
2. Equal: This is the person who has attended the same root cause analysis training that you have and has the same level of competence. They may also have the same industry technical experiences that you have.
The value of the feedback from this person is to keep each other grounded in the process you are using and to help validate that the evidence received is substantiated. It is very easy sometimes to start pushing any root cause process into one’s biased direction once the energy gets flowing. The trained TapRooT® investigator and peer will remind you to slow down and let the TapRooT® process guide you to the root causes.
3. Novice: There are two types of novices to get feedback from, one that is not trained in the TapRooT® process and one that is not familiar with the investigation or process being investigated.
There is a natural tendency that the more you know about the process you are investigating, the less that you put down on paper. After all, everyone knows how that thing works or what happened. Why do I need to write it down? Simple… “What does not get written down does not get investigated!” As the novice asks you more questions to understand the root cause analysis that you are performing, the more you explain and the more you write down.
4. Formal Auditor: The formal auditor usually audits the root cause analyses after they are completed and the corrective actions have been implemented. There is less communication and engagement between you and the auditor, which is very different than the first peers listed above.
The value of this feedback is that it is designed to look for consistency and standardization across multiple root cause analyses. The auditor may find investigations that need to be recalibrated but may also find new and better ways of doing an investigation based on others’ unique techniques. We also encourage auditors to have taken our 2-Day Advanced Trending Techniques Course, to help look for trends.
The final plus for this feedback activity…..
“Everyone learns something and recalibrates their Root Cause Analysis Techniques and we all help meet the goal of Reduce, Mitigate or Eliminate a Problem!”
The new EPA emission regulation (not yet published in the Federal Register, but available here) requires a root cause analysis and corrective actions for upset emission releases, including flare events.
Not only is a root cause analysis with corrective actions required, but a second event from the same equipment for the same root cause would trigger a deviation from the standard (read “fine”). In addition, the same device with more than 3 events per 3 years or the combination of 3 releases becomes a deviation.
This means it is time for effective, advanced root cause analysis of emission events. Time to send your folks to TapRooT® Root Cause Analysis Training!
Many people ask, “What makes a good RCA?” This question as stated is difficult to answer because “good” is a very subjective term. What is good for one may not be good enough for another, and vice versa. But if we replace one simple word in that question, we can make it much more objective. By changing that term to “credible” and/or “effective,” we have a good starting point, as both of these terms have investigative standards and principles behind them. Let’s start with some definitions:
Credible: This term is defined as “able to be believed; convincing.” Let’s focus on an investigation for our example and ask what would make our investigation able to be believed. One simple answer comes to mind: the ability to see the relationship between our Root Causes, our Causal Factors, and our Incident. That “Specific” relationship, as we call it, depends on the data collected in an investigation and on your audience’s ability to connect those “dots,” if you will.
Effective: This term is defined as “successful in producing a desired or intended result.” This focuses on the outcome of an action and what the desired result or end point is. For investigations, the desired result is to implement fixes and Corrective Actions that will reduce or remove the risk of a recurrence. The audience’s ability to see the effectiveness of the Corrective Actions is key.
So if we add both these words together and use them in combination to define an investigation we can now see how to answer the initial question.
Credible Root Cause Analyses
Let’s begin with the word credible and provide some guidance for our TapRooT® Users. When I look at and review any investigation the credibility is established for me in two techniques, the SnapCharT® and the Root Cause Tree®.
Let me put this as simply and as plainly as I can: when building your chart, the team should put ALL information into that SnapCharT®. No matter how insignificant something may seem, or how commonplace something may be, it should be on the chart for transparency and for use during the analysis on the Root Cause Tree®. Anytime you make a conscious effort to leave information off the chart, you open yourself up to questions and you reduce the ability of your audience to “connect the dots” as mentioned above. This lowers your credibility significantly.
This can also lead to issues when your audience tries to understand the relationship between the Root Causes you have chosen on the Root Cause Tree® and the information on the SnapCharT®. This relationship should be as “transparent” as possible and the audience should not have to work to figure out the relationship. There should be a direct link between data on the chart and the Root Causes from the Root Cause Tree®.
Root Cause Tree®
Once the thorough “transparent” SnapCharT® is completed and the investigation moves into the Root Cause Tree® to analyze your Causal Factors, documentation is the key to credibility. Three statements that can kill credibility are: “I believe,” “I think,” and “I am pretty sure.” Each one of these statements gives your audience doubt as to what you truly know. This is why I always recommend the use of the Root Cause Tree® Dictionary and Analysis Comments in the Root Cause Tree® for documentation. This provides the connection and the defendable path for you and your audience.
As the Tree is analyzed the investigation should have data from the SnapCharT® to confirm each selection on the Root Cause Tree® as well as one or many questions answered as a yes from the Dictionary. Take that data (cut and paste) and put that into the Analysis Comments in the TapRooT® Software to document “why” you answer yes, and to show the audience your reasoning.
We have explored the “Credibility” of analyses so now we need to look at the Effective portion. We concluded that this measure is tied to the effectiveness of the Corrective Actions we present and implement. An analysis by itself cannot be effective without corrective and preventative actions that solve the Root Causes and prevent or reduce the likelihood of recurrence in the future.
When developing our Corrective Actions for the Root Causes we find during our analyses we have to consider the following items for each action:
Implementation: The act of putting the specific action in place in our systems and organization
Verification: This is a short-term measure of implementation. How are we going to ensure that what we proposed as the Corrective Action was implemented properly?
Validation: This is a long-term measure of effectiveness. This plan is based around the question, “What will success look like?” built with a plan to measure the progress (or regression) towards that outcome.
Most companies do a pretty effective job of the Implementation phase, implementing actions for every root cause. But in follow-up to these actions they do nothing; seemingly they wash their hands of the issue and say they are done. Implementation by itself does not ensure success. The two measurements above are very important because they provide some level of oversight for the actions and are a quality control check to make sure the actions hit the mark. If for any reason the Validation shows that the action is not having the desired effect, the action needs to be revisited and revised if necessary, starting the cycle again.
If Corrective Actions are implemented and not measured you increase the risk of the implementation falling short, or the action itself not actually having a positive impact on your systems and employees.
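One way to keep Verification and Validation from being forgotten is to track all three items for every action. The following is a minimal sketch, assuming a simple in-house tracking record; the `CorrectiveAction` class, its field names, the status labels, and the example action are all hypothetical and are not part of the TapRooT® software.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CorrectiveAction:
    """One Corrective Action tracked through the three items described above."""
    description: str
    implemented: bool = False         # Implementation: the action is in place
    verified: bool = False            # Verification: short-term check of implementation
    validated: Optional[bool] = None  # Validation: long-term effectiveness (None = not measured yet)

    def status(self) -> str:
        """Report where this action sits in the cycle (labels are assumptions)."""
        if not self.implemented:
            return "not implemented"
        if not self.verified:
            return "implemented, awaiting verification"
        if self.validated is None:
            return "verified, awaiting validation"
        if self.validated:
            return "validated"
        # Validation shows the action is not having the desired effect.
        return "revisit and revise"


action = CorrectiveAction("Revise the work instruction")  # hypothetical action
action.implemented = True
print(action.status())  # the action is in place, but nobody has checked it yet
```

An action that stops at "implemented, awaiting verification" is exactly the case described above: put in place, then never measured.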
In the end, the credibility of your analysis is dependent on the data you collect, the quality (not quantity) of the data, and how it is used to confirm any answers found on the Root Cause Tree®. The effectiveness is dependent on the success of the corrective actions implemented and the longer-term sustained success of the changes in the system to stop future recurrence. By following the 7-Step Process flow and the Core techniques highlighted here, the TapRooT® system will guide you through these steps and aid you in successfully providing your management with a very Credible and Effective Investigation.
Want to learn more?
Our 5-Day TapRooT® Advanced Root Cause Analysis Team Leader Training provides all of the essentials to perform a root cause analysis plus advanced techniques. You also receive a single user copy of TapRooT® software in the 5-day course. The software combines incident identification, analysis, and dynamic report writing into one seamless process.
We’re going to start off with a session that lets our TapRooT® Users Share Best Practices with you. What better way to learn about best practice ideas than to get them directly from other users who are successfully implementing at their company? This session is always one of the most useful, hands-on sessions at our Summit.
Are your investigations any good? Mark Paradies is going to run a session on how to properly Grade Your Investigations. He’ll give you industry best-practice ideas on what really matters in an investigation. This isn’t just a high-level thought experiment; you’ll walk away with a grading sheet that you can customize for your company that will allow you to put an actual number grade on each investigation. This is a great way to improve each and every investigation.
Are you ready to implement TapRooT® at your company, but not sure how to get started? Ed Skompski will be leading a session on How to do a New Implementation of TapRooT®. We’ve helped many companies figure out what they need to do for a successful, long-term TapRooT® implementation, and he’s ready to share those ideas with you. Wouldn’t you love to have an implementation checklist?!
Barb Phillips is going to have a couple of sessions on individual investigator techniques. Interviewing Behaviors and Body Language will help you learn advanced skills to perform more effective interviews. The Investigation Team Leader session will give you pointers on how to lead more complex investigations from the perspective of the lead facilitator. These are both going to be terrific sessions for all TapRooT® investigators.
Wouldn’t it be nice to ask all the right questions at your very first interview? In addition to Barb’s topic on body language, we’re going to look even more deeply at how to use The 15 Questions To Identify Interview Topics. This will help you better plan your interviews to ensure you minimize the number of follow-up interviews required during your investigations.
We’ve had many requests to speak in more detail about the intricacies of a few of the Root Cause Tree® Basic Cause Categories. This year, we are going to Deep Dive into Procedures and Management Systems. We’ll show you where some of these root causes come from, and how to recognize them during your investigations.
I’m really excited about my track this year. If you are a TapRooT® investigator, you will not want to miss these sessions. Plan on being in San Antonio next year; don’t miss out on this opportunity!!
Learn more: http://www.taproot.com/taproot-summit/
How important is root cause analysis to your business? Before you answer that, think about your completed investigations to find out if your efforts are working. Here are three indicators that root cause analysis (RCA) is not important to your business.
- The RCA stops short. There are many reasons an RCA will stop short of finding the real root causes. Political expediency, lack of valuable training, and failure to base the RCA on facts and evidence are high on the list. When the investigation stops after the first root cause is found, the problem will occur again because other paths to the problem were not identified and corrected. Does your RCA system offer a way to examine a complete set of events and conditions so you never stop short? Does it offer a way to document all of this efficiently?
- Weak corrective actions. Are you finding solutions that fix the problem or are you simply treating a symptom of the problem? It’s easy to tell. If the incident happens again after corrective actions were implemented, your corrective actions are only treating symptoms. Your system may not be offering you well-developed definitions or giving you the questions to ask so that you are not forced to rely on your own opinions and knowledge for fixes. Your system of root cause analysis is wasting your time. A good RCA system clearly guides you to define the problem, measure and analyze it, and develop effective corrective actions.
- Lessons not learned. Is your organization learning from past incidents? Remembering what has occurred and how it was fixed will help your organization stay proactive. A focused set of root causes and effective corrective actions are important, but don’t forget the lesson. A solid RCA system will help you identify generic causes. When you correct a generic cause, you’ll prevent problems from occurring across your organization. Less work for you, more success for your business!
If you’ve noticed that your facility is running into one or more of the problems above, it’s time to consider training to make RCA more important to your business. RCA is mission critical knowledge worth the investment in training. Is the cheap answer working for you? Remember, you get what you pay for.
There is still time to grab a seat in our 2-Day TapRooT® Incident Investigation & Root Cause Analysis courses where you will learn ALL of the essentials to conduct a root cause analysis in just two days. (CLICK HERE to learn more or register.)
And we have a few seats left in our 5-Day TapRooT® Advanced Root Cause Analysis & Team Leader Training courses where you will learn all of the essentials and advanced techniques PLUS receive a copy of our root cause software that will help you every step of the way. (CLICK HERE to learn more or register.)
I had a great conversation with one of our clients today. He mentioned that, with the price of oil below $50/barrel, his company is being proactive and looking at ways to improve their processes. One thing that they’re doing is reviewing old incidents and seeing where the commonalities lie. One item of interest that they found: They have discovered several instances of repeat problems. They found root causes, but they seem to pop up again.
What they are doing is what we call a Generic Cause analysis. They are looking deeper at their data and finding opportunities for improvement.
A review of the corrective actions found quite a few that seemed to be more akin to immediate actions than corrective actions. For example, if there was a production shutdown that was caused by a failed valve, the corrective action was, “Remove and repair the faulty valve.” Now, I’m not saying that this is necessarily a bad idea. However, is this corrective action aimed at preventing future recurrence of the incident? This corrective action, by itself, is designed more to restore plant operation, not prevent future issues.
Remarkably, this turned out to be more a terminology issue than a failure of their system. When asked about this, the Corrective Action team said, “Oh, you want actions that will prevent the incident in the future? Then you aren’t asking for ‘corrective actions’; you are asking for ‘preventative actions.’ You have to be clear what you want.”
TapRooT® does not really distinguish between these types of fixes. We expect all of these actions to be developed. We find that the “corrective actions” noted above are designed to fix the Causal Factor (“Valve failed to open. Therefore, repair the valve.”). This is as far as most other “root cause analysis” systems go, since they normally only get to the causal factor level.
While you DO need to fix these issues, we also want you to continue to assign corrective actions to actual root causes. Suppose the valve failed because the repair procedure specified the incorrect part (in TapRooT®, this root cause would be “facts wrong”). You would then put a corrective action in place to fix this human-performance problem (“Update repair standard to indicate the correct gasket part number 885-33425”). This will fix not just this particular valve, but any valves in the future that are repaired using this same repair procedure. Without this corrective action, you will see this same issue pop up again as we continue to improperly repair valve failures.
Make sure you and your investigation teams are all on the same page. We use the term “Corrective Action” to indicate any and all actions that are designed to fix problems. Corrective Actions include those actions that fix the general issue (failed valve), and those that are designed to prevent the issue from occurring again in the future (procedure wrong). Take a look at your systems and make sure you are fixing both types of problems.
What methods do you use to trend your accident data?
Is a decrease in lost time injuries seen as a success?
Is an increase in injuries a failure and time for drastic action?
Are you using bar graphs or a simple line graph to look for trends?
How much change in the stats is enough for you to tell management that significant change has occurred?
That’s what I learned about 20 years ago and have been perfecting ever since.
How can you learn them too?
First, you can read Chapter 5 of the TapRooT® Book.
Second, you can study advanced trending and learn to use XmR Charts to trend accident data (both frequently occurring data and infrequently occurring data).
Third, you can attend SI’s TapRooT® Advanced Trending Techniques Course. You can hold one on your site for your people or the public course we hold each year just before the TapRooT® Summit.
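As a taste of what an XmR Chart does, here is a minimal sketch in Python of the standard textbook calculation; it is not taken from the course material. It uses the usual XmR scaling constants for two-point moving ranges (2.66 for the individual values and 3.268 for the moving ranges). The function name `xmr_limits` and the monthly injury counts are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def xmr_limits(values: List[float]) -> Dict[str, float]:
    """Compute the center line and limits of an XmR (individuals and moving range) chart."""
    if len(values) < 2:
        raise ValueError("An XmR chart needs at least two data points")
    # Moving range: absolute difference between each pair of consecutive values.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = mean(values)
    mr_bar = mean(moving_ranges)
    return {
        "center": x_bar,
        "upper_limit": x_bar + 2.66 * mr_bar,
        # For count data a negative lower limit is usually treated as "no lower limit".
        "lower_limit": x_bar - 2.66 * mr_bar,
        "mr_upper_limit": 3.268 * mr_bar,
    }


# Hypothetical monthly injury counts, for illustration only.
monthly_injuries = [4, 6, 5, 3, 7, 5, 4, 6]
limits = xmr_limits(monthly_injuries)

# A point outside the limits suggests a significant change rather than routine variation.
signals = [v for v in monthly_injuries
           if v > limits["upper_limit"] or v < limits["lower_limit"]]
print(limits)
print("Signals:", signals)
```

The point of the limits is to answer the question asked above: a month-to-month change that stays inside them is probably routine variation, not a trend to report to management.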
When is the next public course?
On August 1-2, 2016 in San Antonio, TX. That’s a long time but it is the first public course. Mark your calendar to save the dates.
What will you learn at the TapRooT® Advanced Trending Techniques Course?
See the course outline here: http://www.taproot.com/courses#2-day-adv-trending
Also, read this article about picking targets for improvement to learn a little more before you attend a course:
And if you are interested in learning more about advanced trending, ask us about having a course at your site. Contact us by CLICKING HERE.
This week I would like to ask the question…what is the difference between a safety incident and a quality problem?
Before you answer that, let me tell you that this is a trick question.
The answer is……drum roll please: there is NO DIFFERENCE. The difference in a safety problem vs. a quality problem is the consequence; there is no difference in the approach you take in investigating.
In TapRooT®, the first thing we always do is to create a SnapCharT®. And the first thing we do when creating a SnapCharT® is to define the incident with a circle. This defines the scope of your investigation. Your circle could contain anything that creates pain for your company and that you would like to prevent from happening again. Examples of things that might go in your circle:
• Lost time injury
• Recordable injury
• Vehicle accident
• Facility damage
• etc. etc.
• Defective product (not sent to customer)
• Defective product (sent to customer)
• Customer complaint
• Delayed shipment
• etc. etc.
Once you have defined the incident, you map out what happened, define the causal factors, perform root cause analysis, and develop corrective actions.
So start thinking about different ways your company can use TapRooT®. I’ve mentioned Safety and Quality, but there are many more: equipment reliability, environment, security, project delays; the list is really endless.
The more ways you can use TapRooT®, the better ROI you will get from your training. I know from experience when different disciplines in an organization start speaking the same language, there are some great intangible benefits as well. So if you are a safety manager, drag your quality manager with you to training next time. You will be glad you did.
Thanks for visiting the blog and best wishes for your improvement efforts.
Would you like to receive tips like these in your inbox? Our eNewsletter is delivered every Tuesday and includes root cause tips, career development tips, current events and even a joke. Contact Barb at email@example.com to sign up for the TapRooT® Friends & Experts eNewsletter.
I know how this works. You get the notification that “something bad” happened, and you are assigned to perform a root cause analysis. Your initial reaction is, “There goes the rest of my week!”
However, there is no reason that a relatively simple analysis needs to take an inordinate amount of time. There are several things you can do to make sure that you can efficiently conduct the investigation, find solid root causes, and implement effective corrective actions. Here are a few ideas to help you make the process as smooth as possible.
1. The first thing that needs to be in place is a Detailed Investigation Policy for your company. When does an RCA need to be performed? What types of problems trigger an RCA? What is the decision-making chain of command? Who makes the notifications? Who is notified? Who will be on the team? All of these questions need to be easily answered in order to quickly get the process started. I have seen investigators receive notification of a problem over a week after the actual incident. By this time, evidence has been lost, key players are no longer available, and people’s memories have faded. All of this makes the investigation just that much harder. If you can streamline this initial decision-making and notification process so that the investigation can start within hours, you’ll find the actual investigation goes MUCH more smoothly.
2. Probably the biggest timesaver is to Be Proficient in the TapRooT® Process. We recommend you use TapRooT® at least once per month to maintain proficiency in the system. You can’t be good at anything if you only use it sparingly. I often hear people tell me, “Luckily, we don’t have enough incidents to use TapRooT® more than once per year.” Imagine if I asked you to put together an Excel spreadsheet using pivot tables, and you haven’t opened Excel since 2014! You’d have to relearn some key concepts, slowing you down. The same is true of an investigation process. If you only do an investigation once each year, you aren’t looking very hard for incidents. I’ll guarantee there are plenty of things that need to be analyzed. Each analysis makes you that much better at the process. Maybe go back to point #1 above and update your investigation trigger points.
3. When you actually get started on an investigation, the first thing you should do is Start A Spring SnapCharT®. This initial chart gets your investigator juices flowing. It helps you think about the timeline of the incident, identifying holes in your knowledge and questions you need to ask in order to fill those holes. It is the first step in the process. As soon as you get that initial phone call, start building your SnapCharT®!
4. Finally, although it is optional, The TapRooT® Software can really speed up your analysis. The SnapCharT® tool is extremely user friendly, and the Root Cause Tree® Dictionary is only a right-click away. It guides you through the investigation process so you don’t have to try to remember where you’re going.
You won’t perform an investigation in 5 minutes. However, by following these tips, you can move through the process relatively quickly and efficiently, with terrific results.
To learn more about learning all of the essential techniques to perform a root cause investigation, read about our 2-Day TapRooT® Incident Investigation and Root Cause Analysis Course.
Found an interesting old (2000) report from the UK HSE about incident/accident investigations. They had a contractor perform surveys about accident/incident investigation tools and results.
TYPE OF INVESTIGATION SYSTEM
It seems that homegrown investigation systems or no system were the most frequently used to investigate accidents/incidents.
With that type of investigation system, it should be no surprise that the three top corrective actions were:
- Tell them to be more careful/aware.
- Training/refresher training
- Reinforce safe behavior (Is that discipline?)
That’s what we found back in the early 1990’s.
Think it has changed any today?
HOW MUCH TIME SPENT INVESTIGATING?
Another interesting fact. How long did people typically spend doing investigations?
- 42% took 5 hours or less
- 35% took 5 to 20 hours
- 18% took over 20 hours
One third of those polled had NO accident/incident investigation training. Most of the rest just had general health and safety training as part of IOSH or NEBOSH courses. Also, most people performing investigations were not dedicated health and safety professionals.
What do you think? Is this similar to your experience at your company?
The report then provided a review of example investigations that the researchers had reviewed. As an expert in root cause analysis, I found these awful but typical. Many just filled out a form. Others grilled people and decided what they thought were the causes and the corrective actions.
HOW ARE YOU DOING?
Are you 15 years behind with no system, no training, and bad results?
Then you need to attend a TapRooT® Course. See: http://www.taproot.com/courses
Have you started to improve but still have a long way to go? You might want to attend one of our public 5-Day TapRooT® Advanced Root Cause Analysis Team Leader Training Courses. See: http://www.taproot.com/store/5-Day-Courses/
Are you good to great but you want to be even better? Then plan to attend the 2016 Global TapRooT® Summit in San Antonio, TX, on August 1-5. We’ll be posting more details about the Summit soon.
Jack Frost discusses an idea to improve investigations he learned during Mark Paradies’ best practice session at the 2015 Global TapRooT® Summit.
DOWNLOAD the free tool he refers to HERE.
And make plans to join us at the next Global TapRooT® Summit, August 1 – 5, 2016 in San Antonio, Texas!
One of the first dilemmas facing any investigator is deciding what data you need and what you have to have. There are many theories on this topic, but one rule of thumb I like to use is to work your way around the target.
In the 7-Step Process Flow, Steps 1 and 2 comprise the “What” portion of the investigation, where we begin the process of trying to understand what our Incident is, and what led up to and followed it. During this time we work with our SnapCharT® to aid us in organizing and understanding the data we collect.
Data comes in many forms including our 3P’s + R: People, Paper, Plant, and Recordings. All of these forms of data are important to helping us understand the initiation and the genesis of an incident. These different data types fit together to weave a picture of the incident and provide us with the basis for our analyses moving forward. For most, the first place we start is with anyone involved with the incident to get their firsthand accounts of “What” happened and “Why”. By taking this very simplistic approach we can begin the process of separating truth from fiction and fact from opinion. But this is only one subset of people we must interview to fully understand an incident.
Working your way around the target means thinking about anyone who might have influence on an incident and making sure we interview and gather all perspectives.
Think of the incident/accident as a target, with concentric rings moving outward from the center. Each ring further from the center has less direct knowledge but can provide very valuable perspective. We need to understand everything surrounding the incident to fully evaluate for and understand the root causes. So there are various levels of knowledge and influence that we should consider.
Inner Circle – Those Involved: These people will give you your best starting point and the most information directly related to the incident/accident. Most direct knowledge will be found here.
Second Circle – Those Around or Near the Incident: This group can provide interesting information that might help you piece together what you know. They may or may not have any direct knowledge but can provide things such as what was heard, felt, smelled, tasted and sensed. Much of this can be used by the investigator to understand information provided by those involved and can many times provide very simple yet important pieces to the investigative puzzle.
Third Circle – Subject Matter Experts (SMEs): When trying to understand a process or failure, SMEs provide an invaluable resource. Their knowledge can allow the investigator to understand successful performance and what should have happened. This can be used to understand both Equipment and Human Performance related failures. By understanding proper performance the investigator can more easily identify where potential failures exist. By understanding how processes or systems fail we can more easily identify Causal Factors.
Fourth “Dotted” Circle – Those with Influence on the Incident: Working with many investigative teams I have found that many facilitators only consider those members of management with direct involvement in the incident. That could include a direct line supervisor or a local area manager. By approaching this group in this manner the investigator can lose sight of a very important piece of work culture. That missing piece is the “Expectation”. What is communicated and what is expected can many times be in opposition and create confusion or problems in the workplace. I believe that to fully understand an incident and its causes the investigator should reach out to differing layers of management and talk about what the “Expectation” is for performance of whatever jobs are involved. This “Expectation” cannot be used as fact but can aid in the understanding of decisions made and actions taken by those involved in the incident.
If the investigator works their way around the target and ensures that these different perspectives are understood, they will have a better understanding of the incident and can perform a more thorough and complete root cause analysis.
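The four circles can also be written down as a simple interview checklist. The sketch below is only an illustration: the names `Circle`, `Interviewee`, and `plan_interviews`, the example people, and the choice to order interviews from the inner circle outward are all assumptions made for this example, not part of TapRooT®.

```python
from dataclasses import dataclass
from enum import IntEnum


class Circle(IntEnum):
    INVOLVED = 1    # inner circle: those involved
    NEARBY = 2      # second circle: those around or near the incident
    SME = 3         # third circle: subject matter experts
    INFLUENCE = 4   # fourth "dotted" circle: those with influence


@dataclass(frozen=True)
class Interviewee:
    name: str
    circle: Circle


def plan_interviews(people):
    """Return (people ordered inner circle first, circles nobody covers)."""
    ordered = sorted(people, key=lambda p: p.circle)
    covered = {p.circle for p in people}
    missing = [c for c in Circle if c not in covered]
    return ordered, missing


# Hypothetical interview list for one incident.
people = [
    Interviewee("area manager", Circle.INFLUENCE),
    Interviewee("operator on shift", Circle.INVOLVED),
    Interviewee("co-worker in next bay", Circle.NEARBY),
]
ordered, missing = plan_interviews(people)
print([p.name for p in ordered])
print([c.name for c in missing])
```

In this made-up case the second list flags that no subject matter expert has been lined up yet, which is the kind of gap the target picture is meant to expose.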
To learn more about interviewing techniques, register for our 5-Day TapRooT® Advanced Root Cause Analysis Team Leader Training.
This is not the Friday Joke.
Root cause analysis has become so popular that politicians are now calling for companies to complete a root cause analysis and implement corrective actions.
Massachusetts Governor Charlie Baker wrote a letter to Entergy Nuclear Operations calling on them to “… perform an appropriate root cause analysis …” of safety issues the NRC had announced “… and to complete all necessary repairs and corrective actions.”
The letter was in response to an unplanned shutdown at the Pilgrim nuclear power plant in Plymouth, Massachusetts caused by a malfunctioning main steam stop valve (one of eight valves that is designed to shut off steam from the reactor to the turbine that generates electricity). The valve had failed shut.
For all those not in the nuclear industry, note that there, the failure of one of eight valves, in the safe direction (shut) and with backup safety systems (both manual and automatic), can get a public letter from the Governor and attention from a federal regulator. Imagine if you had this level of safety oversight of your systems. Would your equipment reliability programs pass muster?
The response from Entergy to the Governor noted that, “We have made changes and equipment upgrades that have already resulted in positive enhancements to operational reliability.” (Note that these fixes occurred in less than a week after the original mechanical failure.)
For more about the story, see: http://www.wbur.org/2015/09/03/baker-pilgrim-nuclear
Note the local NPR story at the link above is inaccurate in its description of the problem and the mechanical systems.
For those interested in improving equipment reliability and root cause analysis, consider attending one of our 3-Day TapRooT®/Equifactor® Equipment Troubleshooting and Root Cause Analysis Courses. See the upcoming course list at:
Now for the biggest question …
When will government authorities start applying root cause analysis
to the myriad of problems we face as a nation and start implementing appropriate corrective actions?
What is the minimum investigation for a simple incident?
Before you can answer this question, you need to decide the outcome you are looking for. For example:
- Do you just want to document the facts?
- Would you be happy with a simple corrective action that may (or may not) be effective?
- Do you need effective corrective actions to prevent repeats of this specific incident?
- Do you want to prevent similar types of incidents?
The answers to these questions depend on two factors that determine risk:
- What were the consequences of this incident and could things have happened slightly differently and had much worse consequences?
- What is the likelihood that this type of incident will happen again?
Of course, before you start an investigation, answering these two questions may be difficult. Before you start an investigation, you don’t really know what happened! But in spite of this lack of knowledge, someone must decide if an incident is worth investigating and the resources to dedicate to the investigation.
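One way to make that up-front decision explicit is a simple screening rule over the two risk factors above. The sketch below is only an illustration: the 1-to-5 rating scales, the thresholds, and the names `InvestigationLevel` and `triage` are assumptions made for this example, not a TapRooT® rule.

```python
from enum import Enum


class InvestigationLevel(Enum):
    RECORD_FACTS = "record the facts for trending"
    INITIAL_CHART = "build an initial chart and reassess"
    FULL_INVESTIGATION = "perform a detailed investigation"


def triage(actual: int, potential: int, likelihood: int) -> InvestigationLevel:
    """Pick an investigation level from rough 1-5 ratings (hypothetical scale).

    actual     -- how bad the consequences were
    potential  -- how bad they could have been had things gone slightly differently
    likelihood -- how likely this type of incident is to happen again
    """
    ratings = (("actual", actual), ("potential", potential), ("likelihood", likelihood))
    for name, value in ratings:
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")

    worst = max(actual, potential)  # judge by the worst credible outcome
    score = worst * likelihood      # simple consequence x likelihood product

    # Thresholds are arbitrary placeholders; tune them to your own risk matrix.
    if worst >= 4 or score >= 12:
        return InvestigationLevel.FULL_INVESTIGATION
    if score >= 4:
        return InvestigationLevel.INITIAL_CHART
    return InvestigationLevel.RECORD_FACTS
```

Because the ratings are guesses made before the facts are known, a rule like this is a starting point for the decision, not a substitute for judgment.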
I’ve seen simple incidents that, when investigated, revealed complex problems that could have caused a serious accident. Therefore, if a thorough investigation is not performed, the investigator may never know what they could have discovered. That’s why I caution management that something that seems simple may not be simple.
However, some incidents ARE simple. I’ve seen many incidents that people were investigating that were similar to this one:
An employee stumbles, falls, and sprains
his wrist while walking down a flat sidewalk.
He had on simple shoes with adequate tread.
He was not particularly preoccupied
nor was he entirely paying attention to each step
(just normal walking).
How much can be learned by investigating this incident? Probably not much. I would suggest that even though the person sprained his wrist, this incident should not be investigated beyond a simple recording of the facts so that the incident could be recorded for safety records (OSHA logs in the USA) and included in future incident trending.
You might ask:
“But what if the employee had stumbled and fallen in front of an oncoming car and the employee was killed?”
In that case, because of the consequences, a detailed major investigation would be required.
In either case, the TapRooT® Root Cause Analysis System could be used to complete the investigation.
The TapRooT® Root Cause Analysis System is a robust, flexible system for analyzing and fixing problems. The complete system can be used to analyze and fix complex accidents, quality problems, hospital sentinel events, and other issues that require a complete understanding of what happened and effective corrective actions.
I’m in the process of writing a new set of TapRooT® Books. The first one I’m writing is about investigating simple incidents using the basic tools of TapRooT®.
To give you a sneak preview, if you decide to investigate an incident, the minimum technique to use is a SnapCharT®.
From the initial SnapCharT®, the investigator must decide if the incident is worthy of further effort (can something worthwhile be learned).
What’s next? What do you do if you decide to go beyond the initial SnapCharT®?
You will have to wait for the new book to be released early next year to find out what we are recommending. But I can give you a hint ... It won’t be asking why five times!
A Look at 3 Popular Quick Idea Based Root Cause Analysis Techniques: 5-Whys, Fishbone Diagrams and Brainstorming
August 26th, 2015 by Chris Vallee
Today’s root cause tip will walk through a few popular quick-idea based root cause analysis techniques used by many.
Do a quick search using Google or Yahoo search engines for “Root Cause Analysis Training” and these techniques often pop up in your internet browser: 5-Whys, Fishbone (Ishikawa) Diagrams, Brainstorming and of course, TapRooT® Root Cause Analysis. Now type in “free” or “quick root cause analysis templates” and you will not find TapRooT®. Is that good or bad? Of course my dad always taught me that what is earned and worked for was always more satisfying and led to a stronger sense of accomplishment. The end product also lasted longer.
Why would a person search for root cause analysis training on the Internet? If I were to brainstorm the whys as defined in dictionary.reference.com:
- a sudden impulse, idea, etc.
- a fit of mental confusion or excitement.
- 1890-95; brain + storm; originally a severe mental disturbance
Then I might suggest the following “whys”:
- The person was bored.
- A student was doing research.
- A training department was assigned to find and schedule quick low cost training techniques that can be taught online.
- You were assigned to find good root cause training to solve problems.
Now those weren’t too many suggestions on my part. But there is hope, because brainstorming is best served in groups. As defined in wikipedia.org:
Brainstorming is a group creativity technique by which efforts are made to find a conclusion for a specific problem by gathering a list of ideas spontaneously contributed by its members.
But we have to establish a few rules per wikipedia.org:
- Focus on quantity…. The more the merrier.
- Withhold criticism…. No why is a bad why, and criticism might shut down contributions from others who were made fun of.
- Welcome unusual ideas
- Combine and improve ideas… we can build off other peoples’ whys for a really good why to solve a problem.
Okay with our new rules and group in place, we came up with more whys to why someone was searching for root cause analysis on the internet:
- The person was bored.
- A student was doing research.
- A training department was assigned to find and schedule quick low cost training techniques that can be taught online.
- You were assigned to find good root cause training to solve problems.
- The current root cause techniques are not working very well.
- You are planning a party and this would be a great team game. (This one was my favorite suggestion)
Fishbone (Ishikawa) Diagrams
Brainstorming not quite good enough in our quest to solve why people are searching for root cause analysis on the internet you think? Let’s do a guided search for whys with our group using a Fishbone (Ishikawa) Diagram.
- Agree on a problem statement as a group. Ours is “why are people searching for root cause analysis on the internet?”
- The problem statement is placed at the head of the fish as seen in the diagram above.
- Now Brainstorm the major categories of the cause of the problem and list the causes underneath each category. For our fishbone from wikipedia.org, we are going to use Methods, Machines, Materials and Measurements.
a. Methods: How the process is performed and the specific requirements for doing it, such as policies, procedures, rules, regulations and laws
b. Machines: Any equipment, computers, tools, etc. required to accomplish the job
c. Materials: Raw materials, parts, pens, paper, etc. used to produce the final product
d. Measurements: Data generated from the process that are used to evaluate its quality
Caution: there are many categories to choose from, which may lead the group in different directions each time they use one. We could have also used the categories as listed in wikipedia.org:
The 7 P’s
The 5 S’s
Here is our refined fishbone. I have to admit, it does look a little better than the brainstorming list above. Did not take that much time at all.
- As each idea is given, the facilitator writes it as a branch from the appropriate category.
- Again ask “why does this happen?” about each cause.
- Write sub-causes branching off the causes. Continue to ask “Why?” and generate deeper levels of causes. Layers of branches indicate causal relationships.
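The branching described in these steps is a tree: categories at the top, causes under each category, and sub-causes under each cause. The sketch below is a minimal illustration of that structure. The `Cause` class and `render` function are hypothetical names, and the example entries are placeholders borrowed from the 5 Whys walk-through in this article, not the contents of the original diagram.

```python
from dataclasses import dataclass, field


@dataclass
class Cause:
    text: str
    sub_causes: list = field(default_factory=list)

    def why(self, text):
        """Record an answer to "why does this happen?" as a deeper branch."""
        child = Cause(text)
        self.sub_causes.append(child)
        return child


def render(problem, bones):
    """Return the fishbone as indented text; bones maps category -> causes."""
    lines = [f"Problem: {problem}"]

    def walk(cause, depth):
        lines.append("  " * depth + "- " + cause.text)
        for sub in cause.sub_causes:
            walk(sub, depth + 1)

    for category, causes in bones.items():
        lines.append(f"{category}:")
        for cause in causes:
            walk(cause, 1)
    return "\n".join(lines)


# The four categories used in this article; the entries are placeholders.
bones = {"Methods": [], "Machines": [], "Materials": [], "Measurements": []}

old_pcs = Cause("Computers are too old to hold a knowledge database")
old_pcs.why("Company is short on money")
bones["Machines"].append(old_pcs)
bones["Methods"].append(Cause("Boss wants training answers now"))

fishbone = render(
    "Why are people searching for root cause analysis on the internet?", bones
)
print(fishbone)
```

Empty categories still print, which makes it obvious where the group offered no ideas at all.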
Item number 4 gets into looking for causal relationships within our suggested causes which leads into our 5 whys discussion next.
Let’s take one of the “causes” listed above and get to a good root cause with our group to understand why people are searching for root cause analysis on the internet.
Here are the simple instructions for performing a 5 Whys as listed in wikipedia.org:
5 Whys is an iterative question-asking technique used to explore the cause-and-effect relationships underlying a particular problem. The primary goal of the technique is to determine the root cause of a defect or problem by repeating the question “Why?” Each question forms the basis of the next question.
- Why are people searching for root cause analysis on the internet?
Answer: Because there is no database to search in on their computer and the boss wants training answers now.
- Why is there no database on the computer to search from?
Answer: Because these are computers produced in 1995 and a knowledge database cannot be installed.
- Why do we not have new computers that can have databases installed?
Answer: Because the company is short on money.
- Why is there no money left to purchase computers?
Answer: Because we have lost money on repeat incidents.
- Why do we have repeat incidents?
Answer: Because we do not have a good, effective, cost reducing Root Cause Analysis Process. I have a great solution for this problem….. look here for future courses in TapRooT® Root Cause Analysis.
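Written out as data, the chain above shows how linear a 5 Whys is. The sketch below is an illustration only; `WhyStep`, `final_cause`, and `unexplored` are hypothetical names, and the answers are paraphrased from the walk-through. Notice that the first answer gives two reasons, but the second question follows only one of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WhyStep:
    question: str
    answers: tuple      # every distinct reason given in the answer
    followed: int = 0   # index of the one reason the next "why" pursues


chain = [
    WhyStep("Why are people searching for root cause analysis on the internet?",
            ("no database to search on their computer",
             "the boss wants training answers now")),
    WhyStep("Why is there no database on the computer to search from?",
            ("the computers were produced in 1995",)),
    WhyStep("Why do we not have new computers?",
            ("the company is short on money",)),
    WhyStep("Why is there no money left to purchase computers?",
            ("we have lost money on repeat incidents",)),
    WhyStep("Why do we have repeat incidents?",
            ("no good, effective root cause analysis process",)),
]


def final_cause(steps):
    """The single 'root cause' a linear chain ends on."""
    last = steps[-1]
    return last.answers[last.followed]


def unexplored(steps):
    """Reasons that were mentioned but never followed by a later 'why'."""
    dropped = []
    for step in steps:
        dropped.extend(a for i, a in enumerate(step.answers) if i != step.followed)
    return dropped


print(final_cause(chain))
print(unexplored(chain))
```

A list of question-and-answer pairs can only end in one place, so any reason that is not picked up by the next question simply drops out of the analysis.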
Okay, I agree this was a very high level and superficial exploration of the 3 Popular Quick Idea Based Root Cause Analysis Techniques: 5-Whys, Fishbone Diagrams and Brainstorming.
However, the steps that we explored are valid and follow the flow of the actual processes. The poor end results that come from superficially created whys are very real and have been the cause of repeat problem occurrences.
If you are going to use these processes, as they are often still required for everyday issue resolution for some and for others are actually considered their only root cause tools, then head off some of the issues with a couple of these best practice suggestions.
- Never start with Brainstorming. This is a great tool for suggesting corrective actions tied to actual root causes, but should not be used for evidence collection and figuring out why something happened.
What to do instead? Go Out And Look (GOAL). Never armchair troubleshoot from a conference table surrounded by people.
- Only use a Fishbone (Ishikawa) Diagram if:
a. You have collected evidence
b. You standardized and defined your fishbone cause categories
c. You have the right experts in the room
d. Cause or Corrective action ideas do not drive the actual what and why questions.
- Only use 5 Whys for trying to identify the actions or inactions that allowed an issue to occur and not the actual root causes. Why?
a. There is a tendency to look for only one cause when using the process; even if you ask 5 Whys for each action or inaction found on the Fishbone (Ishikawa) Diagram, there is still a tendency to look for only one cause in each section. I have never just had one cause for any problem that I have investigated.
b. It is not how many questions one asks but what one asks.
c. When used to collect evidence or understand evidence, there is a tendency for “group think” to occur that drives which direction the evidence and causation linkage goes. Look up the Space Shuttle issue tied to the o-ring failure for a group think example that was detrimental to life.
d. There is nothing to push the investigators outside what they know as a whole and toward what may be missing from the investigation. In that case, always bring in different knowledgeable people and people new to the problem for constant checks and rechecks. Also look for outside industry best practices and knowledge to help get better investigations completed.
So in closing…..
- If it looks too easy and requires less work, you get what you put in it.
- If there is a large amount of guessing, you are also guessing at the corrective action.
- If the right expert is not in the room when using the tools explored, nobody will know what to ask or to verify.
- If the people using the process are the only thing driving the evidence collection, bias has a stronger natural tendency to take over.
I look forward to your examples of using these processes and also comments on some of the traps you did or did not avoid while using these 3 tools.