Category: Root Causes
For the 25th year, the AFL-CIO has produced a report about the state of safety and health for American workers. The report states that in 2014, 4,821 workers were killed on the job in the U.S., and approximately 50,000 died from occupational diseases. This indicates a loss of 150 workers each day from hazardous conditions.
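The 150-per-day figure can be checked from the two totals quoted above. This is only a quick sketch, and it assumes the report's daily figure is a simple average over a 365-day year:

```python
# Figures quoted from the report summary above.
job_fatalities = 4_821    # workers killed on the job in 2014
disease_deaths = 50_000   # approximate deaths from occupational diseases

# Assumption: the daily figure is a plain average over a 365-day year.
deaths_per_day = (job_fatalities + disease_deaths) / 365

print(f"{deaths_per_day:.1f} deaths per day")  # prints "150.2 deaths per day"
```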
READ the full report.
The modifications have been published in the Federal Register. See:
To see the previous article about the modifications and their impact on root cause analysis, see:
Hurry if you want to submit comments. The register says:
“Comments: Comments and additional material must be received on or before May 13, 2016. Under the Paperwork Reduction Act (PRA), comments on the information collection provisions are best assured of consideration if the Office of Management and Budget (OMB) receives a copy of your comments on or before April 13, 2016. Public Hearing. The EPA will hold a public hearing on this proposed rule on March 29, 2016 in Washington, DC.”
April 13, 2016, isn’t far away!
For comment information, see:
To add your comment, see:
As we focus on patient safety during this week, I thought it prudent to examine one of the more important aspects of providing a safe environment of care for our patients: the use of Root Cause Analysis (RCA) to prevent future events. If we perform a very thorough, objective analysis, we can build corrective and preventative measures that will improve our systems and reduce or remove the chances for future similar events.
In the case study below, we’ll examine a medication error that affected one patient. It could have affected two patients (due to swapped medications) but did not, thanks to the quick response by the treatment team. Learn to better analyze and create a safer environment for our patients, staff, and community.
DOWNLOAD this white paper.
If it is written down, it must be followed. This means it must be correct… right?
Lack of compliance discussion triggers that I see often are:
- Defective products or services
- Audit findings
- Rework and scrap
So the next questions that I often ask when compliance is “apparent” are:
- Do these defects happen when standards, policies and administrative controls are in place and followed?
- What were the root causes for the audit findings?
- What were the root causes for the rework and scrap?
In a purely compliance-driven company, I often hear these answers:
- It was a complacency issue
- The employees were transferred…. Sometimes right out the door
- Employee was retrained and the other employees were reminded on why it is important to do the job as required.
So is compliance in itself a bad thing? No, but compliance to poor processes just means poor output always.
Should employees be able to question current standards, policies and administrative controls? Yes, at the proper time and in the right manner. Please note that in cases of emergencies and process work stop requests, the time is most likely now.
What are some options for removing the blinders of pure compliance?
GOAL (Go Out And Look)
- Evaluate your training and make sure it matches the needs of the workers and the task at hand. Many compliance issues start with forcing policies downward without GOAL from the bottom up.
- Don’t just check off the audit checklist for compliance’s sake, GOAL
- Immerse yourself with people that share your belief to Do the Right thing, not just the written thing.
- Learn how to evaluate your own process without the pure Compliance Glasses on.
If you see yourself acting on the suggestions above, this would be a perfect Compliance Awareness Trigger to join us at our 2016 TapRooT® Summit week August 1-5 in San Antonio, Texas.
Here’s the press report about an incident at a west coast refinery …
They think that someone working in the area accidentally hit a button that shut down fuel to a boiler. That caused a major portion of the refinery to shut down.
At least one Causal Factor for this incident would be “Worker accidentally hits button with elbow.”
If you were analyzing this Causal Factor using the Root Cause Tree®, where would you go?
Of course, it would be a Human Performance Difficulty.
When you reviewed The Human Performance Troubleshooting Guide, you would answer “Yes” to question 5:
“Were displays, alarms, controls, tools, or equipment identified or operated improperly?”
That would lead you to evaluating the equipment’s Human Engineering.
Under the Human-Machine Interface Basic Cause Category, you would identify the “controls need improvement” root cause because you would answer “Yes” to the Root Cause Tree® Dictionary question:
“Did controls need mistake-proofing to prevent unintentional or incorrect actuation?”
That’s just one root cause for one Causal Factor. How many other Causal Factors were there? It’s hard to tell with the level of detail provided by the article. I would guess there was at least one more, and maybe several (there usually should be for an incident of this magnitude).
At least one of the corrective actions by the refinery management was to initially put a guard on the button. Later, the button was removed to eliminate the chance for human error.
Are there more human-machine interface problems at this refinery? Are they checking for them to look for Generic Causes? You can’t tell from the article.
Would you like to learn more about understanding human errors and advanced root cause analysis? Then you should attend the 5-Day TapRooT® Advanced Root Cause Analysis Team Leader Training. See public course dates at:
And click on the link for the continent where you would like to attend the training.
President Obama issued Executive Order 13650 that directed agencies to improve chemical safety performance. In response, the EPA is proposing changes to the RMP (Risk Management Plan) regulation. A preliminary copy of the changes has been published HERE (they have not yet been published in the Federal Register).
For readers interested in root cause analysis, the main changes start on page 28 in the Incident Investigation and Accident History Requirements section.
The revision to the regulation actually mentions “causal factors” and “root causes” that were not mentioned in the previous regulation. On page 33 the revision states:
Thus EPA is proposing to require a root cause analysis to ensure that facilities determine
the underlying causes of an incident to reduce or eliminate the potential for additional accidents
resulting from deficiencies of the same process safety management system.
The EPA document uses the following definition of a root cause:
Root cause means a fundamental, underlying, system-related reason
why an incident occurred that identifies a correctable failure(s) in management systems.
The revision document gives examples of poor investigations of near-miss accidents that did not get to root causes, so that a future accident involving a fatality or severe injuries occurred. These examples include an explosion and fire at a Tosco refinery, an explosion at a Georgia-Pacific Resins facility, an explosion and fire at a Shell olefins plant, and a runaway reaction at a Morton International chemical plant. In each case, root causes of issues were not identified and fixed, and this allowed a more serious accident to eventually occur.
Of course, I have said many times that I’ve never seen a major accident that didn’t have precursor incidents (call them near-misses if you must). Performing adequate root cause analysis of smaller incidents has always been one of the goals that we have suggested to TapRooT® Users and now even more fully support with the new Using the Essential TapRooT® Techniques to Investigate Low-to-Medium Risk Incidents book.
The document asks for comments on the proposed revision to the regulation (page 41):
- EPA seeks comment on whether a root cause analysis is appropriate for every RMP reportable accident and near miss.
- Should EPA eliminate the root cause analysis, or revise to limit or increase the scope or applicability of the root cause analysis requirement?
- If so, how should EPA revise the scope or applicability of this proposed requirement?
- EPA also seeks comment on proposed amendments to require consideration of incident investigation findings, in the hazard review (§ 68.50) and PHA (§ 68.67) requirements.
- Finally, EPA seeks comment on the proposed additional requirement in § 68.60 to require personnel with appropriate knowledge of the facility process and knowledge and experience in incident investigation techniques to participate on an incident investigation team.
In the document, there is extensive discussion about defining and investigating near-misses. The section ends with …
- EPA seeks comment on the guidance and examples provided of a near miss.
- Is further clarification needed in this instance?
- Should EPA consider limiting root cause analyses only for incidents that resulted in a catastrophic release?
The document also discusses time frames for completing investigations. Should it be 30 days, 60 days, six months? It’s interesting to note that many investigations of process safety incidents by the US Chemical Safety Board take years. The EPA is suggesting that a one-year time limitation (with the possibility of a written extension granted by the EPA) be the specified time limit.
The EPA is asking for feedback on this time limit:
- EPA seeks comment on whether to add this condition to the incident investigation requirements or whether there are other options to ensure that unsafe conditions that led to the incident are addressed before a process is re-started.
- EPA also seeks comment on whether the different root cause analysis timeframes specified under the MACT and NSPS and proposed herein will cause any difficulties for sources covered under both rules, and if so, what approach EPA should take to resolve this issue.
The document also discusses reporting of root cause information to the EPA and suggests that common “categories” of root causes be reported to the EPA. The document even references an old (1996) version of the TapRooT® Root Cause Tree® and a potential list of root cause categories. They then request comments:
- EPA seeks comment on the appropriateness of requiring root cause reporting as part of the accident history requirements of § 68.42, as well as the categories that should be considered and the timeframe within which the root cause information must be submitted.
Although I am flattered to be the “father” of this idea that root causes should be reported so that they may be learned from, I’m also concerned that people may think that simply selecting from a list of root causes is root cause analysis. Also, I’ve seen many lists of root causes that had bad categorization. The main problem is what I would call “blame” categorization. I’m not sure if the EPA would recognize the importance of the structure and limits that need to be enforced to have a good categorization system. (Many consultants don’t understand this, why should the EPA?)
As everyone who reads the Root Cause Analysis Blog knows, I am always preaching the enhanced use of root cause analysis to improve safety, process safety, patient safety, quality, equipment reliability, and operations. But I am hesitant to jump aboard a bandwagon to write federal regulations that require good management. Yes, I understand that lives are at stake. But every time a government regulation is written, it seems to cement a certain protocol and discourage progress. Imagine all the improvements we have made to TapRooT® since 1996. Would that progress be halted because the EPA cements the “categorization” of root causes in 1996? Or even worse… what if the EPA’s categories include “blame” categories and managers all over the chemical industry start telling investigators to stop looking for other system causes and find blame-related root causes? It could happen.
I would suggest that readers watch for the publication of EPA’s revision of the RMP in the Federal Register and get their comments in on the topics listed above. You can’t blame the EPA for making bad regulations if you don’t take the opportunity to comment when the comments are requested.
Monday Accident & Lessons Learned: Sure Looks Like an Equipment Failure … But What is the Root Cause?
February 25th, 2016 by Mark Paradies
When you look up in the air and this is what you see … it sure looks like an equipment failure. But what is the root cause?
That’s what DTE Energy will be looking into when they investigate this failure.
How do you go beyond “It broke!” and find how and why an equipment failure occurred? We recommend using techniques developed by equipment expert Heinz Bloch and embedded in the Equifactor® Module of the TapRooT® Software.
For more information about the software and training, see:
“The actor, Harrison Ford, was struck by a hydraulic metal door on the Pinewood set of the Millennium Falcon in June 2014.”
“The Health And Safety Executive has brought four criminal charges against Foodles Production (UK) Ltd – a subsidiary of Disney.”
“Foodles Production said it was “disappointed” by the HSE’s decision.”
Read more here
While reading Sentinel Event Alert 55 (SEA-55) from TJC, issued September 28, 2015, on Fall Prevention, it occurred to me that TapRooT® can be used to aid in finding the root causes of a fall. Even more importantly, TapRooT® can be used to aid in maintaining your fall prevention program to ensure long-term success. The TJC lists the following common contributing factors (in TapRooT® these would be called “Causal Factors”):
- Inadequate assessments
- Communication Failures
- Lack of adherence to protocols and safety practices
- Inadequate staff orientation, supervision, staffing levels and skill mix
- Deficiencies in the physical environment
- Lack of Leadership
While these are good guidelines for what to look for and what data to gather, to us these do not represent root causes. These 6 items almost match up with most of the 7 Basic Categories on the back of our Root Cause Tree®. So as TapRooT® investigators, you know you have to dig a bit deeper to find the true causes and define those at the Root Cause level, not at the causal or contributing level.
All this being said, the more important reason I wanted to write this article is to highlight the use of your TapRooT® tools for Proactive measures: how to examine and improve your fall management program and maintain continued success. Too many times we don’t think about the power of observation and the idea of raising awareness through communication. Each of these can be highlighted through the Proactive Process Flow below:
In SEA-55, two of the actions suggested by TJC were to 1) Lead an effort to raise awareness of the need to prevent falls resulting in injury and 2) Use a standardized, validated tool to identify risk factors for falls. These two items can benefit from the TapRooT® tools directly.
Starting with step 1 above in the Proactive Flow, use the SnapCharT® tool to outline the steps in patient assessment, highlight the steps that can or will affect the fall prevention portion of patient care, then use this flow as the basis for an observation program. By getting out and observing actual performance in the field you can do two things: show your concern for patient safety (and falls in this case) and gather actual performance data. These observations can be performed in a scheduled and/or random fashion and can be done in any setting (ambulatory, non-ambulatory, clinic, et cetera).
During the observation, document findings on the SnapCharT® and identify potential “Significant Issues” as they apply to fall prevention. This data can then be either evaluated using the Root Cause Tree® to define the areas of need for that single observation, or combined with other fall prevention observation data for use in an aggregate analysis or common cause analysis. With the aggregate analysis, data from multiple observations can be combined, and “Significant Issues” can be identified based on multiple observations before an analysis using the Root Cause Tree® is performed. This could give you a bigger-picture view of your processes.
Once the RCA is performed (in either situation), Steps 5-7 can be simply followed to produce some recommended actions to be implemented and measured using Corrective Action Helper® and SMARTER. And the beauty of this Proactive process is that you have not waited for a fall to learn. You and your organization are preventing future issues before they manifest thus showing your patients and staff that you truly care about their safety.
If you would like to learn more about using your TapRooT® tools proactively you can contact me at Skompski@taproot.com for more information or you can attend any of our public seminars, 2-day or 5-day to learn more on both the reactive and proactive use of the TapRooT® tools!
- Training. Retrain everyone, not just those involved.
- Policies/Procedures. Write new policies or procedures or make the current ones longer.
- Discipline. Send a message to everyone else that a behavior is unacceptable whether or not there is fault.
When these are the standard actions, many times we have recurrence of events. I am not saying these actions can’t work, but many times if they are default answers it is much like putting a round peg in a square hole.
In this article a hospital in Hong Kong presents an overview of their findings and recommended actions to a Sentinel Event at the hospital. Review the Corrective Actions and ask these two questions:
1. Do they meet the needs of the system based on the findings?
2. Do you see a correlation with our three standard corrective actions above?
Maybe there is a pattern… let us know your thoughts.
Here is a link to the significant incident report:
It seems from the report that the appropriate seat belt was present. Therefore the only applicable action in the “Action required” section is:
“Workers should be instructed, through training and inductions, regarding the importance of using the seatbelts provided in vehicles to reduce the impact of potential collisions.”
In my instant root cause analysis using the Root Cause Tree®, I wonder why there wasn’t a Standards, Policies, and Administrative Controls Not Used Near Root Cause. That would get me to dig more deeply into the Enforcement NI root cause.
What do you think? Was this a training root cause that needs a training corrective action?
Leave your comments below…
Once you’ve gathered all the information you need for a TapRooT® investigation, you’re ready to start with the actual root cause analysis. However, it would be cumbersome to analyze the whole incident at once (like most systems expect you to do). Therefore, we break our investigation information into logical groups of information, called Causal Factor groups. So the first step here is to find Causal Factors.
Remember, a Causal Factor is nothing more than a mistake or an equipment failure that, if corrected, could have prevented the incident from happening (or at least made it less severe). So we’re looking for these mistakes or failures on our SnapCharT®. They often pop right off the page at you, but sometimes you need to look a little harder. One way to make Causal Factor identification easier is to think of these mistakes as failed or inappropriately applied Safeguards. Therefore, we can use a Safeguard Analysis to identify our Causal Factors.
There are just a few steps required to do this:
First, identify your Hazards, your Targets, and any Safeguards that were there, or should have been there.
Now, look for:
- an error that allowed a Hazard that shouldn’t have been there, or was larger than it should have been;
- an error that allowed a Safeguard to be missing;
- an error that allowed a Safeguard to fail;
- an error that allowed the Target to get too close to a Hazard; or
- an error that allowed the Incident to become worse after it occurred.
These errors are most likely your Causal Factors.
Let’s look at an example. It’s actually not a full Incident, but a VERY near miss. This video is a little scary!
Let’s say we’ve collected all of our evidence, and the following SnapCharT® is what we’ve found. NOTE: THIS IS NOT A REAL INVESTIGATION! I’m sure there is a LOT more info that I would normally gather, but let’s use this as an example on how to find Causal Factors. We’ll assume this is all the information we need here.
Now, we can identify the Hazards, Targets, and Safeguards:
|Pedestrians (they could have stayed off the tracks)|
Using the error questions above, we can see that:
- An error allowed the Hazard to be too large (the train was speeding)
- An error allowed the Targets to get too close to the Hazard (the Pedestrians decided to go through the fence, putting them almost in contact with the Hazard)
These 2 errors are our Causal Factors, and would be identified like this:
We can now move on to our root cause analysis to understand the human performance factors that led to this nearly tragic Incident.
Causal Factors are an important tool that allows TapRooT® to quickly and accurately identify the root causes of Incidents. Using Safeguard Analysis can make finding Causal Factors much simpler.
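The Safeguard Analysis screening described above can be sketched in a few lines of Python. This is only an illustration, not part of the TapRooT® Software: the `ErrorType` and `Event` names are hypothetical, the first event is invented filler, and the other two events come from the train example above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ErrorType(Enum):
    """The five error questions from the Safeguard Analysis list."""
    HAZARD_TOO_LARGE = "Hazard shouldn't have been there, or was too large"
    SAFEGUARD_MISSING = "Safeguard was missing"
    SAFEGUARD_FAILED = "Safeguard failed"
    TARGET_TOO_CLOSE = "Target got too close to the Hazard"
    INCIDENT_WORSENED = "Incident became worse after it occurred"


@dataclass
class Event:
    """One event from a timeline (hypothetical structure)."""
    description: str
    error_type: Optional[ErrorType] = None  # None: no error question applies


def find_causal_factors(events):
    """Return the events that match at least one error question."""
    return [e for e in events if e.error_type is not None]


events = [
    Event("Train approaches the crossing"),  # invented filler event
    Event("Train was speeding", ErrorType.HAZARD_TOO_LARGE),
    Event("Pedestrians went through the fence", ErrorType.TARGET_TOO_CLOSE),
]

for factor in find_causal_factors(events):
    print(f"Causal Factor: {factor.description} ({factor.error_type.value})")
```

The judgment of which error question applies to an event still belongs to the investigator; the code only filters on that judgment.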
Sign up to receive tips like these in your inbox every Tuesday. Email Barb at firstname.lastname@example.org and ask her to subscribe you to the TapRooT® Friends & Experts eNewsletter – a great resource for refreshing your TapRooT® skills and career development.
I know how this works. You get the notification that “something bad” happened, and you are assigned to perform a root cause analysis. Your initial reaction is, “There goes the rest of my week!”
However, there is no reason that a relatively simple analysis needs to take an inordinate amount of time. There are several things you can do to make sure that you can efficiently conduct the investigation, find solid root causes, and implement effective corrective actions. Here are a few ideas to help you make the process as smooth as possible.
1. The first thing that needs to be in place is a Detailed Investigation Policy for your company. When does an RCA need to be performed? What types of problems trigger an RCA? What is the decision-making chain of command? Who makes the notifications? Who is notified? Who will be on the team? All of these questions need to be easily answered in order to quickly get the process started. I have seen investigators receive notification of a problem over a week after the actual incident. By this time, evidence has been lost, key players are no longer available, and people’s memories have faded. All of this makes the investigation just that much harder. If you can streamline this initial decision-making and notification process so that the investigation can start within hours, you’ll find the actual investigation goes MUCH more smoothly.
2. Probably the biggest timesaver is to Be Proficient in the TapRooT® Process. We recommend you use TapRooT® at least once per month to maintain proficiency in the system. You can’t be good at anything if you only use it sparingly. I often hear people tell me, “Luckily, we don’t have enough incidents to use TapRooT® more than once per year.” Imagine if I asked you to put together an Excel spreadsheet using pivot tables, and you haven’t opened Excel since 2014! You’d have to relearn some key concepts, slowing you down. The same is true of an investigation process. If you only do an investigation once each year, you aren’t looking very hard for incidents. I’ll guarantee there are plenty of things that need to be analyzed. Each analysis makes you that much better at the process. Maybe go back to point #1 above and update your investigation trigger points.
3. When you actually get started on an investigation, the first thing you should do is Start A Spring SnapCharT®. This initial chart gets your investigator juices flowing. It helps you think about the timeline of the incident, identifying holes in your knowledge and questions you need to ask in order to fill those holes. It is the first step in the process. As soon as you get that initial phone call, start building your SnapCharT®!
4. Finally, although it is optional, The TapRooT® Software can really speed up your analysis. The SnapCharT® tool is extremely user friendly, and the Root Cause Dictionary is only a right-click away. It guides you through the investigation process so you don’t have to try to remember where you’re going.
You won’t perform an investigation in 5 minutes. However, by following these tips, you can move through the process relatively quickly and efficiently, with terrific results.
To learn more about learning all of the essential techniques to perform a root cause investigation, read about our 2-Day TapRooT® Incident Investigation and Root Cause Analysis Course.
People are often surprised when they learn the reasons they haven’t taken root cause analysis training are invalid. Here are the top three excuses people give that are wrong:
1. Most employers aren’t seeking that skill when hiring.
Root cause analysis is a top skill valued by employers because mistakes don’t “just happen” but can be traced to well-defined causal factors that can be corrected. A bonus to root cause analysis training is that root causes identified over time across multiple occurrences can be used for proactive improvement. For example, if a significant number of investigations point to confusing or incomplete SPAC (Standards, Policies, or Admin Controls), improvement of this management system can begin. Trending of root causes allows development of systematic improvements as well as evaluation of the impact of corrective actions. What boss doesn’t appreciate an employee who can prevent HUGE problems and losses from occurring? Promoting your root cause analysis skills is an impressive topic of conversation on any job interview.
2. It takes too long to learn enough to really use it on my job.
In just 2 days you can learn all of the essentials to conduct a root cause analysis and add this impressive skill to your resume. You will be equipped to find and fix the root causes of incidents, accidents, quality problems, near-misses, operational errors, hospital sentinel events and other types of problems. The essential TapRooT® Techniques include:
- SnapCharT® – a simple, visual technique for collecting and organizing information to understand what happened.
- Root Cause Tree® – a systematic, repeatable way to find the root causes of human performance and equipment problems — the Root Cause Tree® helps investigators see beyond their current knowledge.
- Corrective Action Helper® – helps lead investigators “outside the box” to develop effective corrective actions.
There are all kinds of training programs you can enroll in for your career development that take months, even years, to complete. A 2-day investment for this valuable training program will equip you with a powerful skill that will set you apart from the rest.
3. I don’t have enough technical knowledge to take training like that.
It doesn’t matter if you have a high school diploma or an MBA. It doesn’t matter if you do not know much about root cause analysis beyond the description provided below. Our attendees, at every level of education and technical skill, find that they can engage in the training and take away root cause analysis skills to implement immediately. It is not a “sit and listen” training – attendees do hands-on exercises to develop their new knowledge in the course.
Root cause analysis is a systematic process used in investigating and fixing the causes of major accidents, everyday incidents, minor near-misses, quality issues, human errors, maintenance problems, medical mistakes, productivity issues, manufacturing mistakes and environmental releases.
Root cause analysis training provides:
- the knowledge to identify what, how and why something happened, and this knowledge is vital to preventing it from happening again.
- the understanding that root causes are identifiable and can be managed with corrective actions.
- an ease of data collection, root cause identification, and corrective action recommendations and implementation.
Still not convinced root cause analysis training is for you?
GUARANTEE for the 2-Day TapRooT® Incident Investigation and Root Cause Analysis Course: Attend this course, go back to work, and use what you have learned to analyze accidents, incidents, near-misses, equipment failures, operating issues, or quality problems. If you don’t find root causes that you previously would have overlooked and if you and your management don’t agree that the corrective actions that you recommend are much more effective, just return your course materials/software and we will refund the entire course fee.
CLICK HERE to register for the 2-Day TapRooT® Incident Investigation and Root Cause Analysis Course.
You have established a good performance improvement program, supported by performing solid incident investigations. Your teams are finding good root causes, and your corrective action program is tracking through to completion. But you still seem to be seeing more repeat issues than you expect. What could be the problem?
We find many companies are doing a great job using TapRooT® to find and correct the root causes discovered during their investigations. But many companies are skipping over the Generic Cause Analysis portion of the investigation process. While fixing the individual root causes is likely to prevent that particular issue from happening again, allowing generic causes to fester can sometimes cause similar problems to pop up in unexpected areas.
6 Reasons to Look for Generic Root Causes
Here are 6 reasons to conduct a generic cause analysis on your investigation results:
1. The same incident occurs again at another facility.
2. Your annual review shows the same root cause from several incident investigations.
3. Your audits show recurrence of the same behavior issues.
4. You apply the same corrective action over and over.
5. Similar incidents occur in different departments.
6. The same Causal Factor keeps showing up.
These indicators point to the need to look deeper for generic causes. These generic issues are allowing similar root causes and causal factors to show up in seemingly unrelated incidents. When management is reviewing incident reports and audit findings, one of your checklist items should be to verify that generic causes were considered and either addressed or verified not to be present. Take a look at your incident review checklist and make sure you are conducting a generic cause analysis during the investigation.
Finding and correcting generic causes is basically a freebie; you’ve already performed the investigation and root cause analysis. There is no reason not to take a few extra minutes and verify that you are fully addressing any generic issues.
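Several of the indicators above amount to counting how often the same root cause appears across investigations. Here is a minimal sketch of that kind of trending, assuming your investigation results can be exported as simple records. The incident IDs, department names, and the `min_count` threshold are all hypothetical, and a recurring root cause is only a prompt to look for a generic cause, not proof of one.

```python
from collections import Counter

# Hypothetical records: (incident id, department, root cause).
investigations = [
    ("INC-01", "Maintenance", "Enforcement NI"),
    ("INC-02", "Operations", "Controls need improvement"),
    ("INC-03", "Shipping", "Enforcement NI"),
    ("INC-04", "Operations", "Enforcement NI"),
]


def recurring_root_causes(records, min_count=2):
    """Return root causes found in at least min_count investigations."""
    counts = Counter(root_cause for _, _, root_cause in records)
    return {cause: n for cause, n in counts.items() if n >= min_count}


def departments_for(records, root_cause):
    """List the departments where a given root cause was found."""
    return sorted({dept for _, dept, cause in records if cause == root_cause})


for cause, n in recurring_root_causes(investigations).items():
    depts = ", ".join(departments_for(investigations, cause))
    print(f"{cause}: {n} investigations across {depts}")
```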
United grounds all of their flights for two hours due to “computer problems” (see the CNBC story).
The NYSE stops trading for over three hours due to an “internal technical issue” (see the CNBC story).
Computer issues can cost companies big bucks and cause public relations headaches. Do you think they should be applying state of the art root cause analysis tools both reactively and proactively to prevent and avoid future problems?
TapRooT® has been used to improve computer reliability and security by performing root cause analysis of computer/IT related events and developing effective corrective actions. The first TapRooT® uses for computer/high reliability network problems were by banking and communication service providers that started using TapRooT® in the late 1990s. The first computer security application of TapRooT® that we knew about was in the early 2000s.
Need to improve your root cause analysis of computer and IT issues? Attend one of our TapRooT® Root Cause Analysis Courses. See the upcoming course schedule at:
The 22-year-old man died in hospital after the accident at a plant in Baunatal, 100km north of Frankfurt. He was working as part of a team of contractors installing the robot when it grabbed him, according to the German car manufacturer. Volkswagen’s Heiko Hillwig said it seemed that human error was to blame.
A worker grabs the wrong thing and often gets asked, “what were you thinking?” A robot picks up the wrong thing and we start looking for root causes.
Read the article below to learn more about the fatality, and ask: why would we not always look for root causes once we identify the actions that occurred?
“Doctor… how do you know that the medicine you prescribed him fixed the problem?” the peer asked. “The patient did not come back,” said the doctor.
No matter the industry, and even if the root causes found for an issue were accurate, the medicine can be worse than the bite. Some companies have a formal Management of Change Process or a Design of Experiment Method that they use when adding new actions. On the other extreme, some use the Trial and Error Method… with a little bit of… this is good enough and they will tell us if it doesn’t work.
You can use the formal methods listed above or, for some risks, it can be as simple as reviewing the action with the right people present before it is implemented. We teach reviewing for unintended consequences during the creation of, and after the implementation of, corrective or preventative actions in our 7 Step TapRooT® Root Cause Analysis Process. This task comes with four basic rules:
1. Remove the risk/hazard or persons from the risk/hazard first if possible. After all, one does not need to train somebody to work safer or provide better tools for the task if the task and hazard are removed completely. (We teach Safeguard Analysis to help with this step.)
2. Have the right people involved throughout the creation of, implementation of, and review of the corrective or preventative action. Identify any person who has impact on the action, owns the action, or will be impacted by the change, including process experts. (Hint: it is okay to use outside sources too.)
3. Never forget or lose sight of why you are implementing a corrective or preventative action. In our analysis process you must identify the action or inaction (behavior of a person, equipment or process) and each behavior’s root causes. It is these root causes that must be fixed or mitigated in order for the behaviors to go away or be changed. Focus is key here!
4. Plan an immediate observation of the change once it is implemented and a long-term audit to ensure the change is sustained.
Simple… yes? Maybe? Feel free to post your examples and thoughts.
We can all remember some type of major product recall that affected us in the past (tires, brakes, medicine….) or recalls that may be impacting us today (air bags). These recalls all have a major theme: a company made something and somebody got hurt or worse. This is a theme of “them versus those” perception.
Now stop and ask, when is the last time quality and safety were discussed as one topic in your current company’s operations?
- You received a defective tool or product….
- You issued a defective tool or product….
- A customer complained….
- A customer was hurt….
Each of the occurrences above often triggers an owner for each type of problem:
- The supplier…
- The vendor…
- The contractor…
- The manufacturer….
- The end user….
Now stop and ask, who would investigate each type of problem? What tools would each group use to investigate? What are their expertise and experiences in investigation, evidence collection, root cause analysis, corrective action development or corrective action implementation?
This is where we create our own internal silos for problem solving; each problem often has its own department as listed in the company’s organizational chart:
- Customer Service (Quality)
- Manufacturing (Quality or Engineering)
- Supplier Management (Supply or Quality)
- EHS (Safety)
- Risk (Quality)
- Compliance (?)
The investigations then take the shape of the tools, training, and experiences of those departments.
Does anyone besides me see a problem or an opportunity here?
On May 5, 1988, one of the United States’ worst oil refinery explosions occurred in Norco, Louisiana. Six employees were killed and 42 local residents were injured. The blast was said to have reached up to 30 miles away, shattering windows, lifting roofs, and sending a black fog over the entire town of Norco. Residents were forced to evacuate while officials brought the fires under control and gathered as much rubble as possible to recover any bodies. To discover the root cause of this disaster, the federal Occupational Safety and Health Administration and the Environmental Protection Agency investigated the scene to gather information. The only possible root cause they could find was the catalytic cracking unit, a machine used to break down crude oil into gasoline, because it was at the center of the explosion, but no definite cause was found. Overall, the damage cost Shell millions of dollars and instilled an incredible amount of fear in the residents.
I was at a conference yesterday and one of the talks was about advanced root cause analysis. The presenter’s company had their own “home grown” root cause analysis system and they discovered that they were not getting consistent results. Improvement was needed!
They studied their system and discovered something that was missing – management system causes. In the TapRooT® System we have called these “Generic Causes” since we copyrighted the first TapRooT® manual in 1991.
It made me think … Why did they wait 24 years to discover something we’ve known about since before 1991?
Next, I talked with an engineer who had been trained in a common cause and effect system. He wasn’t too pleased with the results he was getting. He wanted to know how TapRooT® could help. Was it different?
I shared how TapRooT® works (see this LINK for the explanation) and it took quite a bit of effort to get beyond the cause and effect model that he thoroughly understood so that he could understand why he was missing things. He was really smart. He asked very insightful questions. He latched onto the reasons that the less systematic cause and effect analysis led to inconsistent results. He saw how TapRooT® could help investigators go beyond their paradigm and get consistent results.
By the end of this second conversation I started thinking … How did we get so far ahead of common root cause systems?
I think I know the answer.
It starts with the Human Factors training that I received at the University of Illinois. It really showed me how to think about human centered design – including designing a root cause analysis system that people could use consistently.
Second, I was fortunate enough to work in the Nuclear Navy where there was an excellent process safety culture and for Du Pont where there was an excellent industrial safety culture. This helped me see how management systems made a difference to performance. (My boss and I at Du Pont actually coined the phrase “Management System” that is now commonly used throughout industry.)
Third, I was well trained by my mentor at the University of Illinois, Dr. Charles O. Hopkins, in how to do applied research. So the research I did studying root cause analysis in the mid-1980s and early 1990s really paid off when we created the TapRooT® System.
Fourth, we had a really good team that brought out the best in each other during the early development.
Next, we were lucky to have some excellent clients in the nuclear, oil, and aviation industries that were great early adopters and provided excellent feedback that we used to quickly improve TapRooT® root cause analysis in the early and mid-1990s.
Finally, I made friends with and/or listened to many industry gurus who were experts in safety, process safety, quality, and equipment reliability. Their influence was built into TapRooT® and helped it be a world-class system even in its early stages. These experts included:
- Jerry Lederer, aviation safety pioneer
- Dr. Charles O. Hopkins, human factors pioneer
- Smoke Price, human factors expert
- Larry Minnick, nuclear safety expert
- Rod Satterfield, nuclear safety expert
- Dr. Alan Swain, human reliability expert
- Heinz Bloch, equipment reliability expert
- Admiral Hyman Rickover, father of the Nuclear Navy and process safety expert
- Dr. Christopher Wickens, human factors expert
- Dr. Jens Rasmussen, system reliability and human factors expert
- W. Edwards Deming, quality management guru
- Admiral Dennis Wilkinson, first CO of the Nautilus and first CEO of INPO
That’s quite a list and I was lucky to be influenced by each of these great men. Their influence made TapRooT® root cause analysis far ahead of any other root cause tool.
So that’s why I shouldn’t be surprised that others are finally catching on to things that we knew 25 years ago. Perhaps in a century, they will catch up with the improvements we are making to TapRooT® today (with the help of thousands of users from around the world).
If you would like to learn the state of the art of root cause analysis and not wait 25 to 100 years to catch up, perhaps you should attend a TapRooT® Course in the next month or two. See our course schedule for upcoming public courses at:
And get information about all the courses we offer at:
And if you would like to learn about the state of the art of performance improvement, attend the 2015 TapRooT® Summit coming up on June 1-5 in Las Vegas. Get more information and download the brochure at:
But don’t wait. Every day you wait you will be another day behind the state of the art in root cause analysis and performance improvement. Don’t be left behind!
TapRooT® Root Cause Analysis
Changing the Way the World Solves Problems
Caution: Watching this Video can and will make you laugh…… then you realize you might be laughing at…
… your own actions.
… your understanding of other people’s actions.
… your past corrective or preventative actions.
Whether your role or passion is in safety, operations, quality, or finance…. “quality is about people and not product.” Interestingly enough, many people have not heard Dr. Deming’s concepts or listened to Dr. Deming talk. Yet his thoughts may help you understand the difference between people not doing their best and the best the process and management will allow to be produced.
To learn more about quality process thoughts and how TapRooT® can integrate with your frontline activities to sustain company performance excellence, join a panel of Best Practice Presenters in our TapRooT® Summit Track 2015 this June in Las Vegas. A Summit Week that reminds you that learning and people are your most vital variables to success and safety.
To learn more about our Summit Track please go to this link. https://www.taproot.com/taproot-summit
If you have trouble getting access to the video, you can also use this link http://youtu.be/mCkTy-RUNbw
There they go again. HUMAN ERROR as a root cause.
Haven’t they read my article at:
Human error is a symptom, not the root cause.
Attend a TapRooT® Course and find out how you can find and fix the real causes of human error.
When using TapRooT®, most of the terms are pretty self-explanatory. TapRooT® is pretty easy to understand and use. However, there are a few terms that we use that may be a little different than those you might be used to. I thought I’d give a few definitions to help make things just a little bit clearer.
Root Cause Tree®: This is the heart of the TapRooT® system. It contains the guidance and the root causes needed by the investigator.
Root Cause Dictionary®: Contains a list of bulleted yes/no questions that guide your investigator through the Root Cause Tree®.
SnapCharT®: This is a visual representation of the investigation. It is used to document the evidence you find during your investigation, allows you to identify Causal Factors, and is used with the Root Cause Tree® during the analysis. It contains the Incident, Event, and Condition shapes.
Incident: This is the reason you are performing the investigation. It is the problem that led you to start your TapRooT® process. It is a circle on your SnapCharT®.
Event: An action performed by someone or a piece of equipment. They are arranged in chronological order as rectangles on the SnapCharT®.
Condition: A piece of information that describes the Event that it is attached to. Represented by an oval on the SnapCharT®.
Root Cause: The absence of best practices or the failure to apply knowledge that would have prevented the problem (or significantly reduced the likelihood or consequences of the problem).
Causal Factor: Mistake or failure that, if corrected, could have prevented the Incident from occurring, or would have significantly mitigated its consequences.
Generic Cause: A systemic problem that allows a root cause to exist.