Category: Root Causes
Monday Accident & Lessons Learned: Root Cause Analysis of San Onofre Nuclear Generating Station, Unit 2 & 3 Replacement Steam Generators
April 1st, 2013 by Mark Paradies
This certainly sounds like an expensive incident, and you would think they would use a state-of-the-art root cause analysis system. Instead, they used cause-and-effect. See the report and see if you agree that the “root causes” of the incident are:
I think the lesson learned is to get a better root cause analysis tool!
Here’s the Merriam-Webster Online Dictionary definition of “behavior”:
1. a : the manner of conducting oneself
b : anything that an organism does involving action and response to stimulation
c : the response of an individual, group, or species to its environment
2 : the way in which someone behaves; also : an instance of such behavior
3 : the way in which something functions or operates
Another definition of “behavior” that I think management has in their heads is:
“Any action or decision that an employee makes that management,
after the fact, decides was wrong.”
Why do I say that management uses this definition? Because I often hear about managers blaming the employee’s bad behavior for an accident.
For example, the employee is hurrying to get a job done and makes a mistake. That’s bad behavior!
What if an employee doesn’t hurry? Well, we yell at them to get going!
And what if they hurry and get the job done without an accident? We reward them for being efficient and a “go-getter.”
Management doesn’t usually see their role in making a “behavior” happen.
Behavior should NEVER be the end of a root cause analysis. Behavior is a fact. Just like a failed engine is a fact when a race car “blows its engine.”
Of course, a good root cause analysis should look into the causes for a behavior (a mistake) and uncover the reasons for the mistake and, if applicable, the controls that management has over behavior and how those controls failed when an accident occurred.
A bad decision or a human error that we call a “behavior” isn’t the end of the investigation … it is just the beginning!
TapRooT® helps investigators go beyond the symptoms (the behaviors) and find the root causes that management can fix. Some of the most difficult behaviors to fix are those so ingrained in the organization that people can’t see any other way to work.
For example, the culture of cost saving/cutting at BP was so ingrained that, even after the explosions and deaths at the Texas City Refinery, BP didn’t (couldn’t?) change its culture – at least not in the Gulf of Mexico exploration division – before they had the Deepwater Horizon accident. At least that is what I see in the reports and testimony that I’ve reviewed after the accident.
And with smaller incidents, it is even harder to get some managers’ attention and show them how they are shaping behavior. But at least TapRooT® tries, by providing guidance for analyzing human errors that leads to true root causes (not just symptoms).
Want to find out more about TapRooT® and behavior? Attend one of our 5-Day TapRooT® Advanced Root Cause Analysis Team Leader Courses. You’ll see how TapRooT® helps you analyze behavior issues in the exercises on the second day of the training. And you will learn much more. For a public 5-Day Course near you see:
Why Do People Have Problems Finding Root Causes? Read this Article – Under Scrutiny – from Quality Progress…
February 25th, 2013 by Mark Paradies
Do you have problems finding the root causes of quality problems, safety incidents, or mechanical failures? It could be because of the root cause analysis tools you have chosen to use. Some tools have inherent weaknesses that are “built in.”
The article attached below (which first appeared in Quality Progress, the flagship magazine of the quality professional society ASQ) explains why some techniques commonly recommended for root cause analysis (like 5 Whys) will cause problems when applied by people in the field.
(click the link above to download the article)
Once you have finished reading about the limitations of 5-Whys and Cause-and-Effect, sign up to learn about the advanced root cause analysis system that was intelligently designed to avoid those problems … TapRooT®.
The best way to keep your Valentine’s Day romantic and fun? Make food safety a priority!
A recent article on StateFoodSafety.com notes that the best restaurant to eat in on Valentine’s Day is a clean one. Here are a few of their food safety tips this Valentine’s Day:
- Take note of the dining area and restrooms. If they do not meet cleanliness standards, it’s probably a good sign that the kitchen is also in need of more than just a light dusting. You might consider eating elsewhere for your own safety.
- Only eat foods that are served to you hot. If the food is served to you at a lukewarm temperature, chances are that it was left sitting for too long and has allowed harmful bacteria to multiply.
- Make sure the staff does not touch your food or the tips of your silverware with their bare hands. It’s probably not a good idea to let them sample your drink either.
- Be wary of meat, eggs, oysters, or other foods that are raw or undercooked.
- Wash your hands properly before and after eating.
Photo courtesy of NPR.
Based on client requests, we have scheduled our ONLY Public India 5-Day TapRooT® Advanced Root Cause Analysis Team Leader Training for April 22 – April 26.
For those not familiar with the course, it includes the TapRooT® single user software (unless the attendee’s company has a network software license), the TapRooT® Book, Corrective Action Helper®, Root Cause Dictionary & Laminated Root Cause Tree, and the Course Workbook.
The course fee, which includes an individual software license for each student, is only $2,395 USD. Here is the registration link: Register
Please register 30 days prior to the course if you need a quote first to send to your billing department. Anything within 30 days or less must be paid for during registration. All course seats must be paid for prior to the course to hold the seat and attend the course.
We look forward to seeing our repeat clients and new clients in our only 5-Day public India course for 2013.
With many industries and natural resources located in Trinidad, System Improvements, Inc. teaches many onsite TapRooT® Root Cause Analysis Courses. In fact, I will be teaching a 3-Day TapRooT®/Equifactor® Equipment Troubleshooting & Root Cause Failure Analysis this November with contract instructor Mark Olson. I will be scheduling another 5-Day Course in Trinidad in the summer of 2013.
We get so busy sometimes in performing root cause analysis facilitations, courses and just plain business, that it is nice to see a reminder about why we want to make all industries safer. Pictured above is Safraz Ali, a student from the Trinidad Course, whom I had the opportunity to meet with his family.
Family, friends and the community are why we love what we do when we get it right!
Valerie Johnson is now certified to teach the TapRooT® 2-Day Course to the ConocoPhillips Aviation Division. Valerie flew in from Alaska to Houston to get trained and upon return will be co-teaching with longtime certified instructor Michael Rodriguez.
As a Senior Associate with System Improvements, Inc. with 18 years in aviation, I found it a pleasure to teach the course in the Aviation Hangar offices. David Camille, also pictured above, was instrumental in coordinating this course and giving me the tour of one of their Gulfstreams.
R.R. Donnelley & Sons Co. prematurely filed Google Inc.’s earnings report with the Securities and Exchange Commission on Thursday. Google’s earnings were supposed to be released after the stock markets closed at 3 p.m. Instead, they showed up on the SEC’s Edgar website about 11:30 a.m. Google’s stock dropped as much as 11 percent, to $676 a share, before trading was halted about 20 minutes later at the company’s request.
About an hour after the earnings release, Google issued a statement blaming Chicago-based R.R. Donnelley for the blunder.
(“Glitch on Google Earnings Report under Investigation,” Chicago Tribune, October 19, 2012.)
Why do people think that blame will stop incidents? Haven’t we tried that already? Don’t the incidents just continue? Share your comments below.
Mark Paradies and Linda Unger attended the Budapest Conference on EHS in Emerging Markets.
Mark gave a talk: Solving Root Cause Analysis Problems by Using Advanced Root Cause Analysis. Here are some pictures of Mark speaking …
Linda talked to prospective TapRooT® Root Cause Analysis System users and explained how they could learn about and implement TapRooT® at their sites across Europe.
TapRooT® Instructor, Michele Lindsay, answers a great question from one of the attendees of the 2-day TapRooT® Root Cause Analysis training course held at the 2012 Global TapRooT® Summit in Las Vegas, Nevada.
Question from attendee:
I am presently using the techniques I learned to conduct my own RCA of the same incident we presented during the class and I had a question:
After grouping the conditions under each causal factor and working your way through the RCA tree on each causal factor, are you to only use the conditions grouped under that particular causal factor or are you allowed to use a condition that was grouped under a different causal factor?
My understanding is that you are to only use the conditions grouped under that specific causal factor and not reach out to other conditions from other “groups” as supporting evidence for the RCA. I found during the class that the practice of using other conditions from other groups as supporting evidence to say either “yes” or “no” was occurring very often and that puzzled/troubled me. In my opinion, if you were allowed to reach for other conditions not grouped under the causal factor in question, this negates the purpose of grouping conditions in the first place. Am I wrong in my understanding of the purpose for grouping conditions?
Answer from Michele:
You are quite right that once conditions are grouped and Causal Factors identified, you really should stay within your “grouping” as you work through the RCA process.
– if the condition was put in under one Causal Factor, but applies better to another, consider moving the condition (the “So What” test helps with this) or put it in both places if it applies in both. Theoretically, if it supports a Root Cause, the condition should be associated with that Causal Factor.
– if one is wandering out of a Causal Factor and “poaching” conditions to support the Causal Factor currently being analyzed, try this check: read the question from the dictionary that you want to answer “yes” to, then add “and that’s why …” (insert the Causal Factor here). If the answer is “yes,” move the condition over. If the answer is “no,” you are wandering off, so don’t select the Root Cause.
If a team does wander off and poach conditions from another Causal Factor, you may see a duplicated Root Cause, selected for the same reason (answering “yes” to the same question), in the analyses of two different Causal Factors. During your sanity check at the end of the process you should catch it.
So the purist in me agrees with your conclusion, but the tool is robust enough to handle good use and weaker use.
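For readers who track their conditions in a spreadsheet or script, the sanity check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Causal Factor names, conditions, and "Root Cause A/B" labels are all hypothetical, and the two helper functions are not part of the TapRooT® software.

```python
from collections import defaultdict

# Hypothetical grouping: the conditions listed under each Causal Factor.
condition_groups = {
    "CF1: Valve left open": {"no checklist at panel", "operator interrupted"},
    "CF2: Alarm not acknowledged": {"alarm flood", "operator interrupted"},
}

# Hypothetical analysis results: (causal factor, root cause, supporting condition).
selections = [
    ("CF1: Valve left open", "Root Cause A", "no checklist at panel"),
    ("CF2: Alarm not acknowledged", "Root Cause A", "no checklist at panel"),
    ("CF2: Alarm not acknowledged", "Root Cause B", "alarm flood"),
]

def find_poached(selections, groups):
    """Return selections whose supporting condition is not grouped under their Causal Factor."""
    return [s for s in selections if s[2] not in groups[s[0]]]

def find_duplicates(selections):
    """Return (root cause, condition) pairs that appear under more than one Causal Factor."""
    seen = defaultdict(set)
    for factor, root_cause, condition in selections:
        seen[(root_cause, condition)].add(factor)
    return {key: sorted(factors) for key, factors in seen.items() if len(factors) > 1}

poached = find_poached(selections, condition_groups)
duplicates = find_duplicates(selections)

print("Poached conditions:", poached)
print("Duplicated root causes:", duplicates)
```

In this made-up data, the second selection borrows a condition that was grouped under CF1, so it shows up both as a poached condition and as a duplicated Root Cause. The team would then either move the condition or drop the Root Cause, as Michele suggests.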
Thanks for sending in the question and answer to share with others, Michele!
What if you had a system with two regular power supplies, two back-up power supplies (diesels), and a battery backup with a separate diesel to keep it charged?
Wow! This should be highly reliable, right?
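Here is a rough sketch of why a design like that looks so good on paper. Every number in it is hypothetical (none comes from the incident report), and the shared-cause model is a deliberately simple assumption of mine: one probability that some single cause takes out every source at once.

```python
import math

def all_fail_independent(probs):
    """Probability that every source is unavailable, assuming failures are independent."""
    return math.prod(probs)

def all_fail_with_common_cause(probs, p_common):
    """Same calculation, plus a shared cause (probability p_common) that disables every source."""
    return p_common + (1 - p_common) * math.prod(probs)

# Hypothetical unavailability of each source: two regular supplies,
# two back-up diesels, and the battery backup.
sources = [0.01, 0.01, 0.05, 0.05, 0.02]

independent = all_fail_independent(sources)
with_common = all_fail_with_common_cause(sources, p_common=0.001)

print(f"Independent failures only: {independent:.1e}")
print(f"With a shared cause:       {with_common:.1e}")
```

With these made-up inputs, the independent-failure figure is tiny, but even a small chance of a shared cause dominates the total. That is the kind of thing a root cause analysis has to look for beyond the individual equipment failures.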
Read about how this system failed here:
Now here’s the question …
What did they miss in their “root cause analysis”?
I think they had great troubleshooting.
They even had actions to address generic problems.
But I don’t think they found the root causes of the “cloud failure” incident.
What do you think? Leave your comments here…
We recently distributed the Root Cause Network Newsletter which included many interesting hot topics:
Major Accident Types that Produce Fatalities on the Job
Errors: Looking for Blame or Opportunities?
Energy Safeguards Target
Fastest Growing LinkedIn Root Cause Analysis Group
Can We Agree on a Worldwide Definition of “Root Cause”?
View and download a copy of the June Root Cause Network Newsletter.
There is an interesting article in the June 2012 edition of Maintenance Technology about Kübler-Ross concepts and management response to root cause analysis reports. You may be familiar with Kübler-Ross’s book, “On Death and Dying,” where she introduced the “Five Stages of Grief” concept. Randall Noon, the author of the Maintenance Technology article, compared these stages to the stages a committee reviewing a root cause analysis report moves through when a serious problem is uncovered. Noon writes:
“The committee typically includes at least some managers whose departments were involved in the adverse event. Some of them may even have made decisions that set up conditions for the event, exacerbated its consequences or directly caused it. Some might have had an opportunity to prevent the event, but didn’t act. Thus, the committee isn’t impartial: It’s like a patient with a stake in his/her doctor’s diagnosis of a serious condition.”
Read the article: Kübler-Ross And Root-Cause Evaluations
Could the initial rejection of a root cause analysis report mean that the committee just needs more time to assimilate the findings on their own terms? Tell us what you think.
Mark Paradies spoke at the IIE Conference about the “7 Secrets of Root Cause Analysis” this week. The Industrial Engineers present were very interested in going beyond common problem solving tools like 5-Whys and Cause and Effect and asked some great questions.
To see the paper the talk is based on, CLICK HERE.
Is your Root Cause Analysis program doing the job? Ever wondered how you stack up to others? Go online now and start the process of benchmarking your program.
The Good, the Bad, the Ugly: a comparative analysis from the creators of TapRooT®.
Where does your Root Cause Analysis program rank? Let us measure your program against hundreds of others. You’ll see how you compare to others in the following areas:
Corrective action effectiveness
Staff knowledge of root cause analysis and performance techniques and more …
Get your FREE comparative analysis now. It only takes minutes to start the process.
Just access our special website to start the process to benchmark your program against others. It’s fast. It’s free. And it’s a valuable way to validate your program or identify areas for improvements.
Take the survey at this link …
Here is an accident report from the BSEE (Bureau of Safety & Environmental Enforcement of the US Department of the Interior):
How many times have you seen similar accidents with unprotected holes on construction sites, oil platforms, or in other locations with work that makes “temporary” openings?
It would seem that anyone supervising work should know better.
Yet the report says that the company blamed the roustabout who fell to his death through the hole because he was, “…distracted by concern for a family issue at home.”
The report says:
“This same story that the accident was caused by a lack of concentration by a distracted Roustabout, was repeated in the initial report to BOEMRE, in interviews by Supervisor, Company Man, and by management of Alliance, and was written into the accident investigation report by Contractor and Operator. The only reason given in statements for this conclusion was that the Roustabout had spoken of it at breakfast and had tried to rearrange his shift to accommodate the family issue.”
OK TapRooT® Users, what do you think? Is “lack of concentration” a root cause? Did the company do a thorough investigation? Could they tell everyone to “be more careful” and resume work as usual? Was the BSEE right to question the adequacy of the contractor and the operator?
Read the report and let me know what you think.
I was reviewing an industry study on the causes of accidents and noticed that fatigue was nowhere on their list. Since other studies where people actually observed performance show that fatigue is a major issue in real world accidents, I wondered why fatigue did not show up on the industry sponsored list.
The easy answer is … If you don’t ASK about fatigue and look into fatigue as a potential cause, you will never find it.
That reminded me of an investigation into a barge crash. The operator couldn’t find a reason why the First Mate had gone “brain dead” and made a totally inappropriate approach to a bend in the river. It was very important to be lined up correctly because the river was running near flood stage and there was little room for error. But once he was lined up wrong, he had little choice. He tried to “power through” the turn and ended up crashing the barges into a bridge after the turn.
One of the questions I asked the investigator was, “Did you consider fatigue?” (The accident happened at about 5 AM and the tug and barges were on the second day of the trip.)
The reply was interesting … the investigator said:
“He was working a standard schedule.”
That seemed to be enough for him to dismiss fatigue as a cause.
I asked, “What is a standard schedule?” The answer: “6 on and 6 off.”
So the first mate would normally work from midnight to 6 AM, have six hours “off” to rest or work, then be back piloting from noon to 6 PM, get off, eat dinner, and go to bed and get back up to work from midnight to 6 AM again.
I asked if he knew if the First Mate had been well rested before starting the journey. The answer? “No, I didn’t ask about that.”
Even after this questioning, the investigator just couldn’t see that fatigue could be a potential cause that should be looked into. After all, the schedule was a standard industry practice.
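The arithmetic behind that “standard” schedule is worth spelling out. The short Python sketch below uses only the watch times described above (midnight to 6 AM and noon to 6 PM). The function name and layout are my own illustration, not part of any fatigue assessment tool.

```python
# Watches on a "6 on and 6 off" schedule, as (start_hour, end_hour) within one day,
# matching the First Mate's schedule described above.
watches = [(0, 6), (12, 18)]

def off_duty_windows(watches, day_hours=24):
    """Return (start_hour, length) for each off-duty gap, wrapping past midnight."""
    ordered = sorted(watches)
    windows = []
    for i, (_, end) in enumerate(ordered):
        next_start = ordered[(i + 1) % len(ordered)][0]
        gap = (next_start - end) % day_hours
        windows.append((end, gap))
    return windows

windows = off_duty_windows(watches)
longest_break = max(gap for _, gap in windows)
total_off = sum(gap for _, gap in windows)

print("Off-duty windows (start hour, hours):", windows)
print("Longest unbroken break:", longest_break, "hours")
print("Total off duty per day:", total_off, "hours")
```

So the schedule never offers more than six hours off in a row, and that break also has to cover meals and anything else done “off” watch. That alone seems like reason enough to ask about fatigue rather than dismiss it.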
That’s one of the reasons that I started adding sessions about fatigue to the TapRooT® Summit.
It’s also one of the reasons that we collaborated with Circadian Technologies to produce a tool for investigators to assess fatigue with a proven diagnostic tool called FACTS (Fatigue Accident/Incident Causation Testing System). (Click on the link to subscribe to the on-line system for free.)
It’s also why I recommend Circadian Technologies seminars on fatigue risk management and shift work scheduling.
If you are interested in learning more about fatigue, there are two seminars coming up that you should consider. The first is “Designing and Implementing an Effective Fatigue Risk Management System” and will be held in Salt Lake City on May 23-24. For more information, see:
The second is “Successfully Expanding from 5- to 7-Day Continuous Operation” and will be held in Chicago, IL on June 13-14. For more information, see:
We should not overlook fatigue as a potential cause. TapRooT® includes a question about fatigue as one of the 15 questions in the Human Performance Troubleshooting Guide on the front of the Root Cause Tree®. So you should consider fatigue for every human error. Ask about fatigue and perform an assessment using FACTS if there seems to be a potential for a fatigue issue. Don’t accept “standard industry practice” as ruling out fatigue as an issue.
Alan Smith (one of our UK Instructors) Presents at the IOSH Conference: “How a Fatal Accident Could Have Been Prevented Proactively By Use of TapRooT®”
March 7th, 2012 by Mark Paradies
What kills more people in the US than industrial accidents, highway accidents, and airline accidents combined?
Mistakes in hospitals.
The technical term for these mistakes is “Sentinel Events.”
Estimates of the deaths caused vary. We use estimates because there are no accurate statistics on the total number of deaths caused by mistakes in hospitals. There is no national reporting requirement.
Even though there is no national reporting requirement, studies show that despite over a decade of effort to stop sentinel events, no progress is being made. Some studies actually show the problem getting worse. And this problem isn’t unique
WHY NO IMPROVEMENT
Why can’t we improve? There are a number of factors that make improvement difficult:
1. Healthcare Complexity
2. Poor Root Cause Analysis (RCA)
3. Inadequate Corrective Actions
4. Not Enough Management Attention
We will review all of these factors and what we can do about them in the following sections.
HEALTHCARE COMPLEXITY
Medical practice keeps getting more complex. More complex technology. More drugs with more interactions. More pressure to work faster and be more efficient. The result? More chances to make errors with catastrophic consequences. At the same time, downsizing means less staff to catch errors.
Healthcare complexity calls for increased, proactive application of system reliability and human factors solutions to improve healthcare delivery. Intelligent, resilient design can make complex systems reliable. Plus, staffing needs to be assessed to ensure adequate coverage to apply error-catching activities.
POOR ROOT CAUSE ANALYSIS
After a decade of using RCA to analyze sentinel events, the lack of progress indicates a failure of healthcare root cause analysis.
What’s wrong? A majority of healthcare facilities use inadequate RCA systems including fishbone diagrams, 5-Whys, and healthcare derived root cause checklists. These “simple” techniques are inadequate to analyze complex healthcare sentinel events.
Not only are the RCA systems inadequate, the RCA training is also inadequate. People are assigned to investigate healthcare sentinel events with little or no training. They are lucky to attend a free one- to eight-hour session provided at a professional society meeting or sponsored by an insurance provider.
But healthcare investigators face another factor that makes root cause analysis even more difficult: BLAME. More than your everyday blame that comes with every accident. Medical malpractice seems designed to make people less open – less willing to cooperate wholeheartedly with investigators.
Furthermore, doctors who are independent contractors are naturally suspicious of investigators who seem to question their judgment and put their credentials at risk. Is it any wonder that we haven’t made progress?
Despite some of the factors that are difficult to address, picking an advanced root cause analysis system and getting people trained shouldn’t be hard. After all, there is TapRooT®!
The TapRooT® System was designed to be used for simple and complex investigations. It has been applied successfully in healthcare settings and has improved performance of complex systems. The 2-Day and 5-Day TapRooT® Courses have been customized for on-site training of healthcare investigators to help them with demanding investigations. Problems solved!
POOR CORRECTIVE ACTIONS
Inadequate root cause analysis is just the start. Typically, we see the weakest corrective actions applied to prevent repeat sentinel events.
Those familiar with the terminology “hierarchy of controls” applied in industrial and process safety may know what I am pointing out. Healthcare corrective actions often include the application of new standards that depend on human reliability. When these fail, investigators recommend some of the “re” corrective actions, including: re-train, re-mind, and re-emphasize (discipline).
But these are the weakest possible corrective actions (see pages 127-129 in your 2008 TapRooT® Book). More effective corrective actions include another type of “re” corrective action: removing the hazard or the target. Or, re-engineering the process to improve system reliability and decrease human error without adding additional tasks for people to cope with.
These types of corrective actions and more are the result of a TapRooT® investigation when investigators apply the suggestions in the Corrective Action Helper® and apply Safeguards Analysis as part of the development of their solutions.
NOT ENOUGH MANAGEMENT ATTENTION
One might say that the cause of all the previous problems is inadequate management attention to performance improvement at healthcare facilities. Part of this inattention can probably be attributed to the fact that most healthcare administrators aren’t trained in advanced performance improvement techniques. Even the few who have had Six Sigma training don’t know about advanced root cause analysis and, therefore, don’t know about the actions they could take to make performance improvement happen.
Plus, hospital administrators need to become more involved in the analysis, review, and approval of sentinel event investigations. Involvement can bring them face-to-face with the challenges people are experiencing in the field. Trained managers reviewing a SnapCharT® can see beyond blame to real action to improve performance. They can see their contribution to errors that come from understaffing and fatigue. They can become a knowledgeable part of the team fighting sentinel events.
SIMPLE PLAN TO IMPROVE
Each day, hundreds of lives are lost because we haven’t won the battle to defeat sentinel events. Don’t wait for the entire healthcare industry to wake up to the problems and solutions. Don’t wait for regulatory requirements to force your facility into action. Start today with the tools that are at hand.
1. Bring the message to management. Get them involved. They should feel that EVERY sentinel event at their facility is a personal failure to address the causes!
2. Adopt an advanced root cause analysis system – TapRooT® – including the latest root cause analysis software and database to help you learn from small incidents to prevent major sentinel events.
3. Get the training that your facility needs in root cause analysis. This includes training for hospital administrators, staff, and your performance improvement experts.
Start with a customized 2-Day TapRooT® Course for senior management. Follow that with a 2-Day TapRooT® Course for those who are frequently involved in sentinel event investigations and a 5-Day TapRooT® Course for those who facilitate sentinel event investigations.
4. Once you complete steps 1-3, you are ready to start continuous improvement efforts. Start by attending the TapRooT® Summit healthcare track to find out what other leaders in the field of healthcare are doing to continue improvement efforts.
Don’t wait. People are dying waiting for improvement to occur. Start today!
(Reprinted by permission from the February Root Cause Network™ Newsletter, Copyright © February, 2012)
As an ex-aircraft mechanic and a “sometimes gotta work on my own car” mechanic, I have in the past borrowed or made some of the tools pictured below. The questions remain:
Bad Access by Design?
Or a little bit of them all?
Finally, ever have one of your modified tools bite you back? Share your stories in the comment section.
The picture above is from an airport jet bridge in Frankfurt, Germany.
If you look at the ground level you can just make out the wheels that carry a very heavy load.
You might also notice that they have a guard to keep people away.
Why did I notice this?
Because last year at the Knoxville airport a Delta employee was run over by these wheels. It totally crushed one leg.
There was no guard when the accident happened. Instead, Delta had a policy that all employees should be clear before the jet bridge was moved and stay clear while it was in motion.
Obviously, this administrative control (SPAC in TapRooT® lingo) failed (SPAC Not Used).
However, a physical guard might be a better safeguard than an administrative control.
Next time I get a chance I will have to see if the corrective action from the Knoxville accident was to add guards on the Knoxville jet bridges.