Author Archives: Mark Paradies
A failure occurs. It could be:
- a safety related accident
- an equipment failure
- a patient safety event (sentinel event)
- a quality issue
- a shipping screw up
- a cost overrun
- a process safety related near-miss
What people do next can make a world of difference.
First, is the failure (incident or near-miss) reported? Or is it covered up?
If you are reading this you probably think that your company should learn from its mistakes to keep the mistakes from happening again. (Or to keep something even worse from happening – like the picture above.)
But if mistakes and failures are hidden, learning is unlikely.
People must know that it is safe to report a problem and that, once a problem is reported, something will be done to improve the process to make the problem go away.
Punishing the person who reported the problem or punishing someone else involved in the failure IS NOT the kind of action that will promote more reporting of failures.
OK … You have established a culture where the reporting of problems is not punished. You may even have a culture where the reporting of problems is an expected part of how you do business. NOW WHAT?
Do people know how to preserve the evidence of the failure so that an effective root cause analysis can be performed?
You might be surprised that most folks don’t know how to preserve the scene of an accident.
They don’t know that disassembling broken equipment may destroy the evidence of why the equipment broke.
They may not collect the names of everyone involved (including contractors and first responders).
They may “clean things up” to get back to normal housekeeping standards.
They may let vital fluid samples slip away.
They may even collect “souvenirs” to take home.
Reporting the failure really doesn’t help if the evidence of the failure is destroyed before the root cause analysis starts.
What are you doing to train your supervisors to preserve the scene of a failure?
I have two suggestions.
1. Have training for them on evidence collection and interviewing.
We have a TapRooT® Course that can help supervisors secure the scene of an accident and have a much better idea of what they need to do when responding to a failure.
The course can be customized to teach just the information that you think your supervisors need.
The complete 2-Day TapRooT® Effective Interviewing & Evidence Collection Course has essential information that supervisors need to stop evidence destruction and help conduct interviews of those involved. See the course outline at:
Barb Phillips, the course designer, will be happy to talk to you about customizing the course for your supervisors to give them the knowledge and practice that they need to be ready to effectively respond to a failure. To talk to Barb, call 865-548-8990. Or email her by using this LINK.
2. Your equipment folks need training in equipment troubleshooting and failure analysis.
We have another course designed for equipment troubleshooters to help them avoid the destruction of evidence when they respond to an equipment failure. The 2-Day Equifactor® Equipment Troubleshooting and TapRooT® Root Cause Analysis Course will help them develop a troubleshooting plan that will preserve the evidence they need to troubleshoot the problem and find the problem’s root causes.
Again, the Equifactor® Course can be customized to meet the needs of your troubleshooters. Call Ken Reed, the course creator, at 865-539-2139 to discuss ways to make your training targeted to your workforce. Or contact him by e-mail at this LINK.
Whatever you do … DON’T sit back and wait for the next accident and assume that your folks will respond appropriately. I can assure you that if hoping for the best is your strategy … you will be sadly disappointed.
Saw an interesting article in Hydrocarbon Processing titled:
That reminded me of the Amoco refineries that were sold to BP and had a horrible safety record.
Regulators should have a red flag for any assets covered under a PSM program. If they are being sold, INSPECT!
Perhaps this could stop management from excessive cost cutting pre-sale to boost the bottom line at the expense of safety and the environment.
You may have reviewed the new regulations for process safety at California refineries. This is a major change to the standard PSM rules in the USA for California refineries.
Here is the section from the “Incident Investigation” portion of the rule…
– – –
(o) Incident Investigation – Root Cause Analysis.
- The employer shall develop, implement and maintain effective written procedures for promptly investigating and reporting any incident that results in, or could reasonably have resulted in, a major incident.
- The written procedures shall include an effective method for conducting a thorough Root Cause Analysis.
- The employer shall initiate the incident investigation as promptly as possible, but no later than 48 hours following an incident. As part of the incident investigation, the employer shall conduct a Root Cause Analysis.
- The employer shall establish an Incident Investigation Team, which at a minimum shall consist of a person with expertise and experience in the process involved; a person with expertise in the employer’s Root Cause Analysis method; and a person with expertise in overseeing the investigation and analysis. The employer shall provide for employee participation pursuant to subsection (q). If the incident involved the work of a contractor, a representative of the contractor’s employees shall be included on the investigation team.
- The Incident Investigation Team shall implement the employer’s Root Cause Analysis method to determine the initiating causes of the incident. The analysis shall include an assessment of management system failures, including organizational and safety culture deficiencies.
- The Incident Investigation Team shall develop recommendations to address the findings of the Root Cause Analysis. The recommendations shall include interim measures that will prevent a recurrence or similar incident until final corrective actions can be implemented.
- The team shall prepare a written investigation report within ninety (90) calendar days of the incident. If the team demonstrates in writing that additional time is needed due to the complexity of the investigation, the team shall prepare a status report within ninety (90) calendar days of the incident and every thirty (30) calendar days thereafter until the investigation is complete. The team shall prepare a final investigation report within five (5) months of the incident.
- Investigation reports shall include:
(A) The date and time of the incident;
(B) The date and time the investigation began;
(C) A detailed description of the incident;
(D) The factors that caused or contributed to the incident, including direct causes, indirect causes and root causes, determined through the Root Cause Analysis;
(E) A list of any DMR(s), PHA(s), SPA(s), and HCA(s) that were reviewed as part of the investigation;
(F) Documentation of relevant findings from the review of DMR(s), PHA(s), SPA(s) and HCA(s);
(G) The Incident Investigation Team’s recommendations; and,
(H) Interim measures implemented by the employer.
- The employer shall implement all recommendations in accordance with subsection (x).
- The employer shall complete an HCA in a timely manner for all recommendations that result from the investigation of a major incident. The employer shall append the HCA report to the investigation report.
- Investigation reports shall be provided to and upon request, reviewed with employees whose job tasks are affected by the incident. Investigation reports shall also be made available to all operating, maintenance and other personnel, including employees of contractors where applicable, whose work assignments are within the facility where the incident occurred or whose job tasks are relevant to the incident findings. Investigation reports shall be provided to employee representatives and, where applicable, contractor employee representatives.
- Incident investigation reports shall be retained for the life of the process unit.
– – –
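If you track investigation due dates in a script rather than on a wall calendar, the timing requirements quoted above (48 hours to start, 90 calendar days for the report or first status report, status reports every 30 calendar days after that, and a final report within five months) can be sketched roughly as follows. This is only an illustration: the function names and dictionary keys are hypothetical, and treating "five months" as five calendar months from the incident is my assumption, not something the rule text spells out.

```python
from datetime import datetime, timedelta
import calendar


def add_months(d, months):
    """Shift d forward by whole calendar months, clamping to the month's last day."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return d.replace(year=year, month=month, day=day)


def investigation_deadlines(incident_time):
    """Return the due dates implied by the quoted subsection (o) timing rules.

    Assumption: "five months" is counted as calendar months from the incident.
    """
    first_report = incident_time + timedelta(days=90)
    final_report = add_months(incident_time, 5)

    # Follow-up status reports fall every 30 calendar days after the first
    # one; list only those that come due before the final report deadline.
    follow_ups = []
    due = first_report + timedelta(days=30)
    while due < final_report:
        follow_ups.append(due)
        due += timedelta(days=30)

    return {
        "start_investigation_by": incident_time + timedelta(hours=48),
        "report_or_first_status_by": first_report,
        "follow_up_status_by": follow_ups,
        "final_report_by": final_report,
    }


if __name__ == "__main__":
    # Hypothetical incident time, used only to show the output shape.
    for name, value in investigation_deadlines(datetime(2017, 3, 1, 6, 30)).items():
        print(name, value)
```

The follow-up status dates only matter if the team has documented in writing that it needs more time; otherwise the 90-day date is the report deadline.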
TapRooT® Users already find management system, organizational, and cultural related root causes or generic causes that contributed to incidents they investigate. They also know about the hierarchy of controls (part of HCA analysis) and Safeguard Analysis (part of SPA) when developing corrective actions.
TapRooT® has always been ahead of its time in finding human factors related causes of incidents. Thus, TapRooT® Root Cause Analysis fits well with the Human Factors section of the California regulation…
– – –
(s) Human Factors.
- The employer shall develop, implement and maintain an effective written Human Factors program within eighteen (18) months following the effective date of this section.
- The employer shall include a written analysis of Human Factors, where relevant, in major changes, incident investigations, PHAs, MOOCs and HCAs. The analysis shall include a description of the selected methodologies and criteria for their use.
- The employer shall assess Human Factors in existing operating and maintenance procedures and shall revise these procedures accordingly. The employer shall complete fifty (50) percent of assessments and revisions within three (3) years following the effective date of this section and one hundred (100) percent within five (5) years.
- The Human Factors analysis shall apply an effective method in evaluating the following: staffing levels; the complexity of tasks; the length of time needed to complete tasks; the level of training, experience and expertise of employees; the human-machine and human-system interface; the physical challenges of the work environment in which the task is performed; employee fatigue and other effects of shiftwork and overtime; communication systems; and the understandability and clarity of operating and maintenance procedures.
- The Human Factors analysis of process controls shall include:
(A) Error-proof mechanisms;
(B) Automatic alerts; and,
(C) Automatic system shutdowns.
- The employer shall include an assessment of Human Factors in new operating and maintenance procedures.
- The employer shall train operating and maintenance employees in the written Human Factors program.
- The employer shall provide for employee participation in the Human Factors program, pursuant to subsection (q).
- The employer shall make available and provide on request a copy of the written Human Factors program to employees and their representatives and to affected contractors, employees of contractors, and contractor employee representatives, pursuant to subsection (q).
– – –
These initial drafts of the regulation were slightly modified at a public hearing last Fall. The modifications can be viewed at: http://www.dir.ca.gov/oshsb/documents/Process-Safety-Management-for-Petroleum-Refineries-15day.pdf
The California Occupational Safety and Health Standards Board is set to review the revisions and comments at a meeting being held after the comment period expires on March 3, 2017.
While the new rule is being modified prior to adoption, California TapRooT® Users should be happy to know that they are already using a system that helps them meet and exceed the regulation being developed.
The EPA announced in December their intention to finalize a new Risk Management Plan rule for facilities with highly hazardous chemicals. Of interest to readers of this blog, the new proposal for incident investigations requires root cause analysis using a recognized method.
Here is the proposed language:
(a) The owner or operator shall investigate each incident that:
- Resulted in a catastrophic release (including when the affected process is decommissioned or destroyed following, or as the result of, an incident); or
- Could reasonably have resulted in a catastrophic release (i.e., was a near miss).
(b) A report shall be prepared at the conclusion of the investigation. The report shall be completed within 12 months of the incident, unless the implementing agency approves, in writing, an extension of time. The report shall include:
- Date, time, and location of incident;
- A description of the incident, in chronological order, providing all relevant facts;
- The name and amount of the regulated substance involved in the release (e.g., fire, explosion, toxic gas loss of containment) or near miss and the duration of the event;
- The consequences, if any, of the incident including, but not limited to: injuries, fatalities, the number of people evacuated, the number of people sheltered in place, and the impact on the environment;
- Emergency response actions taken;
- The factors that contributed to the incident including the initiating event, direct and indirect contributing factors, and root causes. Root causes shall be determined by conducting an analysis for each incident using a recognized method; and
- Any recommendations resulting from the investigation and a schedule for addressing them.
With the new administration’s halt on new regulations, I’m not sure what will happen with this modification to an existing rule … so keep an eye out for the publication in the Code of Federal Regulations.
One last note if you were wondering … TapRooT® Root Cause Analysis is a recognized method.
I’ve heard many high level managers complain that they see the same problems happen over and over again. They just can’t get people to find and fix the problems’ root causes. Why does this happen and what can management do to overcome these issues? Read on to find out.
Blame is the number one reason for bad root cause analysis.
Because people who are worried about blame don’t fully cooperate with an investigation. They don’t admit their involvement. They hold back critical information. Often this leads to mystery accidents. No one knows who was involved, what happened, or why it happened.
As Bart Simpson says:
“I didn’t do it.”
“Nobody saw me do it.”
“You can’t prove anything.”
Blame is so common that people take it for granted.
Somebody makes a mistake and what do we do? Discipline them.
If they are a contractor, we fire them. No questions asked.
And if the mistake was made by senior management? Sorry … that’s not how blame works. Blame always flows downhill. At a certain senior level, management becomes blessed. Only truly horrific accidents like the Deepwater Horizon or Bhopal get senior managers fired or jailed. Then again, maybe those accidents aren’t bad enough for discipline for senior management.
Think about the biggest economic collapse in recent history – the housing collapse of 2008. What senior banker went to jail?
But be an operator who makes a simple mistake like pushing the wrong button, or a mechanic who doesn’t lock out a breaker while working on equipment? You may be fired or have the feds come after you to put you in jail.
Talk to Kurt Mix. He was a BP engineer who deleted a few text messages from his personal cell phone AFTER he had turned it over to the feds. He was the only person off the Deepwater Horizon who faced criminal charges. Or ask the two BP company men who represented BP on the Deepwater Horizon and faced years of criminal prosecution.
How do you stop blame and get people to cooperate with investigations? Here are two best practices.
A. Start Small …
If you are investigating near-misses that could have become major accidents and you don’t discipline people who spill the beans, people will learn to cooperate. This is especially true if you reward people for participating and develop effective fixes that make the work easier and their jobs less hazardous.
Small accidents just don’t have the same cloud of blame hanging over them so if you start small, you have a better chance of getting people to cooperate even if a blame culture has already been established.
B. Use a SnapCharT® to facilitate your investigation and report to management.
We’ve learned that using a SnapCharT® to facilitate an investigation and to show the results to management reduces the tendency to look for blame. The SnapCharT® focuses on what happened and “who did it” becomes less important.
Often, the SnapCharT® shows that there were several things that could have prevented the accident and that no one person was strictly to blame.
What is a SnapCharT®? Attend any TapRooT® Training and you will learn how to use them. See:
2. FIRST ASK WHAT NOT WHY
Ever see someone use 5-Whys to find root causes? They start with what they think is the problem and then ask “Why?” five times. Unfortunately, this easy method often leads investigators astray.
Because they should have started by asking what before they asked why.
Many investigators start asking why before they understand what happened. This causes them to jump to conclusions. They don’t gather critical evidence that may lead them to the real root causes of the problem. And they tend to focus on a single Causal Factor and miss several others that also contributed to the problem.
How do you get people to ask what instead of why?
Once again, the SnapCharT® is the best tool to get investigators focused on what happened. It helps them find the incident’s details, identify all the Causal Factors, and collect the information about each Causal Factor that they need to identify each problem’s root causes.
3. YOU MUST GO BEYOND YOUR CURRENT KNOWLEDGE
Many investigators start their investigation with a pretty good idea of the root causes they are looking for. They already know the answers. All they have to do is find the evidence that supports their hypothesis.
What happens when an investigator starts an investigation by jumping to conclusions?
They ignore evidence that is counter to their hypothesis. This problem is called confirmation bias.
It has been proven in many scientific studies.
But there is an even bigger problem for investigators who think they know the answer. They often don’t have the training in human factors and equipment reliability to recognize the real root causes of each of the Causal Factors. Therefore, they only look for the root causes they know about and don’t get beyond their current knowledge.
What can you do to help investigators look beyond their current knowledge and avoid confirmation bias?
Have them use the SnapCharT® and the TapRooT® Root Cause Tree® Diagram when finding root causes. You will be amazed at the root causes your investigators discover that they previously would have overlooked.
How can your investigators learn to use the Root Cause Tree® Diagram? Once again, send them to TapRooT® Training.
The TapRooT® Root Cause Analysis System can help your investigators overcome the top 3 reasons for bad root cause analysis. And that’s not all. There are many other advantages for management and investigators (and employees) when people use TapRooT® to solve problems.
If you haven’t tried TapRooT® to solve problems, you don’t know what you are missing.
If your organization faces:
- Quality Issues
- Safety Incidents
- Repeat Equipment Failures
- Sentinel Events
- Environmental Incidents
- Cost Overruns
- Missed Schedules
- Plant Downtime
You need to apply the best root cause analysis system: TapRooT®.
Learn more at:
And find the dates and locations for our public TapRooT® Training at:
Monday Accident & Lessons Learned: Chemical Safety Board Video of Explosion at Williams Olefins Plant in Geismar, Louisiana
Posted: January 30th, 2017 in Uncategorized
We have been working hard to make TapRooT® even better. Therefore, we have NEW things to share.
We have three new books that are available and three more that will be coming out in the first quarter. They are part of the new nine book set that will all be out by the end of 2017.
To see what is available now, CLICK HERE.
We’ve been updating our TapRooT® Training. Every course has had major improvements. Of course, the new courses include the new books, but there is much more that’s been improved to make TapRooT® easier to use and more effective. To find out more about our TapRooT® Courses, CLICK HERE.
Have you had a look at our new Version VI TapRooT® Software? It’s cloud-based and is device independent. Use it on your PC, Mac, or any tablet. CLICK HERE for more info.
IMPROVE YOUR ROOT CAUSE ANALYSIS BY USING THE LATEST TECHNOLOGY
The old TapRooT® Books, training, and software were good. The NEW TapRooT® Books, Training, and Software are even better. Don’t miss out on the advances in TapRooT® Technology. Get the latest by clicking on the links above and updating your technology.
Also, as more new books, courses, and software improvements are released as the year progresses, we will let you know by posting information here. Keep watch and keep up with the latest in advanced root cause analysis.
Are you sending people to our Public TapRooT® Training?
Or are you having a TapRooT® Course at your site?
Or are you arranging TapRooT® Training at one or more of your facilities around the world?
If you want to choose your dates, now is the time to get your onsite courses scheduled.
And if you want to choose a particular public course, now is the time to get your folks registered!
Did you make your New Year’s resolutions? Your ideas to improve your performance next year?
In many companies, you are expected to have plans to improve performance. Better production performance, quality, equipment reliability, safety, process safety, and financial performance are all expected parts of the normal year-to-year improvement process. If you are leading any of these improvement efforts, you better have a plan.
What if you could do something to improve both your personal performance and your company’s performance? Would that be interesting?
Plan to attend a TapRooT® Root Cause Analysis Course!
What are you waiting for? TapRooT® Root Cause Analysis is proven by leading companies around the world to help them find and fix the root causes of performance problems. And the TapRooT® System can be used proactively to stop problems before major incidents happen. This can lead to improved financial performance in addition to improved safety, quality, equipment reliability, and production performance.
But beyond that, you will be adding an advanced skill to your toolbox that you can use for the rest of your career. Think of it as a magic problem-solving wand that you can use to astound others by the improvement initiatives you will lead. This can lead to promotions and personal financial gain. Sounds like a great personal improvement program.
Now is the time to make your plans for 2017. Get your courses scheduled. Get ready to make your skills better and your company a better place to work.
Read this story about a recent BP internal audit:
You can see why many managers don’t want written reports critical of any safety or environmental performance.
Does your company have any practices to mitigate bad press from internal audits?
A Report from the UK Rail Accident Investigation Branch:
Structural failure caused by scour at Lamington viaduct, South Lanarkshire, 31 December 2015
At 08:40 hrs on Thursday 31 December 2015, subsidence of Lamington viaduct resulted in serious deformation of the track as the 05:57 hrs Crewe to Glasgow passenger service passed over at a speed of about 110 mph (177 km/h). The viaduct spans the River Clyde between Lockerbie and Carstairs. Subsequent investigation showed that the viaduct’s central river pier had been partially undermined by scour following high river flow velocity the previous day. The line was closed for over seven weeks until Monday 22 February 2016 while emergency stabilisation works were completed.
The driver of an earlier train had reported a track defect on the viaduct at 07:28 hrs on the same morning, and following trains crossed the viaduct at low speed while a Network Rail track maintenance team was deployed to the site. The team found no significant track defects and normal running was resumed with the 05:57 hrs service being the first train to pass on the down line. Immediately after this occurred at 08:40 hrs, large track movements were noticed by the team, who immediately imposed an emergency speed restriction before closing the line after finding that the central pier was damaged.
The viaduct spans a river bend which causes water to wash against the sides of the piers. It was also known to have shallow foundations. These were among the factors that resulted in it being identified as being at high risk of scour in 2005. A scheme to provide permanent scour protection to the piers and abutments was due to be constructed during 2015, but this project was deferred until mid-2016 because a necessary environmental approval had not been obtained.
To mitigate the risk of scour, the viaduct was included on a list of vulnerable bridges for which special precautions were required during flood conditions. These precautions included monitoring of river levels and closing the line if a pre-determined water level was exceeded. However, this process was no longer in use and there was no effective scour risk mitigation for over 100 of the most vulnerable structures across Scotland. This had occurred, in part, because organisational changes within Network Rail had led to the loss of knowledge and ownership of some structures issues.
Although unrelated to the incident, the RAIB found that defects in the central river pier had not been fully addressed by planned maintenance work. There was also no datum level marked on the structure which meant that survey information from different sources could not easily be compared to identify change.
As a result of this investigation, RAIB has made three recommendations to Network Rail relating to:
- the management of scour risk
- the response to defect reports affecting structures over water
- the management of control centre procedures.
Five learning points are also noted relating to effective management of scour risk.
For more information, see:
SHP reported that a worker at the Carlsberg brewery died and 22 others were injured by a cooling system ammonia leak.
Are you using advanced root cause analysis to investigate near-misses and stop major accidents? Major accidents can be avoided. That’s a lesson that all facilities with hazards should learn. For current advanced root cause analysis public courses being held around the world, see:
TapRooT® can be used for both low to medium risk incidents (including near-misses) and major accidents. For people who will normally be investigating low risk incidents, the 2-Day TapRooT® Root Cause Analysis Course is recommended.
For people who will investigate all types of incidents including near-misses and incidents with major consequences (or a potential for major consequences), we recommend the 5-Day Advanced Team Leader Training.
Don’t wait! If you haven’t attended TapRooT® Training, get signed up today!
Teenagers seem to have no concept of how far from death they are. Very few over 25 would do this…