The Summit starts next Wednesday, but I’ll be posting some of the talks so that people attending the Summit can print them to take notes and preview what they will hear.
Caution: A PowerPoint of a talk isn’t the same as the talk itself. In my talks, there are things said aloud that make the slides “come alive.” Also, some of the bullets on the slides are “for discussion”: they aren’t meant as final conclusions, but rather as starting points. Therefore, these talks are best experienced and discussed at the Summit rather than read as final statements.
Here is the Summit intro talk that will be delivered by Mark Paradies, Linda Unger, Ed Skompski, Ken Reed, Chris Vallee, and David Janney.
5-Day TapRooT® Advanced Root Cause Analysis Course in Macaé, Brazil, on November 8-12, in Portuguese
Posted: October 6th, 2010 in Courses, Documents, TapRooT
UK Rail Accident Investigation Branch Publishes a Report on the Derailment at Dingwall, Scotland, 22 January 2010
Posted: September 30th, 2010 in Accidents, Current Events, Documents, Investigations
Safety & Health Practitioner (the official magazine of the Institute of Occupational Safety & Health) published an article titled: “Blame culture prominent on Transocean rigs”.
The first paragraph said:
“A culture of fear and blame is rife across the operations of offshore drilling contractor Transocean, according to a leaked HSE inspection report.”
Wow! That’s certainly an “explosive” claim. Especially with the weight of the UK Health and Safety Executive behind it.
For the complete story, see:
Teaching root cause analysis, I have lots of people ask me about 12-hour shifts, fatigue, and safety. If this is a question you are interested in, I have a free publication that might interest you …
It is written by the experts in fatigue and shift scheduling, Circadian Technologies. Get your copy at:
The American Petroleum Institute and the American National Standards Institute have published a recommended practice titled: Fatigue Risk Management Systems for Personnel in the Refining and Petrochemical Industries (ANSI/API Recommended Practice 775, First Edition, April 2010).
You can download the standard at this site:
Now, what do incident investigators need to know about this standard when performing a root cause analysis? If you are at a refinery or petrochemical plant, you are required to consider fatigue when doing your investigation. The standard says:
4.7 Incident/Near Miss Investigation
The investigation of incidents should be conducted in a manner that facilitates the determination of the role, if any, of fatigue as a root cause or contributing cause to the incident. Information collected should include the time of the incident; the shift pattern, including the number of consecutive shifts worked; the number of hours awake; the number of hours of sleep in the past 24 hours by the individuals involved; the shift duration (and any overtime worked); whether the incident occurred under normal operations or an extended shift; whether an outage was occurring; and other fatigue factors. It should be noted that for individual incidents, often no definitive conclusion regarding the role of fatigue may be possible. However, aggregate analysis of incidents may reveal patterns suggestive of the role of fatigue that are not apparent when evaluating incidents individually.
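The list of facts the standard asks investigators to record can be sketched as a simple data-collection structure. This is only an illustrative sketch of my own; the field names, types, and screening thresholds are assumptions, not part of ANSI/API RP 775:

```python
from dataclasses import dataclass

@dataclass
class FatigueData:
    """Fatigue-related facts to record for each person involved in an
    incident (field names are illustrative, not from the standard)."""
    incident_time: str           # time of the incident
    shift_pattern: str           # e.g., "12-hour days, 7-day weeks"
    consecutive_shifts: int      # number of consecutive shifts worked
    hours_awake: float           # hours awake at the time of the incident
    sleep_last_24h: float        # hours of sleep in the past 24 hours
    shift_duration_hours: float  # scheduled shift length
    overtime_hours: float = 0.0  # any overtime worked
    extended_shift: bool = False # did the incident occur on an extended shift?
    outage_in_progress: bool = False

def fatigue_flags(d: FatigueData) -> list:
    """Return simple screening flags for aggregate analysis.
    Thresholds are illustrative; the standard does not prescribe them."""
    flags = []
    if d.hours_awake > 17:
        flags.append("extended wakefulness")
    if d.sleep_last_24h < 6:
        flags.append("insufficient recent sleep")
    if d.shift_duration_hours + d.overtime_hours > 14:
        flags.append("long work period")
    return flags
```

Collecting these fields consistently for every incident is what makes the aggregate pattern analysis the standard mentions possible, even when no single incident yields a definitive fatigue conclusion.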
When using TapRooT®, fatigue has always been considered as part of the “15 Questions” asked for every Human Performance Difficulty. The first question asks:
“Was the person excessively fatigued, impaired, upset, distracted, or overwhelmed?”
This question is expanded on in the Root Cause Tree® Dictionary. These questions were developed with the help of Circadian Technologies. We also worked with them to develop a free, internet-based fatigue evaluation tool called FACTS (Fatigue Accident/Incident Causation Testing System). You can try it for free at:
Want to find out more about fatigue and FACTS? Then attend the TapRooT® Summit. Rainer Gutkuhn (one of the designers of FACTS) will show attendees in the Behavior Change & Stopping Human Error Track how to use FACTS in an investigation.
That’s just one of the many great sessions at the Summit. See:
for the complete Summit Schedule.
Here’s an old document (1979 – in pdf format) where Admiral Rickover set out his tenets that assured reactor safety in the Nuclear Navy.
We will be discussing his philosophy at the TapRooT® Summit in the “Lessons Learned About Excellence and Safety From Admiral Rickover” session in the Improvement Program Best Practices Track on Thursday from 10:40-12.
Don’t miss this session, where you can learn how Process Safety Management and Operational Excellence originated.
To register for the Summit, go to:
And if you were a Navy Nuc … leave your comments here about your experience in the Nuclear Navy and how it changed your approach to operations, maintenance, or life.
The Nuclear Regulatory Commission has developed 13 safety culture components that were updated and released earlier this year. They are:
- Work Control
- Work Practices
- Corrective Action Program
- Operating Experience
- Self and Independent Assessments
- Environment for Raising Safety Concerns
- Preventing, Detecting, and Mitigating Perceptions of Retaliation
- Continuous Learning Environment
- Organizational Change Management
- Safety Policies
To read more about these safety culture components, see this NRC document:
Rafael Moure-Eraso, Chairman of the Chemical Safety Board, sent the letter below to Xcel Energy Inc., a utility headquartered in Minnesota. I’ve never seen such a strongly worded letter from an investigator about a lack of cooperation with an investigation. Have you?
It would certainly be interesting to know more about what happened to cause the lack of cooperation.
Here’s a link to the letter:
Here’s a pdf of the letter:
Monday Accident & Lessons Learned: UK Rail Accident Investigation Branch Publishes Bulletin About a Train Collision with a Level Crossing Gate
Posted: July 26th, 2010 in Accidents, Current Events, Documents, Investigations
The UK RAIB’s report had three “Learning Points”:
1. Repeated occurrences of the same or closely related faults are likely to be a symptom of an underlying problem. Systems should be in place to identify repeated faults and to implement effective remedial action.
2. Maintenance requirements, particularly those applying to equipment connected with safety (such as the maintenance of gate stops (paragraph 13)), should not be left to local interpretation but should be determined by a competent person and recorded in a maintenance document.
3. It is important that signallers and crossing keepers at crossings of this type are given an unobstructed view of the gates, where it is practicable to do so.
To read the whole article, see:
Here’s a PDF of Robert Bea’s Preliminary Findings About the BP/Transocean Deepwater Horizon Accident
Posted: July 6th, 2010 in Accidents, Current Events, Documents, Investigations, Root Causes
Here’s a PDF of the preliminary BP Investigation downloaded from the House of Representatives Energy and Commerce web site:
Review the slides and see what you think.
Compare their four “critical factors” to the multiple Causal Factors at these two links:
What are they missing if they don’t look at additional Causal Factors?
Anything else that you see about this investigation presentation that makes it easy or hard to understand?
Please leave your comments.
When the unthinkable happens will you be ready?
Open the PDF and see what you can learn before the Summit.
Here’s the link to register for the course:
Monday Accident & Lessons Learned: Fatal Tram Pedestrian Accident Investigation: An Interesting Investigation with Good Recommendations
Posted: June 21st, 2010 in Accidents, Current Events, Documents, Investigations
For the investigation report from the UK RAIB about an accident in Norbreck, UK, see:
In an earlier posting, we laid out the Causal Factors immediately before the well blowout as described by Terry Barr.
Now someone else has helped us identify the Causal Factors associated with the well design and construction. The Committee on Energy and Commerce investigation into the well blowout has identified 5 Causal Factors in a letter to Tony Hayward dated June 14, 2010. That letter is also covered in a Wall Street Journal article.
I’ll summarize the Causal Factors here and let you read the details in the letter linked to above.
Well Design and Construction Causal Factors
- Choice of the cheaper, but less safe, well completion liner option to complete the well.
- Using too few casing centralizers for the well design.
- Failure to perform a cement bond log.
- Failure to circulate the mud prior to cementing per the API standard.
- Failure to deploy the casing hanger lockdown sleeve prior to replacing the mud with seawater.
That makes a total of 12 Causal Factors for the incident BEFORE the blowout preventer failed.
The blowout preventer failure will have one or more Causal Factors, and the failures to contain and clean up the spill and minimize environmental damage will have multiple Causal Factors of their own. Of course, multiple failures are “normal” in an accident of this significance. And when all these Causal Factors are analyzed for their root causes, there will be a significant number of ways that BP, and perhaps the industry, can learn from this accident and improve performance so that we never again have to kill 11 workers and cause an environmental nightmare.
One last note … All the Causal Factors mentioned here are based on publicly available information. We haven’t done any interviews or collected any first-hand information. It would be nice to see a fully qualified investigative team use advanced tools to perform a real root cause analysis on the first-hand data.
Also, I have posted the Congressional Letter below to make sure that it is available to those reviewing this article in the future…
Mark’s Talk About the Heinrich Pyramid (Safety Pyramid) at the European Safety Committee of the Conference Board
Posted: June 1st, 2010 in Current Events, Documents, Performance Improvement, Pictures, Presentations
That’s me and the interested participants at the Conference Board…
Below is a copy of a PDF of the PowerPoint that I used.
For those who are really following this, here are some interesting links…
Forbes on what BP knows and speculation on causes:
Lawyer’s comments on the BOP failure:
Engineering Ethics Blog:
Investment information from a Wall Street source about the facts, causes, and liability (neat stuff):
First, hear a survivor account of the accident…
There are two parts. Both are interesting.
Then see these pictures …
Here’s the pdf that was sent to me from an oil industry source that has the pictures in it…
This will be a difficult investigation. My guess is that there is more than one Causal Factor – more than just a failure of the blowout preventer – that led to this disaster.
It’s interesting to watch management statements that are initially blaming an “equipment failure” for the accident.
Let’s hope unbiased data is released so that we all can make up our own minds.
Sometimes an accident happens and nothing is learned. On the other hand, an accident can provide an opportunity to see problems in a different light.
Linda Kenney was the “victim” of a sentinel event. But the learning she has led since then isn’t about how to prevent mistakes. Rather, she has helped people see that doctors, patients, and their loved ones need support after these types of accidents.
Read about her story at:
It is still unknown why an operator started filling (with raffinate) an already full (98% full on his display) column.
And he continued filling the column for a couple of hours longer than it would have taken to fill it if it had been empty.
Perhaps it was the level indicator, which had never been calibrated in the past 10 years and indicated that the level was slightly decreasing while he continued to overfill the column (many maintenance items had been backlogged for a long time).
Perhaps it was the operators’ practice of making sure that they were at the upper end of the indicating level before starting up (a practice that was counter to the operating procedure that nobody followed).
Perhaps it was that the top of the operating range of the level indicator was only 15% up the column and there was no accurate level indication if you exceeded that level.
Perhaps it was that the second high level alarm failed to sound.
Perhaps it was the fatigue that slowly slips up on an operator when they work weeks upon weeks without a day off and with extensive overtime (12-hour days and 7-day weeks).
Perhaps it was that he received no turnover on the plant status at the start of the shift and the log book only had a cryptic note about “packing” the column with “raf.”
Maybe it was all these combined.
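The level-indication problem above is worth making concrete. Here is a minimal sketch (my own illustration; the span value is taken from the narrative above, everything else is assumed) of why a transmitter calibrated over only the bottom 15% of a column becomes useless once the true level climbs past its range:

```python
def indicated_percent(true_level_pct: float, span_top_pct: float = 15.0) -> float:
    """The display shows percent of the transmitter's SPAN, not percent
    of the column. Once the true level exceeds the top of the span, the
    reading pegs at 100% and can no longer distinguish a column that is
    20% full from one that is 90% full. Illustrative sketch only."""
    return min(true_level_pct / span_top_pct, 1.0) * 100.0
```

So a display reading near the top of its range tells the operator almost nothing about how far above the span the real level is, and an uncalibrated, drifting sensor can even show the level slowly "decreasing" while the column keeps filling.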
Then, as he tried to start up the unit for the first time (he had never done a startup on this “simple” unit before), without a supervisor (who had left to check on one of his kids, who had a broken arm), without a relief operator for the other three plants he was already running (a relief was required by procedures, but they were short on staff), and with a safety meeting running in his control room, he couldn’t understand why the process behaved strangely … why pressure stayed too high … why venting (using an alternate path because the normal path was out of service) didn’t work. Even talking to the supervisor on his cell didn’t give him any good ideas.
Then, when he tried to take fluid out of the column, he actually made the problem worse by causing rapid boiling of the raffinate and a huge overflow into a knock-out drum that was never sized for this type of overflow.
The result? Hot, flammable raffinate spewed forth from a stack (not a flare) and formed a large vapor cloud that reached an ignition source and caused a large explosion and fire.
This would have been less disastrous if some temporary, non-blast hardened trailers had not been located close to the stack. They were flattened. The majority of the 15 people killed in the blast and fire were killed in these trailers. Why was the waiver for these temporary trailers approved? Shouldn’t they have at least been “blast-proof”? The company’s risk assessment said this was a low risk area and that a large release of hydrocarbons was impossible (or at least highly unlikely).
And it all happened on this day in March of 2005 – five years ago.
People are already starting to forget the lessons learned (if they were learned) from this sad explosion and fire. But if you would like to review materials to keep the accident fresh in your memory, here is a wealth of information including reports and links to previous blog postings…
Baker Panel Report: Baker_panel_report.pdf
BP Press Conference Call About Baker Panel Report: BakerPanelConfCall.pdf
Bonse (Discipline) Report: Bonse Main Report.pdf
BP Accident Report (Mogford Report): Link to report
Telos Report: Link to report
Brown Gets It Movie … (Quicktime .mov format – click below to play) …
Extensive evidence from Texas City Lawsuits: Link to web site
US CSB Report and Information: Link to CSB web site
Early e-Newsletter Articles: Link
Lessons Learned Talk by John Mogford: Link
Mark Gets Mad After Interim CSB Report: Link
Cost of US CSB Investigation: Link
BP Annual Report Note Story in Blog: Link
Interesting Deposition Videos: Link
Deposition Shows How Hard it is to Justify Performance After an Accident (Whole Deposition in Written Form from Don Paris): Link
Blog Post About Instrumentation Problems: Link
Blog Post on Cost Cutting Controversy: Link
Blog Post on Fatigued Operators: Link
Blog Post About BP Pleading Guilty to Felony: Link
Blog Post on BP Texas City Accident Cost: Link
Blog Post About BP CEO Admitting that He Never Read the US CSB Report: Link
Blog Post on EPA Fine: Link
Blog Post on $87 Million OSHA Fine for Missing Corrective Action Deadlines: Link
That should give you plenty to read to help you learn all there is to learn from the BP Texas City Refinery Explosion.
Now let’s take a few minutes to remember those who died to teach us these lessons.
We reported on the Sayano-Shushenskaya Hydro Accident previously at:
The accident resulted in 74 deaths and losses in the billions of dollars.
A TapRooT® User sent me some new information that I found interesting.
First, here is a pdf with lots of pictures and some analysis:
Here are a few of the pictures…
Second is a DOE web page with lessons learned. See:
It looks like they should have been applying Equifactor® before the accident to handle the equipment reliability problems they were having.
Also, see the lessons learned at the end of the “AccidentRussianHydroPlant.pdf” that is linked to above. Do you think they were based on a thorough root cause analysis?
Wouldn’t it have been nice to see a real TapRooT® Investigation of this accident…
Imagine a good, complete Summer SnapCharT®. And root causes identified for each Causal Factor by using the Root Cause Tree®. And corrective actions developed using the Corrective Action Helper® Module and SMARTER.
How much knowledge is lost because we don’t effectively investigate problems?