Rafael Moure-Eraso, Chairman of the Chemical Safety Board, sent the letter below to Xcel Energy Inc., a utility headquartered in Minnesota. I’ve never seen an investigator write so strongly about a lack of cooperation with an investigation. Have you?
It would certainly be interesting to know more about what happened to cause the lack of cooperation.
The UK RAIB’s report had three “Learning Points”:
1. Repeated occurrences of the same or closely related faults are likely to be a symptom of an underlying problem. Systems should be in place to identify repeated faults and to implement effective remedial action.
2. Maintenance requirements, particularly those applying to equipment connected with safety (such as the maintenance of gate stops (paragraph 13)), should not be left to local interpretation but should be determined by a competent person and recorded in a maintenance document.
3. It is important that signallers and crossing keepers at crossings of this type are given an unobstructed view of the gates, where it is practicable to do so.
In an earlier posting, we laid out the Causal Factors immediately before the well blowout as described by Terry Barr.
Now someone else has helped us identify the Causal Factors associated with the well design and construction. The Committee on Energy and Commerce investigation into the well blowout has identified 5 Causal Factors in a letter to Tony Hayward dated June 14, 2010. That letter is also covered in a Wall Street Journal article.
I’ll summarize the Causal Factors here and let you read the details in the letter linked to above.
Well Design and Construction Causal Factors
Choice of the cheaper, but less safe, well completion liner option to complete the well.
Using too few casing centralizers for the well design.
Failure to perform a cement bond log.
Failure to circulate the mud prior to cementing per the API standard.
Failure to deploy the casing hanger lockdown sleeve prior to replacing the mud with seawater.
That makes a total of 12 Causal Factors for the incident BEFORE the blowout preventer failed.
The blowout preventer failure will have one or more Causal Factors, and the failures to contain and clean up the spill and minimize environmental damage will have multiple Causal Factors. Of course, multiple failures are “normal” in an accident of this significance. And when all these Causal Factors are analyzed for their root causes, there will be a significant number of ways that BP, and perhaps the industry, can learn from this accident and improve performance so that we don’t have to kill 11 workers and cause an environmental nightmare ever again.
One last note … All the Causal Factors mentioned here are based on publicly available information. We haven’t done any interviews or collected any first-hand information. It would be nice to see a fully qualified investigative team use advanced tools to perform a real root cause analysis on the first-hand data.
Also, I have posted the Congressional Letter below to make sure that it is available to those reviewing this article in the future…
This will be a difficult investigation. My guess is that there is more than one Causal Factor – more than just a failure of the blowout preventer – that led to this disaster.
It’s interesting to watch management statements that are initially blaming an “equipment failure” for the accident.
Let’s hope unbiased data is released so that we all can make up our own minds.
Sometimes people have an accident happen to them and nothing is learned. On the other hand, an accident can provide an opportunity to see problems in a different light.
Linda Kenney was the “victim” of a sentinel event. But the learning she has led after the sentinel event isn’t about how to prevent mistakes. Rather, she helped people see that doctors, patients, and their loved ones need support after these types of accidents.
It is still unknown why an operator started filling (with raffinate) an already full (98% full on his display) column.
And he continued to fill the column for a couple of hours longer than it would have taken to fill it if it had been empty.
Perhaps it was the level indicator that had never been calibrated in the past 10 years and indicated that the level was slightly decreasing while he continued to overfill the column (many maintenance items had been backlogged for a long time).
Perhaps it was the operators’ practice of making sure that they were at the upper end of the indicating level before starting up (a practice that was counter to the operating procedure that nobody followed).
Perhaps it was that the top of the operating range of the level indicator was only 15% up the column and there was no accurate level indication if you exceeded that level.
Perhaps it was that the second high level alarm failed to sound.
Perhaps it was the fatigue that slowly creeps up on an operator when they work weeks upon weeks without a day off and with extensive overtime (12-hour days and 7-day weeks).
Perhaps it was that he received no turnover on the plant status at the start of the shift and the log book only had a cryptic note about “packing” the column with “raf.”
Maybe it was all these combined.
Then, as he tried to start up the unit for the first time (he had never done a startup on this “simple” unit before) without a supervisor (who went to check on one of his kids with a broken arm) and without a relief for the other three plants he was already running (the relief was required by procedures but they were short on staff), while they also ran a safety meeting in his control room, he couldn’t understand why the process behaved strangely … Why pressure stayed too high … Why venting (using an alternate path because the normal path was out of service) didn’t work. Even talking to the supervisor on his cell didn’t give him any good ideas.
Then, when he tried to take fluid out of the column, he actually made the problem worse by causing rapid boiling of the raffinate and a huge overflow into a knock-out drum that was never sized for this type of overflow.
The result? Hot, flammable raffinate spewed forth from a stack (not a flare) and formed a large vapor cloud that reached an ignition source and caused a large explosion and fire.
This would have been less disastrous if some temporary, non-blast hardened trailers had not been located close to the stack. They were flattened. The majority of the 15 people killed in the blast and fire were killed in these trailers. Why was the waiver for these temporary trailers approved? Shouldn’t they have at least been “blast-proof”? The company’s risk assessment said this was a low risk area and that a large release of hydrocarbons was impossible (or at least highly unlikely).
And it all happened on this day in March of 2005 – five years ago.
People are already starting to forget the lessons learned (if they were learned) from this sad explosion and fire. But if you would like to review materials to keep the accident fresh in your memory, here is a wealth of information including reports and links to previous blog postings…
This looks like they should have been applying Equifactor® before the accident to handle the equipment reliability problems they were having.
Also, see the lessons learned at the end of the “AccidentRussianHydroPlant.pdf” that is linked to above. Do you think they were based on a thorough root cause analysis?
Wouldn’t it have been nice to see a real TapRooT® Investigation of this accident…
Imagine a good, complete summer SnapCharT®. And root causes identified for each Causal Factor by using the Root Cause Tree®. And corrective actions developed using the Corrective Action Helper® Module and SMARTER.
How much knowledge is lost because we don’t effectively investigate problems?
Some accidents are so historic that every accident investigator should know about them. The Challenger is one of those. It happened 24 years ago today. Dana Barclay, one of our TapRooT® Instructors with a Navy flight background, assisted with this massive investigation. Here is a link to the Report of the Presidential Commission:
The draft requires that all nuclear material licensees (companies that operate reactors and that use or manufacture nuclear material) demonstrate a positive nuclear safety culture. But how?
Here’s an idea…
One of the characteristics of a positive safety culture outlined in the draft policy statement is:
“The organization maintains a continuous learning environment in which opportunities to improve safety and security are sought out and implemented.”
The policy statement then includes examples. One example is:
“Personnel seek out and implement opportunities to improve safety and security performance.”
One great opportunity to demonstrate a site’s commitment to a positive safety culture is to have a team attend the TapRooT® Summit and implement best practices that they learn at the Summit. This demonstrates that personnel are seeking out and implementing “opportunities to improve safety and security performance.” Especially if you bring a couple of security folks with your Summit improvement team.
So, if you are planning how you can demonstrate to the NRC that you have a positive safety culture, don’t forget to explain how your improvement team attending the Summit is an example of efforts to maintain a continuous learning environment.
The UK Rail Accident Investigation Branch (RAIB) has released its annual report, which covers the operational period of 2008. The RAIB published 27 investigation reports and 3 bulletins in 2008. This total includes 1 report into an investigation opened in 2006, 21 reports into investigations opened in 2007, and 5 reports into investigations opened in 2008. In total, these reports contained 181 recommendations. For the complete report, see:
Ladder safety is a tough issue. In the UK there have been rumors that the UK H&SE has banned the use of all ladders. The Ladder Book tries to dispel these rumors and help people learn how to choose the right ladder for the right job.
Here’s a link to download a PDF of this mini-book:
The BC Safety Authority (BCSA) has released its State of Safety Report 2008 which provides an overview of reported incidents related to the technologies that it regulates.
The report also summarizes the outcome of inspections carried out by its safety officers, analyses inspection data and identifies safety risks.
In 2008, there were a total of 456 incidents reported to the BCSA. This represents a 5% decline from the 483 reported incidents in 2007. Only 359 were directly related to regulated equipment or operations under the BCSA’s jurisdiction.
Minor injuries fell to 80 in 2008 from 151 in 2007. Major injuries rose to 58 from 10 in the previous year. The rise in major injuries was largely caused by a single incident that sent 27 people to a hospital for carbon monoxide exposure. Two deaths were also reported due to separate gas-related incidents.
The report also summarizes the enforcement activities conducted by the BCSA. In 2008, a total of 625 enforcement actions were carried out, most of which were compliance orders and suspensions of permit privileges.
The BCSA regulates the following seven technologies:
• Boilers, pressure vessels and refrigeration systems
• Electrical equipment and systems
• Elevating devices (elevators and escalators)
• Gas appliances and systems
• Passenger ropeways (including tramways, gondolas and ski lifts)
• Railways
The BCSA considers the data it collects every year as fundamental to its operations.
According to Harry Diemer, the BCSA’s President and Chief Executive Officer, “The data allows us to identify high-risk areas and create strategies to reduce risk and prevent accidents across our province.”
“Year-over-year safety will improve as we continue to develop tools such as risk control plans, incident investigation skills and root cause analysis to reduce risk and prevent accidents.”
Diemer also pointed out that education was “a major initiative and priority” in 2008.
“There must be much wider, and better, use of root-cause analysis, which is an investigative method that seeks to identify the underlying causes of an incident, with a view to preventing its repetition.”
“There are serious deficiencies in the undergraduate medical curriculum, Tomorrow’s Doctors, which are detrimental to patient safety, in respect of training in:
• clinical pharmacology and therapeutics;
• diagnostic skills;
• non-technical skills; and
• root-cause analysis.”
“The apparent paucity of effective root-cause analysis in the NHS, along with other potential drawbacks of self-investigation by NHS organisations, raises the question of whether there ought to be something akin to the Air Accident Investigation Branch for healthcare.”
We’ve had several people from the UK NHS come to TapRooT® Training. All had very positive comments. Perhaps it’s time for wider use of advanced root cause analysis in the UK health system?