In our 5-Day TapRooT® Advanced Root Cause Analysis Team Leader Training Course and in our TapRooT® book, TapRooT® Changing the Way the World Solves Problems, we introduce the Critical Human Action Profile (CHAP) tool to help collect more information to analyze any type of problem at the process task level. I like to call this looking at a problem at the 1-foot level, as opposed to the many investigations that analyze their problems only from the 100,000-mile view.
The tip here, however, is “why wait for a problem to use CHAP?”
Identify, Evaluate and Improve before it is too late!
Using a very oversimplified list of procedure steps on How to Remove a Fuel Pump, found on the internet, I would like to show you how to use CHAP proactively to improve Safety and Quality during a task.
WARNING: The steps listed in the demonstration example below on removing a fuel pump shall not be used. They are incomplete and not necessarily accurate.
Where to start? First off, you already perform JHA, AHA, JSA, Observations… so Going Out and Looking (GOAL) should not be new or require a lot of additional resources. The difference is that you will be using your resources more efficiently.
1. Start by identifying a task performed by employees that is critical to:
a. Customer/client satisfaction
b. Product Quality
c. Project Timeliness
d. Employee Safety
e. Customer Safety
f. Environmental Exposure
2. Once the task is identified, list the steps to be performed, as listed in the image below.
Note: Do not forget to use the Basic Cause Category Procedure in our TapRooT® Root Cause Tree to look for missing best practices as well when listing the steps.
3. Identify each step of the task that is critical to the criteria listed in step 1 of this article.
Which steps listed above for the fuel pump removal do you think would be listed as critical?
4. For each critical step in the task, perform a CHAP.
Note: For each of the items listed below, do not forget to include the Best Practices listed under the Human Engineering Basic Cause Category in our TapRooT® Root Cause Tree.
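The four steps above can be sketched in a few lines of Python. This is only an illustration of the screening logic, not part of the TapRooT® tools: the `Step` class and `steps_needing_chap` function are names made up for this sketch, and the three sample steps are hypothetical placeholders (not a real fuel pump procedure). The six criteria are the ones from step 1.

```python
from dataclasses import dataclass, field

# The six criteria from step 1 of the article.
CRITERIA = (
    "customer/client satisfaction",
    "product quality",
    "project timeliness",
    "employee safety",
    "customer safety",
    "environmental exposure",
)


@dataclass
class Step:
    """One step of a task (hypothetical structure for this sketch)."""
    number: int
    description: str
    # Criteria the reviewer judged this step to be critical to (step 3).
    critical_to: set = field(default_factory=set)

    def __post_init__(self):
        # Reject anything that is not one of the step 1 criteria.
        unknown = set(self.critical_to) - set(CRITERIA)
        if unknown:
            raise ValueError(f"unknown criteria: {sorted(unknown)}")

    @property
    def is_critical(self):
        return bool(self.critical_to)


def steps_needing_chap(steps):
    """Return the critical steps, each of which gets a CHAP (step 4)."""
    return [s for s in steps if s.is_critical]


# Hypothetical placeholder steps for illustration only.
# They are NOT a usable fuel pump removal procedure.
task = [
    Step(1, "Relieve fuel system pressure",
         {"employee safety", "environmental exposure"}),
    Step(2, "Remove access cover"),
    Step(3, "Disconnect fuel lines",
         {"employee safety", "product quality"}),
]

for step in steps_needing_chap(task):
    print(step.number, step.description, sorted(step.critical_to))
```

Deciding which criteria a step touches is still a judgment made by people who went out and looked at the work; the code only keeps the list honest.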
Last year, a Delta employee lost his leg when it was crushed by the wheel on a jetway in Knoxville, Tennessee.
I had a little extra time waiting for my flight to Atlanta from Knoxville last Friday so I asked the gate agent about the accident and what had been done to prevent a repeat. She said they were now required to have a spotter to make sure that no one got near the wheels while the jetway was moving (the wheels aren’t visible from the jetway controls).
That’s a Human Action Safeguard.
She also said that no one is allowed to use the stairs or get near the wheels while the jetway is in motion. That was already true when the accident happened but it was re-emphasized to everyone after the accident.
That’s a rule “quasi-Safeguard” that requires human action (compliance) to work.
Thus, a near-fatal accident had two human action related Safeguards that are meant to prevent recurrence of the accident.
Here is a graphic from our root cause analysis training…
Now let’s evaluate the corrective actions used to prevent a possible future fatality using the graphic above…
First, we made a rule that required a spotter during moving of the jetway. This is a human action related Safeguard implemented through a rule. That is the second weakest type of corrective action (#5).
Reemphasizing a rule that previously failed (the second corrective action used) is a training related human performance Safeguard and is the weakest corrective action to prevent recurrence of the accident (#6).
What do you think? If you had a serious accident (lost leg due to crushing) and it had the potential to be fatal, would two weak corrective actions be enough?
Maybe we should start at the top of the hierarchy in the figure above and see what the strongest reasonable Safeguard that we can employ is…
1. REMOVE THE HAZARD
The Hazard in this case is the jetway weight and moving pinch point when the jetway is in motion. This is difficult to remove. (At least I can’t think of a way to do it.)
2. REMOVE THE TARGET
With current aviation operations, people are required to direct the plane while parking, unload baggage, refuel the plane, etc. Perhaps someday this will be done robotically, but for now, removing people from the jetway environment seems unlikely.
3. GUARD THE TARGET
This one is possible. See this photo below from Frankfurt …
They have implemented a guard to keep people away from the wheels.
Is it 100% perfect? No. People can go around the guard (jump over it?).
Is it better than warning people to be careful?
Yes!
So I sent the photo above to the Knoxville airport management. We’ll see if there are changes in the future to implement a stronger Safeguard to the potentially fatal Hazard.
ARE WE DONE?
NO!
This corrective action (if implemented in Knoxville) only fixes one small set of Hazards – jetway pinch-points in Knoxville. This Hazard exists at airports around the world.
For corrective actions to the Generic Root Cause, Delta would need to get airports around the world to guard the Hazard.
Next time you board a plane at your local airport, see what kind of Safeguard is in place. If you don’t see any, send the airport management (you can usually find a “contact us” link at the airport’s web site) a link to this posting.
ONE MORE THING TO LEARN
How do you develop corrective actions? Do you start at the top of the Safeguard hierarchy and work your way down or do you start at the bottom and work your way up?
Your investigators should have their corrective actions evaluated to see how effective they will be. For potentially fatal accidents, I would recommend using the top three strongest on the list and sometimes allow the fourth if somehow the top three aren’t possible.
The bottom two can be allowed in combination with the top four, but I would never allow them to be the only corrective action if a fatality was possible.
Stop taking the easy way out. Learn a lesson from this accident (and the corrective actions). Improve your corrective action process by using the strongest possible corrective actions.
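If you want to build this check into a corrective action review, here is a minimal Python sketch of the rule of thumb above. It is an illustration under assumptions, not a TapRooT® tool: the function name and the `top_three_possible` flag are invented for this example. The ranks follow the numbering used in this post (1 is strongest, 6 is weakest). Only the levels named in the text are labeled; level 4 appears in the graphic and is left unnamed here rather than guessed.

```python
# Ranks follow the post's numbering: 1 is strongest, 6 is weakest.
SAFEGUARD_LEVELS = {
    1: "remove the hazard",
    2: "remove the target",
    3: "guard the target",
    4: "(shown in the graphic, not named in the text)",
    5: "human action safeguard implemented through a rule",
    6: "training / re-emphasizing a rule",
}


def acceptable_for_potential_fatality(ranks, top_three_possible=True):
    """Check a set of corrective actions against the post's recommendation.

    ranks: the hierarchy rank (1-6) of each proposed corrective action.
    top_three_possible: set to False only when none of the top three
        levels is feasible, in which case level 4 is allowed to carry
        the fix. This flag is an assumption of the sketch.

    Levels 5 and 6 may be included, but never as the only actions.
    """
    ranks = list(ranks)
    if not ranks:
        return False
    for r in ranks:
        if r not in SAFEGUARD_LEVELS:
            raise ValueError(f"rank must be 1-6, got {r!r}")
    limit = 3 if top_three_possible else 4
    # At least one action must be at or above the required strength.
    return min(ranks) <= limit


# The actions described after the jetway accident: a spotter rule (#5)
# and re-emphasizing the existing rule (#6).
print(acceptable_for_potential_fatality([5, 6]))     # False
# The same two actions plus a wheel guard (#3).
print(acceptable_for_potential_fatality([3, 5, 6]))  # True
```

The point of the check is the order of thinking: start at level 1 and work down, and treat levels 5 and 6 as supplements, not solutions.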
We are all trained, or learn, by trial and error on how to use equipment or how to use it “properly”. What happens when you get a better “understanding” of how the equipment works? Here are some of the choices that we could make:
1. Ignore the previous training and just get the prize (work done faster, like the chimpanzee)
2. Continue the rules that you learned or were trained to do (at least in front of the bosses like the children).
3. Stop and ask what’s up?
4. Stop using the tool all together and do not tell anyone.
Often the previous training and experience overrides the new operation steps needed … ever been totally frustrated every time someone changes your computer’s Microsoft Windows version? And no, training by itself does not override experience; practice and repetition do!
I had a discussion not too long ago that OSHA forklift training requirements were met when people were retrained after changing forklifts. Unfortunately, the controls worked exactly opposite on the new forklift and the quick review did nothing to override the past knowledge and muscle motor memory.
Just something to think about when you think “Great Human Factors.”
“A TVA spokeswoman told the Chattanooga Times Free Press that the construction ‘stand down’ ordered to start at noon Wednesday was to continue ‘until the errors discovered are clearly communicated to all personnel.’”
Will communicating the “errors” really improve performance?
A TVA spokesperson said:
“TVA had not yet determined if the mistakes were due to carelessness but a ‘root cause analysis’ was being conducted.”
Carelessness as a potential “cause”?
TVA’s top executive, Tom Kilgore, said:
“When workers return to the site on Monday, they will join foremen and supervisors to review an error that occurred in December that had the potential for fatal consequences and that was identified earlier this week at Watts Bar Unit 2. Also to be reviewed is a second incident that occurred this week which could have resulted in a severe injury or worse if it had happened under slightly different circumstances.”
That tool box safety meeting shouldn’t take too long. From the report, they don’t know the root causes yet. All they seem to know is that two mistakes were made. I guess “foremen and supervisors” will just tell employees to “be more careful” and not to make errors. Then everything will be OK.
After that, employees will be willing to cooperate in an open and revealing root cause analysis. Especially when they know that management is looking for those who may have been careless.
We all know that the best way to keep people from being careless is to fire those who are found to be careless. If you fire careless people frequently, everyone will be happy and careful!
Another quote from the article:
“Nuclear Regulatory Commission Region 2 spokesman Roger Hannah said Friday that such work stoppages at nuclear plants are ‘not uncommon’ and probably occur every two or three years. Hannah said they are ‘not exclusive to the nuclear industry.’”
Wonder why they need a stand down every two or three years if they have an effective performance improvement program? I guess people need to be reminded to be more careful every two or three years.
Maybe we should just schedule these stand downs in advance? We could call it human performance preventive maintenance. Every two years we would give people a day off to think about being more careful and “Presto!” … no more human errors.
Or even better! Rate people on their potential for carelessness on a scale from 1 to 10. Then every year fire the worst 10%-20% of the careless employees!
Do these actions sound like the Deming Red Bead Experiment to anyone? If you don’t know what the Red Bead Experiment is, see the following videos…
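For anyone who would rather see the idea in numbers, here is a rough Python simulation of the experiment. It is a sketch under assumptions: the bead counts (800 red, 3,200 white) and the 50-bead paddle are illustrative parameters, and the six workers "A" through "F" are made up. Every worker draws from the same box with the same paddle, so any difference in red bead ("defect") counts comes from the system, not from carelessness.

```python
import random


def red_bead_round(rng, workers, red=800, white=3200, paddle=50):
    """Each worker scoops `paddle` beads from the same box.

    Returns a dict mapping each worker to the number of red beads
    ("defects") drawn. The bead counts are illustrative assumptions.
    """
    beads = ["red"] * red + ["white"] * white
    return {w: rng.sample(beads, paddle).count("red") for w in workers}


def worst_worker(counts):
    """The worker management would blame: the most red beads this round."""
    return max(counts, key=counts.get)


rng = random.Random(0)  # seeded so the run is repeatable
workers = ["A", "B", "C", "D", "E", "F"]  # hypothetical workers

for day in range(1, 5):
    counts = red_bead_round(rng, workers)
    print(f"day {day}: {counts} -> 'most careless': {worst_worker(counts)}")
```

Run it a few times with different seeds and watch who gets labeled "most careless." Firing that person changes nothing about the box of beads.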
Now read these quotes:
NRC’s “…Hannah declined to speculate about any possible penalty for TVA. He said TVA would assess both nuclear safety and workplace safety issues.”
And …
“The problems were discovered in routine TVA inspections and follow heightened NRC scrutiny on other TVA nuclear plants.”
Ahhh… now we are getting to the “root cause” of the stand downs.
It will look like management is doing something.
Management would hate to look like they are doing nothing.
A stand down makes them look like they are doing something.
The more people stand down, the more dramatic the effect.
Thus, a stand down may keep the NRC from descending upon a nuclear utility.
If NRC management starts to believe that TVA has multiple troubled plants with multiple reasons for concern about human performance and human reliability, that could result in a special inspection. A special inspection is bad. When multiple regulators descend upon a nuclear utility, they always find things that need to be improved. If too many areas need improvement, the NRC could order reactors shut down until the “culture” is changed.
An NRC ordered shut down is bad news for the utility. “Changing the culture” can take years, cost millions of dollars, and result in many managers being fired. That’s much worse than the impact of a simple stand down for a few days. Thus, a stand down is a cost-effective way to keep the NRC happy – at least for a while – even if the stand down has no lasting impact on human performance.
Is there a better approach?
How about honest recognition of mistakes big and small? Once the mistake is recognized, management could require a thorough, effective, advanced root cause analysis of any problem that could result in significant impact on plant safety, personnel safety, radiation exposure, environmental performance, or plant performance. Management could then insist upon the development and implementation of effective (SMARTER) corrective actions. Part of those corrective actions could include effective communications about what happened and why it happened (the real root causes) to all employees who are impacted by the issue or the corrective actions.
What if you really want to stop having stand downs (and the incidents that cause management to call for stand downs)?
Management needs to stop being REACTIVE and start being PROACTIVE.
Management needs to shift from reactive root cause analysis to advanced PROACTIVE root cause analysis and stop problems before incidents happen. (We teach how to do this in our 5-Day TapRooT® Course.)
I’d recommend that TVA stop blaming workers (calling them careless) and start finding and fixing the real root causes of problems. Rather than a show stand down for the NRC, use effective advanced root cause analysis – both reactively and proactively – to improve performance and avoid issues that require stand downs every few years.
Show stand downs haven’t resulted in improved performance in the Nuclear Navy or the nuclear power industry (as evidenced by the fact that they are repeated over and over again) and they should not be accepted by the NRC as effective management action. Rather, knee-jerk use of a stand down should be seen as a sign of weak management: management that does not know how to improve human performance.
Avoid this scenario at your facility. Make sure that your management understands how to use advanced root cause analysis both reactively and proactively. Get your advanced root cause analysis program effectively implemented and then continue to improve it every year. And this advice is not just for nuclear utilities. Rather, it applies to every industry where mistakes may cause major accidents – oil, refining, chemical plants, aviation, railroads, shipping, pipelines, pharmaceutical manufacturers, mining, hospitals, …
Where can you learn best practices to continuously improve root cause analysis and human performance? Start at the 2012 Global TapRooT® Summit in Las Vegas on February 29 – March 2. See the schedule for all nine Summit Tracks at:
As an ex-aircraft mechanic and a “sometimes gotta work on my own car” mechanic, I have in the past borrowed or made some of the tools pictured below. The questions remain:
Wrong Tool?
Bad Access by Design?
Mechanic’s Ingenuity?
Or a little bit of them all?
Finally, ever have one of your modified tools bite you back? Share your stories in the comment section.
I wrote this paper for the BARQA Journal and they were nice enough to let me republish it here. Click on the pdf below to see the whole article.
The article is written for people interested in root cause analysis to improve pharmaceutical quality, but the problems discussed are common to all industries and apply to those looking to improve safety, operation, maintenance, process safety, and quality.
Sources of Root Cause Analysis Failures by Mark Paradies is published by:
Quasar (Members Magazine of BARQA, British Association of Research Quality Assurance) No. 118 Pages 7 – 10, Jan 2012.
“Carnival estimated yesterday it would have to pay at least $40 million in insurance deductibles following the wreck. It may also face as much as $95 million in lost voyage earnings this year without the use of Costa Concordia. The company further ‘anticipates other costs to the business that are not possible to determine at this time.’”
ELECTRIC LINE FAILURE FROM CORROSION RESULTS IN INJURY
Country: USA – North America
Type of Activity: Construction, Commissioning, Decommissioning
Type of Injury: Struck by
U.S. Department of the Interior Bureau of Safety and Environmental Enforcement (BSEE) Safety Alert Number: 298
During well temporary abandonment operations, electric line (eline) was used to set a 1,000 pound cast iron bridge plug assembly (the assembly). When the assembly was approximately 6 inches from the deck, the eline parted near the rope socket. As the assembly fell, the Injured Person (IP), who was guiding the assembly to the well bore, was struck on the foot as a result of the IP being within the assembly’s potential fall radius.
What Went Wrong?:
A BSEE investigation revealed the following:
After the incident, the eline operator cut 1,500 feet of the eline off the drum and found the eline to be corroded and brittle with 5 out of the 18 wire rope strands broken.
The approved BSEE Permit to Modify stated the assembly would be run with the workstring, but the assembly was actually run with eline.
The Job Safety Analysis (JSA) was performed 9 hours prior to the job and did not identify all risks associated with the specific lifting operation; e.g., the job required a worker to be within the assembly’s 9 feet potential fall radius but risk assessment of the assembly’s potential fall hazard, and the eline’s condition/capability for making the lift were not addressed.
Findings from a third party lab’s visual examination of the eline indicated corrosion and pitting, with the fractured outer wire strands distorted and bent in a way indicative of shear/overload fracturing due to corrosion. A scanning microscope examination also revealed the fractured surfaces were battered, abraded and corroded, also revealing shear/overload due to corrosion.
The eline operator couldn’t provide any standard operating procedures or long term preventative maintenance records for the eline unit.
Corrective Actions and Recommendations:
Therefore, the BSEE recommends:
1. Eline/wireline operators develop and maintain standard operating procedures and records for the eline/wireline units to include preventative maintenance protocol, visual inspection of the wire rope associated with these units and wire rope change-out records (similar to crane wire rope protocols). Lessees should request a copy of these eline/wireline procedures and records.
2. Lessees and its contractors review BSEE Safety Alert No. 282 that discusses the need for workers to understand it is not the JSA Form alone that will keep them safe on the job but rather the process the JSA represents. It is of little value to identify hazards and devise proper controls if the controls are not put in place.
Source Contact:
Glynn T. Breaux
+1 (504) 736-2560
This alert is being distributed via a partnership between the International Association of Oil and Gas Producers (http://www.ogp.org.uk/) and the U.S. Department of the Interior Bureau of Safety and Environmental Enforcement (BSEE) (http://bsee.gov/).
Whilst every effort has been made to ensure the accuracy of the information contained in this publication, neither the OGP nor any of its members past present or future warrants its accuracy or will, regardless of its or their negligence, assume liability for any foreseeable or unforeseeable use made thereof, which liability is hereby excluded. Consequently, such use is at the recipient’s own risk on the basis that any use by the recipient constitutes agreement to the terms of this disclaimer. The recipient is obliged to inform any subsequent recipient of such terms.
This document may provide guidance supplemental to the requirements of local legislation. Nothing herein, however, is intended to replace, amend, supersede or otherwise depart from such requirements. In the event of any conflict or contradiction between the provisions of this document and local legislation, applicable laws shall prevail.
The article provides a basic description of the accident from an NTSB investigator:
“A train had stopped on the tracks about 1:18 p.m. Friday around County Road 600 North and County Road 500 East to allow another train to run around it. A second train came up behind the stopped train and struck it from behind. The third train, which was supposed to go around the first train, struck the wreckage from the first two trains.”
Knowing that policies guide what “how to’s” and “do what’s” need to be created, trained and used, why do they have to be so convoluted and difficult to read? Not to pick on lawyers, but have you ever tried to understand a legal document? Aren’t legal documents supposed to keep you out of trouble and not get you in trouble?
Interestingly enough, we even pass policies on policies, as described in this article.
“On October 13, 2010, President Obama signed into law the “United States Plain Writing Act of 2010.” Thirteen years after President Clinton issued his own “Plain Writing in Government” memorandum, the revised set of guidelines states that by July of this year all government agencies must simplify the often perplexing bureaucratic jargon used in documents produced for the American public. Gone are the grammatically longwinded sentences, replaced with simpler English words, grammar and syntax”
Take this excerpt from a policy; what missing best practices can you identify from the TapRooT® Root Cause Tree?
“The amount of expenses reimbursed to a claimant under this subpart shall be reduced by any amount that the claimant receives from a collateral source in connection with the same act of international terrorism. In cases in which a claimant receives reimbursement under this subpart for expenses that also will or may be reimbursed from another source, the claimant shall subrogate the United States to the claim for payment from the collateral source up to the amount for which the claimant was reimbursed under this subpart.”
Using the Basic Cause Category “Procedures,” I look forward to your missing best practices in the comments section.
Last July, a train crash in China killed 40 people. According to CNN, the Chinese government has decided to punish 54 people for their roles in the accident. The story quotes the state-run Xinhua news agency as saying:
“According to a final investigation report, the train crash was caused by major design flaws in train operating equipment, relaxed safety controls and poor emergency response to equipment failure.”
The story also said that the probe:
“…exposed that the Ministry of Railway and the Shanghai Railway Bureau had failed to act properly after the accident and were unable to disclose relevant information on issues of social concern, leaving a negative social influence,”
So who lost their jobs or were disciplined? They include:
Liu Zhijun, the country’s former railway minister
Zhang Shuguang, the railway ministry’s deputy chief engineer
Xu Xiaoming, Guangzhou Railway Group Chairman
Miao Weizhong, China Railway Signal & Communication (CRSC) Deputy General Manager
Zhang Haifeng, Railway Signal Design Institute Chairman
No decision has been made about criminal charges.
Now for my question…
If firing people improves safety, shouldn’t China have one of the best safety records in the world? It seems that every accident in China is followed by firings, discipline, and criminal prosecutions. But this doesn’t seem to make performance better.
The NTSB has decided that texting, e-mailing, or even chatting on a cell phone is too dangerous to be allowed.
Therefore, they are recommending that states pass laws that prohibit all use of electronic devices except those that aid the driver (like a GPS).
NTSB chairman Deborah Hersman said, “No email, no text, no update, no call is worth a human life.”
In one article, Jonathan Adkins, a spokesman for the Governors Highway Safety Association, said the recommendation was a “game changer” but said that, “States aren’t ready to support a total ban yet, but this may start the discussion.”
What do you think? Is a total ban (including all cell phone use) the right answer?
What about other activities that cause distractions?
Sometimes I wonder about the number of things that are now illegal.
Let me know your thoughts by leaving a comment here.
Dr. Gary Becker, an award-winning economist, spoke at a recent meeting that I was lucky enough to attend. One of the things he mentioned (and is famous for developing) is the idea of “Human Capital.”
Many economists calculate ways to optimize capital. Usually, this capital is money spent for plant facilities (hardware).
But Dr. Becker emphasizes a different type of capital – people or Human Capital.
He asks what companies are doing to:
• Improve human capital?
• Optimize human capital?
• Focus human capital?
• Keep human capital?
• Use human capital to its best advantage?
It seems that many companies forgot about the important investment they had in human capital. They downsized with a vengeance out of fear of the “Great Recession.” Now they find they have a shortage of qualified workers. They can’t find the skilled people they need and they find that the skilled workers they have aren’t loyal after watching coworkers get cut loose in bad economic times.
But have you heard the old saying:
“Better late than never.”
Even companies that didn’t think about the value of human capital during the recession should start thinking about it now.
Ask yourself…
“What are we doing to maximize human capital?”
Here are two action items to add to the list of things your company should be doing:
1. Get people trained to solve problems using TapRooT®.
2. Improve your TapRooT® Investigators’ skills by sending them to a pre-Summit course and the TapRooT® Summit.
The TapRooT® Training and the Summit will produce an amazing return on your human capital investment. The skills learned will be immediately useful for improving performance.
For more info about TapRooT® Training and the 2012 Global TapRooT® Summit, see the TapRooT® web site:
The picture above is from an airport jet bridge in Frankfurt, Germany.
If you look at the ground level you can just make out the wheels that carry a very heavy load.
You might also notice that they have a guard to keep people away.
Why did I notice this?
Because last year at the Knoxville airport a Delta employee was run over by these wheels. It totally crushed one leg.
There was no guard when the accident happened. Instead, Delta had a policy that all employees should be clear before the jet bridge was moved and stay clear while in motion.
Obviously, this administrative control (SPAC in TapRooT® lingo) failed (SPAC Not Used).
However, a physical guard might be a better safeguard than an administrative control.
Next time I get a chance I will have to see if the corrective action from the Knoxville accident was to add guards on the Knoxville jet bridges.
“The Nov. 18 incident at Toronto’s Pearson International Airport occurred after pilots of the commuter flight, an Embraer 145 arriving from Chicago, failed to follow an air-traffic controller’s instructions to stop short of an active runway.”
“Despite telling the airport tower they understood the instructions, according to people familiar with the details, the American Eagle pilots started to taxi the twin-engine jet across the strip at the same time an Air Canada Airbus A319 was cleared for takeoff and began accelerating toward them.”
“A controller barked “stop, stop, stop,” but the commuter jet still continued rolling for a few seconds, according to a preliminary review of traffic-control communications. The Air Canada plane, bound for Halifax, managed to become airborne before reaching the intersection. One eyewitness reportedly estimated that the two aircraft may have been separated by less than 50 feet, these people said.”
WXII Channel 12 reports about a man who has been involved in four serious accidents.
- a truck wreck where the truck burned up
- a fall from a cherry picker that left him hospitalized for six months
- a truck wreck where he was a passenger in the vehicle
- a truck wreck where a 65,000 pound block of concrete slid off a semi onto his pickup
“The distracted tugboat pilot who crashed a barge into a sightseeing “duck boat,” killing two tourists, was sentenced Tuesday to a year and a day in prison for his role in the incident, federal prosecutors said.”
The story also said:
“Devlin admitted that he was distracted by his cell phone and laptop for an extended period of time before the collision, that he piloted the Caribbean Sea from its lower wheelhouse where he had significantly reduced visibility and that he did not maintain a proper lookout or comply with other essential rules of seamanship, according to federal prosecutors.”
But there is more …
“The morning of the accident, on July 7, 2010, Devlin’s 6-year-old son was undergoing routine eye surgery when he experienced complications including a laryngospasm — which led to partial oxygen deprivation for eight minutes. Devlin’s wife said she panicked and called her husband, who was at the controls of the tug at the time, according to KYW.”
And even more …
“The sightseeing “duck boat” was anchored in the shipping channel after being shut down because the boat’s operator saw smoke and feared an onboard fire.”
Multiple causal factors and probably multiple opportunities to avoid this fatal accident.
Now the question:
Will Prison Time Keep Future Accidents From Happening?
The story ended with:
“Lawyers who represented the families of the two victims released a statement in July saying the families ‘are gratified that federal prosecutors have acted to hold one of the responsible parties accountable in this tragedy that should have been avoided.’”
Should we seek prison time for those involved in accidents?
Two people who have faced criminal prosecution will discuss their personal experience in the “Criminal Prosecution of Accidents” session in the Leading Performance Improvement Track. To see the track schedule, click on the button for that track at:
I attended the Milken Conference held in LA. Gary Becker, Nobel Prize and Presidential Medal of Freedom winner, explained the theory of human behavior and rewards.
Once again, the material we teach in TapRooT® Courses was confirmed through a different science – economics.
His economic theory is that people act because of the rewards built into the system.
So, if your boss with an MBA starts blaming folks after an incident – especially if rules were broken and the “enforcement” system isn’t working as intended – tell him/her to look into the research of renowned economist Gary Becker.
People are rational … The rewards system is broken.
TapRooT®’s Corrective Action Helper® can help you fix it.
For more information about TapRooT® Training, see:
Also, see this link for information about the Fatigue Risk Management Course that Circadian Technologies is providing prior to the 2012 Global TapRooT® Summit being held in Las Vegas:
You get the call that there has been an incident that needs to be investigated. So, you begin mapping out the SnapCharT®, performing the root cause analysis or developing the corrective actions and this happens (Watch Video):
Never fails, too many Type “A” personalities in the room, and you are the one who has to facilitate the team. It does not matter whether you have a Type “A” or “B” personality, it can get ugly if it is not handled correctly, especially if someone was hurt (or worse) or if the company lost a lot of money. So what to do …
Here are a few facilitation hints:
1. Define who the team lead is upfront. (This prevents an Accountability NI issue.)
Note that the investigation facilitator does not have to be the one who is in charge. After all, the facilitator’s true role is to facilitate the TapRooT® 7 Step Root Cause Analysis Process, not necessarily the team members themselves. It can also help if the facilitator is a neutral person not familiar with the incident or process being investigated.
2. Allow all members to introduce themselves … often new people are introduced into an established team. The introduction gives a person, new or shy, the platform to speak up later.
3. While developing the SnapCharT® (or time line, for friends new to our process), ensure that all the people, equipment, and process actions that occurred are listed, whether people think they are an issue related to the incident or not. You can make a movie with a good time line of events.
Note that this enables the good actions of all members, divisions, contractors, clients and owners to be listed as well and removes some of the blame and finger pointing that can occur.
4. While using the Root Cause Tree Dictionary, Root Cause Tree and SnapCharT® to find Root Causes for your Causal Factors, it is never an “I am right” or “You are wrong” discussion. Unknown to untrained TapRooT® team members, the facilitator has carried in the “Arbitrator”!
Great, another “A” type in the group you say? Well, yes and no, the “Arbitrator” is the Root Cause Tree Dictionary.
The Root Cause Tree has lots of experience and knowledge to gently nudge any group into the right choice. It comes with some explicit rules … facts, facts, facts! You select a root cause because it relates to or impacted a particular Causal Factor. A Root Cause is not selected because you have already decided on what you want the corrective action to be. It is also not ignored because you think you cannot change it. Root causes are just the facts.
Here is an example of how the Root Cause Tree Dictionary arbitrates and removes the emotion for the Causal Factor of “Operator opened the Fuel Supply Valve with a Contaminated Fuel Supply.” This is just one of the Causal Factors for the Incident of a motor being damaged with lots of downtime costs.
Two team members are in a heated discussion as to whether the Operator could detect or could not detect the contamination while opening the valve …
One team member who believes that the Operator had the knowledge of the contamination in the line is focused on what was seen after the fuel supply system was opened up.
The other team member believes that the Operator could not see inside the system while opening the valve.
You (as the facilitator) walk up to the arguing pair and, without telling either member who may be right or may be wrong, you say, “Open up the Root Cause Tree Dictionary and tell me which fact (condition on the SnapCharT®) matches the bullet in the Root Cause Tree Dictionary.” Now state the fact and say, “this relates to why the Operator opened the Fuel Supply Valve with a Contaminated Fuel Supply.”
Focusing on the facts as known by the operator at the time he was opening the valve shows that the contamination was unknown and not detectable. The contamination was identified after the fact and only after taking apart the manifolds and valve.
The “Arbitrator” saves the day again with emotions and opinions removed!
Try these steps and also let me know in the comment section what else you have done to reduce bias and emotions during your investigation facilitation.
Circadian Technologies has published a white paper titled:
The Advantages & Disadvantages of 12-Hour Shifts:
A Balanced Perspective
Their press release states:
“12-hour shifts remain a much-debated topic in 24-hour operations. Do they cost more than 8-hour shifts? Are they safe? What impact do they have on alertness, health and productivity?
CIRCADIAN®, the global leader in providing 24/7 workforce performance and safety solutions for businesses that operate around the clock, has collected considerable data on the benefits and complications of 12-hour shifts. The goal of this white paper is to provide you with a balanced perspective of 12-hour shifts – one that will examine the pros and cons from both a management and shiftworker perspective.”
Here’s the link to register to receive this report:
being held on February 27-28, 2012, in Las Vegas just prior to the TapRooT® Summit.
This training, being provided by Circadian Technologies, will help you:
• Design and implement a cost-effective Fatigue-Risk Management System
• Assess the risks and costs of fatigue in your business
• Determine safe staffing levels and optimal shift/duty patterns for your operation
• Train employees and supervisors to mitigate fatigue risk
• Improve employee health, safety, and quality of life
Also, the Summit (February 29 – March 2 in Las Vegas) has two sessions on fatigue as well as other sessions on improving human performance. Don’t miss it!
Just click on the course link above to get more information and to register.
Also, Bill Sirois, COO at Circadian Technologies, will be providing two talks about fatigue and fatigue risk management at the TapRooT® Summit. For complete Summit info, see:
I remember talking to a TapRooT® User about the root causes of bicycle accidents at his refinery. Bicycle accidents were the leading type of accident to cause injuries (imagine that with all the other hazards in a refinery).
A study of 244,388 death certificates issued from 1979 to 2006 conducted by a doctor at the University of California, San Diego, and published in the Journal of General Internal Medicine showed that fatal medication errors increased by 10 percent in July in counties with teaching hospitals.
Why might this be?
Because many new residents (interns) arrive from universities in July.
New interns don’t have experience, don’t know who does what, are learning what it is to work in a hospital, and, especially in the past, may work really long hours.
Circadian Technologies will be presenting a pre-Summit course on Fatigue Risk Management and two breakout sessions in different tracks at the Global TapRooT® Summit. But before you can attend these sessions, I thought that readers might want to learn just a little about the knowledge available from Circadian Technologies. Therefore, I’ve posted this link to a recent article they published titled:
I’m sure that it isn’t really final. (Aren’t we always learning and improving?) But I guess that they are saying they are done with the initial development and will now be starting to look at nuclear utilities to see that they have a positive safety culture.
What is a safety culture? The NRC says it is:
“… an organization’s collective commitment, by leaders and individuals, to emphasize safety as an overriding priority to competing goals and other considerations to ensure protection of people and the environment.”
They go on to define nine traits that nuclear plants should foster. They are:
1. Leadership Safety Values and Actions – Leaders demonstrate a commitment to safety in their decisions and behaviors;
2. Problem Identification and Resolution – Issues potentially impacting safety are promptly identified, fully evaluated, and promptly addressed and corrected commensurate with their significance;
3. Personal Accountability – All individuals take personal responsibility for safety;
4. Work Processes – The process of planning and controlling work activities is implemented so that safety is maintained;
5. Continuous Learning – Opportunities to learn about ways to ensure safety are sought out and implemented;
6. Environment for Raising Concerns – A safety-conscious work environment is maintained where personnel feel free to raise safety concerns without fear of retaliation, intimidation, harassment, or discrimination;
7. Effective Safety Communication – Communications maintain a focus on safety;
8. Respectful Work Environment – Trust and respect permeate the organization;
9. Questioning Attitude – Individuals avoid complacency and continuously challenge existing conditions and activities in order to identify discrepancies that might result in error or inappropriate action.
As I read more, I started thinking … As a utility, how do you measure your culture, diagnose opportunities for improvement, and demonstrate efforts to improve the safety culture at your plant?
Fortunately, we’ve been interested in safety culture for a long time and TapRooT® is able to help you diagnose safety culture issues.
But where to start? Safety culture is a tough subject. Many find it hard to be specific about culture issues. Even TapRooT® users sometimes don’t understand how TapRooT® can help them understand their safety culture and improve it.
So, we decided to build a course to help TapRooT® Users understand safety culture issues and then fix them.
This new course was built on a solid foundation of culture research and the bedrock of the TapRooT® System.
The FIRST public TapRooT® Analyzing and Fixing Safety Culture Issues course will be held in Las Vegas on February 27-28, 2012. We will be putting out more information in future updates here on the Root Cause Analysis Blog, but, if you are interested in safety culture, you should plan to attend the course.
But the course isn’t all we are doing.
The TapRooT® Summit is also a great way for your company to demonstrate a commitment to improving your safety culture.
First, since problem identification and resolution is a major part of a positive safety culture, the Summit helps Summit participants by keeping them up-to-date on the latest incident investigation, troubleshooting, and root cause analysis technology. That way they can go back to their companies and make sure that issues really are fully evaluated and promptly addressed.
Second, another major trait that nuclear plants are supposed to foster is “continuous learning,” and continuous learning is what the Global TapRooT® Summit is all about: learning best practices and new techniques from industry leaders from around the world. What better way to demonstrate your company’s commitment to continuous learning than to send a team to the Summit and have them return to work with a custom plan to continuously improve performance?
Finally, the first trait mentioned by the NRC is that “Leaders demonstrate a commitment to safety in their decisions and behaviors.” But how can you demonstrate a leader’s commitment? Participating in the Summit and making sure that your nuclear sites are well represented is an excellent way to demonstrate commitment. This is especially true because there is a best practice track for managers, the “Leading Performance Improvement” track.
The Leading Performance Improvement track includes these breakout sessions:
TapRooT® Implementation Success Stories (Leaders hear how others improved performance.)
What is Culture and How Do You Identify and Solve Culture Problems Using TapRooT® (Highlights from the 2-day course plus communication ideas to have effective safety communications.)
What Does Management Need to Know About Process Safety Improvement (Mark Paradies shares management lessons from Admiral Rickover and how missing elements of process safety management have contributed to major accidents.)
Deepwater Horizon: A Dramatic Portrayal (A dramatic presentation that can capture management’s attention and help them see the management roots of a major accident.)
Criminal Prosecution of Accidents (What happens to those involved when accidents become crimes. Two reports from people who experienced post-accident criminal investigations.)
Investigation & Root Cause Analysis Insights (Insights into root cause analysis from two perspectives – Mark Paradies, creator of TapRooT®, and a government regulator.)
Designing Your Continuous Improvement Program (Kevin McManus – the Systems Guy – shares practical lessons he has learned from industrial experience and his experience as a Malcolm Baldrige Award Senior Examiner.)
How Pfizer Achieves Operational Excellence (Hear Pfizer’s operational excellence story.)
In addition to these breakout sessions, all Summit participants hear from the excellent Keynote Speakers. Two of particular interest to senior management are:
Ken Mattingly speaking about “Lessons Learned from Apollo 13 and Space Shuttle Operations.”
Dr. Beverly Chiodo speaking about “Character Driven Success.”
So start planning for your management and all your improvement team members to be at the Global TapRooT® Summit in Las Vegas from February 27 – March 2, 2012, attending a special 2-day course and the 3-day Summit to help you achieve – and demonstrate – a positive safety culture.
If you have ever sat in a TapRooT® Root Cause Analysis Course or Summit, you know that the transfer of knowledge and support from our instructors does not stop when the session ends. To help guide the next steps of continuous improvement, Mark Paradies and Linda Unger added Appendix C in our TapRooT® book, TapRooT®, Changing the Way the World Solves Problems. The tip today comes from “Topic 3: Knowledge” on page 461.
To ensure that TapRooT® Training is not just a one-time event, we provide and suggest different knowledge opportunities:
The key concept to using and understanding knowledge is to identify the who, what, how and when as it relates to training. In our 5-Day TapRooT® Advanced Root Cause Analysis Team Leader Training, key investigation facilitators are introduced to the ADDIE process (Analyze, Design, Develop, Implement, Evaluate). The only way to Analyze and Design is to go out and look at the tasks that people need to perform in order to be efficient. With that in mind, let’s start with the following people:
1. Investigators
2. Certified Instructors
3. Managers
4. Improvement Program Leader (Owner/Champion)
5. Coaches/Mentors/Facilitators
6. Hands on Employees/Operators
7. Top Manager (Sponsor)
Start by identifying their core tasks and the skills required to perform those tasks. You may find cross-over of tasks, which is not a problem. Actually, it gives you more resources to share in times of need.
Once you identify the tasks and possible skills, assess the level of knowledge needed. Here is a template from my U.S. Air Force training Matrix in our CFETP:
Task Performance Levels
1. Can do simple parts of the task. Needs to be told or shown how to do most of the task. (Extremely Limited)
2. Can do most parts of the task. Needs only help on hardest parts. (Partially Proficient)
3. Can do all parts of the task. Needs only a spot check of completed work. (Competent)
4. Can do the complete task quickly and accurately. Can tell or show others how to do the task. (Highly Proficient)
Task Knowledge Levels
a. Can name parts, tools, and simple facts about the task. (Nomenclature)
b. Can determine step-by-step procedures for doing the task. (Procedures)
c. Can identify why and when the task must be done and why each step is needed. (Operating Principles)
d. Can predict, isolate, and resolve problems about the task. (Advanced Theory)
Subject Knowledge Levels
A. Can identify basic facts and terms about the subject. (Facts)
B. Can identify relationship of basic facts and state general principles about the subject. (Principles)
C. Can analyze facts and principles and draw conclusions about the subject. (Analysis)
D. Can evaluate conditions and make proper decisions about the subject. (Evaluation)
Having identified the who, what and how, we then need to figure out where your TapRooT® Root Cause students will reach the performance levels needed to reduce or prevent problems (Incidents).
The biggest key here is that you will need to assess the skills of each team member listed above. Here is where it starts:
1. Good Root Cause Analysis starts with a robust and usable method taught by knowledgeable facilitators; do this by sending them to the appropriate course. We teach and then give hands-on exercises; we follow up by working one on one with students as needed.
2. Develop in-house mentors/facilitators and assign those mentors as needed to help newly trained individuals. Some even get certified to teach in-house.
3. Look for systemic issues and identify additional knowledge and performance gaps. Decide who in the list above may need to attend one of the pre-Summit or Summit Activities.
4. Develop in-house group sessions to discuss lessons learned.
5. Schedule refresher training to keep competency levels high.
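If you want to track the skills assessment described above in something more structured than a notebook, here is a minimal Python sketch of one way to do it. It is only an illustration under stated assumptions: the scale labels come from the template above, but the class names (Requirement, Assessment), the find_gaps function, the example people, and the required levels are all hypothetical and are not part of TapRooT® or the CFETP.

```python
from dataclasses import dataclass

# Scale labels copied from the template above (lowest to highest).
TASK_PERFORMANCE = {1: "Extremely Limited", 2: "Partially Proficient",
                    3: "Competent", 4: "Highly Proficient"}
TASK_KNOWLEDGE = {"a": "Nomenclature", "b": "Procedures",
                  "c": "Operating Principles", "d": "Advanced Theory"}


@dataclass(frozen=True)
class Requirement:
    """Level a role needs for a task (hypothetical structure)."""
    role: str
    task: str
    performance: int  # 1-4, see TASK_PERFORMANCE
    knowledge: str    # "a"-"d", see TASK_KNOWLEDGE


@dataclass(frozen=True)
class Assessment:
    """Level a person was assessed at for a task (hypothetical structure)."""
    person: str
    role: str
    task: str
    performance: int
    knowledge: str


def find_gaps(requirements, assessments):
    """Return (person, task, performance shortfall, knowledge shortfall)
    for every assessment that falls below the requirement for its role."""
    required = {(r.role, r.task): r for r in requirements}
    order = list(TASK_KNOWLEDGE)  # ["a", "b", "c", "d"]
    gaps = []
    for a in assessments:
        req = required.get((a.role, a.task))
        if req is None:
            continue  # no requirement defined for this role and task
        perf_short = max(0, req.performance - a.performance)
        know_short = max(0, order.index(req.knowledge) - order.index(a.knowledge))
        if perf_short or know_short:
            gaps.append((a.person, a.task, perf_short, know_short))
    return gaps


# Made-up example data, for illustration only.
requirements = [
    Requirement("Investigator", "Build a SnapCharT", 3, "c"),
    Requirement("Coach", "Build a SnapCharT", 4, "d"),
]
assessments = [
    Assessment("Pat", "Investigator", "Build a SnapCharT", 2, "c"),
    Assessment("Lee", "Coach", "Build a SnapCharT", 4, "d"),
]

gaps = find_gaps(requirements, assessments)
for person, task, perf_short, know_short in gaps:
    print(f"{person}: '{task}' is short {perf_short} performance level(s) "
          f"and {know_short} knowledge level(s)")
```

Each gap in the output is a candidate for one of the steps in the list above: a course, an assigned mentor, or refresher training.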
Lots of good points about the complexity of automation and computer assisted flight.
What I thought was “shocking” was that these pilots only had three minutes to diagnose a pretty complex problem that was sprung on them in the middle of the night with zero warning. Needless to say … things didn’t work out so well. And they called it pilot error…
Understanding human errors is one of the toughest jobs for accident investigators … and one of the reasons why TapRooT® is used around the world.
“The pilots of an Air France jet that crashed into the Atlantic Ocean two years ago apparently became distracted with faulty airspeed indicators and failed to properly deal with other vital systems, including adjusting engine thrust, according to people familiar with preliminary findings from the plane’s recorders.”
There are more details in the article but I started thinking … WHY did the pilots respond inappropriately?
I naturally started thinking through the 15 Questions for troubleshooting human error on the front side of the Root Cause Tree®.
Were the pilots fatigued? Did that cause them to react slowly? (This was an overnight flight.)
What about the automation and displays in the cockpit? I’ve heard that people can easily get confused in Airbus cockpits and easily lose situational awareness.
What about the pilots’ training? Usually international pilots are some of the most senior but it is worth a look.
The article said:
“Slated to be disclosed by investigators on Friday, the sequence of events captured on the recorders is expected to highlight that the jet slowed dangerously shortly after the autopilot disconnected. The pilots almost immediately faced the beginning of what became a series of automation failures or disconnects related to problems with the plane’s airspeed sensors, these people said.”
and
“The crew methodically tried to respond to the warnings, according to people familiar with the probe, but apparently had difficulty sorting out the warning messages, chimes and other cues while also keeping close track of essential displays showing engine power and aircraft trajectory.”
Makes me think that human-machine design is much more the cause than simple “human error.”
The article also says:
“The Air France pilots were never trained to handle precisely such an emergency, …”
Ahhh … training may be an issue.
And what about crew teamwork?
What about previous incidents?
The WSJ also reports:
“The previous interim report indicated that in late March 2009, less than three months before the crash, European aviation regulators decided that the string of pitot-icing problems on widebody Airbus models wasn’t serious enough to require mandatory replacement of pitot tubes.”
The Root Cause Tree® really does help analyze these types of problems.
To get your training to use the Root Cause Tree®, attend one of the public TapRooT® Courses:
Just checking to see if TapRooT® Users got their Root Cause Network™ Newsletter today. I think you will find the Six Common Culture Problems story on page one both interesting and helpful when assessing culture issues.
Here’s a copy for download if you didn’t get yours by e-mail:
Here’s a CNN story about the changes made to the Canadian Air Traffic Control System since 1996 …
Would it start to address the problems we are experiencing in the US? A good root cause analysis might be the first step to answering that question (rather than just firing controllers and supervisors).