TapRooT® Keeps It Simple Despite Researchers Making Things Complex
One of our TapRooT® Instructors sent me a white paper written by three research academics about resilience and the human as a flexible part of the system. The researchers discovered that many root cause systems only look at failures rather than failures and successes. They called their “new” discovery and way of looking at performance “Safety II” and implied that it is clearly superior to “Safety I.”
The authors lay the blame for accidents in complex systems on poor thinking. They call this Safety I. This poor thinking includes poor accident models (their examples: the Domino Theory and Reason’s Swiss Cheese Model) and linear cause-and-effect thinking in root cause tools (tools they cite as defective include TRIPOD, AcciMap, and STAMP).
TapRooT® Root Cause Analysis doesn’t fall into the “Safety I” trap because we recognized resilience (but we never called it that) and humans as an active part of a successful system. When did we recognize this? Since TapRooT® was first developed in the late 1980s. And we have suggested that people use root cause analysis proactively to analyze success and failure ever since the early 1990s. Therefore, we are closer to “Safety II” than “Safety I” even though we appreciate what is good about both views of the world. Maybe we should say that we used the best parts of “Safety II” since before it was invented.
You might ask, why aren’t we proclaiming the value of resilience and how TapRooT® can handle complex accidents? Why don’t we jump aboard the “Safety II” bandwagon?
First, we don’t want to make root cause analysis seem any more complex than it already is. We build human factors knowledge and resilience theory into the TapRooT® System without making the system seem more complex. We use understandable English and try not to invent too many new terms. TapRooT® has always been built to handle complex systems and accidents, and we think people understand that. But we also want people to be able to quickly use TapRooT® Root Cause Analysis to analyze simple problems (which is why we built the “low-to-medium risk” investigation process and the 2-Day TapRooT® Course to teach the techniques).
If possible, we prefer simple systems. We hope that engineers design simple, linear systems that aren’t complex and highly connected. We recognized the problems with complex systems as far back as 1984, when Charles Perrow wrote the book Normal Accidents. We incorporated the work of Jens Rasmussen into the Root Cause Tree®, included a discussion of his decision-making model in our 5-Day TapRooT® Course, and used it to teach the advantages of simplicity.
After all, you don’t want a needlessly complex system that requires human heroics just to make it succeed. And that is where I think “Safety II” goes astray.
Second, we like simple models when they work. They are easy to explain. (They don’t require a PhD to understand them.)
Third, we disagree with some of the precepts of “Safety II.”
For example, the researchers think that needlessly complex systems can be run successfully by well-trained, flexible humans. They believe operator heroics can make up for a needlessly complex design and poor planning. They see the “work-arounds” invented by operators as a success that needs to be understood rather than a failure of planning that required the operators to “fly by the seat of their pants” because engineers and management produced an overly complex system without well-designed procedures and human interfaces.
We believe that in highly complex systems, we need to apply our human factors skills to simplify and decouple the system to make it more reliable. Just the opposite is happening in many industries (healthcare is a good example of the complexity problem).
So, instead of simplifying needlessly complex systems and improving reliability, the “Safety II” folks think that we need to try to understand how people (in the healthcare example in their article: doctors, nurses, and others) muddle through and get satisfactory results most of the time. Why? Because they think the problems faced are “intractable” (unsolvable). We think that, instead, we should be simplifying the system, reducing complexity, and applying best practices to improve reliability and, thus, reduce the “intractability” of the system. We need to make the human’s job more straightforward (no heroics required just to get through his or her day).
If you accept the theory of “Safety II” that:
- You can’t really understand and plan work in complex systems.
- Human variability is not an error but a normal part of the process that also helps it succeed.
- You can’t tell when something is working correctly or not (things are not “bi-modal”).
- Accidents (adverse outcomes) aren’t solely the result of failures, but rather a combination of failures and normal performance variability.
Then you will find yourself mired in a complex system that only the most learned can comprehend.
If this whole discussion seems difficult to understand, that is OK. Highly complex systems are difficult to understand. That is why high reliability organizations try to reduce complexity and the interconnectedness of their systems to make outcomes reliable. Keep It Simple, Stupid! (KISS)
The solution to system complexity is not the complexity of “Safety II” but rather decoupling, simplifying, and making work understandable (not intractable).
So what do we recommend?
- Don’t make things more complex than necessary. SIMPLIFY whenever possible! KISS.
- You need to understand failures and successes.
- Few problems are due to intractable systems. In those few cases, the systems need to be fixed.
- Work isn’t as hard to understand as one might think – especially for those involved in producing the outcomes.
- Of course, investigate significant events, but also, be proactive! Investigate success and precursor incidents.
There is considerable overlap between the five ideas listed above and “Safety II,” but there are also significant differences. The main difference is that we don’t have to accept the complexity, intractability, and interconnectedness of unreliable systems and hope that humans learn to cope with them. We CAN and should change the system!