Author Archives: Ken Reed

“People are SO Stupid”: Horrible Comments on LinkedIn

Posted: May 23rd, 2018 in Accidents, Human Performance, Root Cause Analysis Tips

How many people have seen those videos on LinkedIn and Facebook that show people doing really dumb things at work? It seems recently LinkedIn is just full of those types of videos. I’m sure it has something to do with their search algorithms that target those types of safety posts toward me. Still, there are a lot of them.

The videos themselves don’t bother me. They are showing real people doing unsafe things or accidents, which are happening every day in real life. What REALLY bothers me are the comments that people post under each video. Again concentrating on LinkedIn, people are commenting on how dumb people are, or how they wouldn’t put up with that, or “stupid is as stupid does!”

Here are a couple of examples I pulled up in about 5 minutes of scrolling through my LinkedIn feed:

[Images: LinkedIn video posts and the comments made on them]

These comments often fall into a few categories.  Let’s take a look at them as groups:

“Those people are not following safety guideline xxxx.  I blame operator “A” for  this issue!”

Obviously, someone is not following a good practice.  If they were, we wouldn’t have had the issue, right?  It isn’t particularly helpful to just point out the obvious problem.  We should be asking ourselves, “Why did this person decide that it was OK to do this?”  Humans perform split-second risk assessments all the time, in every task they perform.  What we need to understand is the basis of a person’s risk assessment.  Just pointing out that they performed a poor assessment is too easy.  Getting to the root cause is much more important and useful when developing corrective actions.

“Operators were not paying attention / being careful.”

No kidding.  Humans are NEVER careful for extended periods of time.  People are only careful when reminded, until they’re not.  Watch your partner drive the car.  They are careful much of the time, and then they need to change the radio station, or the cell phone buzzes, etc.

Instead of just noting that people in the video are not being careful, we should note what safeguards were in place (or should have been in place) to account for the human not paying attention.  We should ask what else we could have done in order to help the human do a better job.  Finding the answers to these questions is much more helpful than just blaming the person.

These videos are showing up more and more frequently, and the comments on the videos are showing how easy it is to just blame people instead of doing a human performance-based root cause analysis of the issue.  In almost all cases, we don’t even have enough information in the video to make a sound analysis.  I challenge you to watch these videos and avoid blaming the individual, making the following assumptions:

  1.  The people in the video are not trying to get hurt / break the equipment / make a mistake
  2.  They are NOT stupid.  They are human.
  3.  There are systems that we could put in place that make it harder for the human to make a mistake (or at least make it easier to do it right).

When viewing these videos in this light, it is much more likely that we can learn something constructive from these mistakes, instead of just assigning blame.

“It was such a simple mistake!”

Posted: May 14th, 2018 in Investigations, Root Causes, TapRooT

When you have a major incident (fire, environmental release, etc.), your investigation will most likely identify several causal factors (CFs) that, had they not occurred, would probably have prevented the incident.  They are often relatively straightforward, and TapRooT® does a great job identifying those CFs and subsequent root causes.

Sometimes, the simplest problems can be the most frustrating to analyze and fix.  We think to ourselves, “How could the employee have made such a simple mistake?  He just needs to be more careful!”  Luckily, TapRooT® can help even with these “simple” mistakes.

Let’s look at an example.  Let’s say you are out on a ship at sea.  The vessel takes a bit of a roll, and a door goes shut on one of your employees.  His finger is caught in the door as it shuts, causing an injury.  Simple problem, right?  Maybe the employee should just be more aware of where he is putting his hands!  We will probably need more effective fixes if we really want to prevent this in the future.

How can we use TapRooT® to figure this out?  First of all, it is important to fully document the accident using a SnapCharT®.  Don’t skip this just because you think that the problem is simple.  The SnapCharT® forces you to ask good questions and makes sure you aren’t missing anything.  The simple problem may have aspects that you would have missed without fully using this technique.  In this example, maybe you find that this door is different than other doors, which have latches to hold them open, or handles to make it easier to open the door.  Imagine that this door might have been a bathroom stall door.  It would probably be set up differently than doors / hatches in other parts of the ship.

So, what are your Causal Factors?  First, I probably would not consider the sudden movement of the ship as a CF.  Remember, the definition of a CF states that it is a mistake or an error that directly leads to the incident. In this case, I think that it is expected that a ship will pitch or roll while underway; therefore, this would not be a CF. It is just a fact. This would be similar to the case where, in Alaska, someone slipped on a snow-covered sidewalk. I would not list that “it was snowing” as a CF.  This is an expected event in Alaska. It would not be under Natural Disaster / Sabotage, either, since snow is something I should be able to reasonably protect against by design.

In this case, I would consider the pitch / roll of the vessel as a normal occurrence.  There is really nothing wrong with the vessel rolling. The only time this would be a problem is if we made some mistake that caused an excessive roll of the vessel, causing the door to unexpectedly slam shut in spite of our normal precautions. If that were the case, I might consider the rolling of the ship to be a CF.  That isn’t the case in this example.

You would probably want to look at 2 other items that come to mind:

1.  Why did the door go shut, in spite of the vessel operating normally?
If we are on a vessel that is expected to move, our doors should probably not be allowed to swing open and shut on their own. There should be latches / shock absorbers / catches that hold the door in position when not being operated. Also, while the door is actually being operated, there should be a mechanism that does not depend on the operator to hold it steady while using the door. I remember on my Navy vessel all of our large hatches had catches and mechanisms that held the doors in place, EXCEPT FOR ONE HEAVY HATCH. We used to tell everyone to “be careful with that hatch, because it could crush you if we take a roll.” We had several injuries to people going through that hatch in rough seas. Looking back on that, telling people to “be careful” was probably not a very strong safeguard.

Depending on what you find here, the root causes for this could possibly be found under Human Engineering, maybe “arrangement/placement”, “tools/instruments NI”, “excessive lifting/force”, “controls NI”, etc.

2. Why did the employee have his hand in a place that could cause the door to catch his hand?
We should also take a look to understand why the employee had his hand on the door frame, allowing the door to catch his finger.  I am not advocating, “Tell the employee to be careful and do not put your hand in possible pinch points.” That will not work too well. However, you should take a look and see if we have sufficient ways of holding the door (does it have a conventional door knob? Is it like a conventional toilet stall, with no handle or method of holding the door, except on the edge?). We might also want to check to see if we had a slippery floor, causing the employee to hold on to the edge of the door / frame for support. Lots of possibilities here.

Another suggestion: Whenever I have what I consider a “simple” mistake that I just can’t seem to understand (“How did the worker just fall down the stairs!?”), I find that performing a Critical Human Action Profile (CHAP) can be helpful.  This tool helps me fully understand EXACTLY what was going on when the employee made a very simple yet significant mistake.

TapRooT® works really well when you are trying to understand “simple” mistakes.  It gets you beyond telling the employee to be more careful next time, and allows you to focus on more human performance-based root causes and corrective actions that are much more likely to prevent problems in the future.

How many investigations are enough?

Posted: April 16th, 2018 in Investigations, Root Causes, TapRooT

I’d like you to think about this scenario at work.  You’ve just sent your team to Defensive Driving School, and made sure they were trained and practiced on good driving skills.  They were trained on how to respond when the vehicle is sliding, safe following distances, how to respond to inclement weather conditions, etc.

Now that they’re back at work, how many managers would tell their recently-trained employees, “I’m glad we’ve provided you with additional skills to keep yourself safe on those dangerous roads.  Now, I only want you to apply that training when you’re in bad weather conditions.  On sunny days, please don’t worry about it.”  Would you expect them to ONLY use those skills when the roads are snow-covered?  Or ONLY at rush hour?  I think we would all agree that this would be a pretty odd thing to tell your team!

Yet, that’s what I often hear!

I teach TapRooT® courses all over the world. We normally start off the class by asking the students why they’re at the course and what they are expecting to get from the class. I often hear something that goes like this:

“I’m here to get a more structured and accurate root cause analysis process that is easy for my staff to use and gets repeatable results.  I don’t expect to use TapRooT® very often because we don’t have that many incidents,  but when we do, we want to be using a great process.”

Now, don’t get me wrong, I appreciate the sentiment (we don’t expect to have many serious incidents at our company), and we can definitely meet all of the other criteria.  However, it does get a little frustrating to hear that some companies are going to reserve this fantastic product for only a few incidents each year.  Doesn’t that seem to be a waste of terrific training?  Why would we only want our employees to use their training on the big stuff, but not worry about using that same great training on the smaller stuff?

There are a few reasons that I can think of for this misconception about when to use TapRooT®:

  • Some managers honestly believe that they don’t have many incidents.  Trust me, they are not looking very hard! Our people (including ourselves) are making mistakes every day.  Wouldn’t it be nice if we went out there, found those small mistakes, and applied TapRooT® to find solid root causes and corrective actions to fix those small issues before they became large incidents?
  • Some people think that it takes too long to do a good RCA.  Instead, they spend time using an inferior investigation technique on smaller problems that doesn’t fix anything anyway.  If you’re going to take time to perform some type of RCA, why waste any time at all on a system that gives you poor results?
  • Some people don’t realize that all training is perishable.  Remember those defensive driving skills?  If you never practice them, do you ever get good at them?

I recognize that you can’t do an RCA on every paper cut that occurs at your facility.  Nobody has the resources for that.  So there must be some level of “incident” at which it makes sense to perform a good analysis.  So, how do we figure out this trip point?

Here are some guidelines and tips you can follow to help you figure out what level of problem should be investigated using TapRooT®:

  • First of all, we highly recommend that your investigators perform one TapRooT® investigation at least every month.  Any longer than that, and your investigation skills start becoming dull.  Think about any skill you’ve learned.  “Use it, or lose it.”
    • Keep in mind that this guideline is for each investigator.  If you have 10 investigators, each one should be involved in using TapRooT® at least monthly.  This doesn’t have to be a full investigation, but they should use some of the tools or be involved in an investigation at least every month.
  • Once you figure out how many investigations you should perform each year to keep your team proficient, you can then figure out what level of problem requires a TapRooT® investigation.  Here is an example.
    • Let’s say you have 3 investigators at your company.  You would want them to perform at least one investigation each month.  That would be about 36 investigations each year.  If you have about 20 first aid cases each year, that sounds like a good level to initiate a TapRooT® investigation.  You would update your company policy to say that any first aid case (or more serious) would require a TapRooT® investigation.
    • You should also do the same with other issues at the company.  You might find that your trigger points would be:

      • Any first aid report or above
      • Any reportable environmental release
      • Any equipment damage worth more than $100,000
    • When you add them all up, they might be about 36 investigations each year.  You would adjust these levels to match your minimum number to maintain proficiency.
  • At the end of each year, you should do an evaluation of your investigations.  Did we meet our goals?  Did each investigator only do 4 investigations this year?  Then we wasted some opportunities.  Maybe we need to lower our trip points a bit.  Or maybe we need to do more audits and observations, with a quick root cause analysis of those audit results.  Remember, your goal is to have each investigator use TapRooT® in some capacity at least once each month.
  • Note that all of this should be specified in your company’s investigation policy.  Write it down so that it doesn’t get lost.
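The arithmetic behind that trip point can be sketched in a few lines of Python.  This is only an illustration, not a TapRooT® tool: the function names are made up for this example, and the yearly counts for environmental releases and equipment damage are hypothetical numbers chosen so the totals match the example above.  Only the 3 investigators, the once-a-month guideline, and the 20 first aid cases come from the guidelines themselves.

```python
def required_investigations(investigators, per_month=1):
    """Minimum investigations per year to keep every investigator proficient."""
    return investigators * per_month * 12


def pick_triggers(candidates, target):
    """Walk the candidate triggers in the order given (most serious first)
    and keep adding categories until the expected yearly count reaches
    the target. Returns the chosen names and the expected total."""
    chosen = []
    total = 0
    for name, events_per_year in candidates:
        if total >= target:
            break
        chosen.append(name)
        total += events_per_year
    return chosen, total


# 3 investigators, one investigation each per month -> 36 per year.
target = required_investigations(3)

# Expected events per year. The 20 first aid cases come from the example
# above; the other two counts are hypothetical placeholders.
candidates = [
    ("First aid report or above", 20),
    ("Reportable environmental release", 6),
    ("Equipment damage over $100,000", 10),
]

chosen, total = pick_triggers(candidates, target)
print(f"Target: {target} investigations per year")
print(f"Triggers: {chosen} (about {total} events per year)")
```

If the expected total comes in under the target, that is the signal to lower a trip point or add audits and observations, as described above.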

Performing TapRooT® investigations only on large problems will give you great results.  However, you are missing the opportunity to fix smaller problems early, before they become major issues.

TapRooT®: It’s not just for major issues anymore!

How Safe Must Autonomous Vehicles Be?

Posted: April 3rd, 2018 in Accidents, Human Performance, Investigations

Tesla is under fire for the recent crash of their Model X SUV, and the subsequent fatality of the driver. It’s been confirmed that the vehicle was in Autopilot mode when the accident occurred. Both Tesla and the NTSB are investigating the particulars of this crash.

Photo credit: KTVU FOX 2 / Reuters.

I’ve read many of the comments about this crash, in addition to previous crash reports. It’s amazing how much emotion is poured into these comments. I’ve been trying to understand the human performance issues related to these crashes, and I find I must take special note of the human emotions that are attached to these discussions.

As an example, let’s say that I develop a “Safety Widget™” that is attached to all of your power tools. This widget raises the cost of your power tools by 15%, and it can be shown that this option reduces tool-related accidents on construction sites by 40%.  That means, on your construction site, if you have 100 incidents each year, you would now only have 60 incidents if you purchase my Safety Widget™.  Would you consider this to be a successful purchase?  I think most people would be pretty happy to see their accident rates reduced by 40%!

Now, what happens when you have an incident while using the Safety Widget™? Would you stop using the Safety Widget™ the first time it did NOT stop an injury? I think we’d still be pretty happy that we would prevent 40 incidents at our site each year. Would you still be trying to reduce the other 60 incidents each year? Of course. However, I think we’d keep right on using the Safety Widget™, and continue looking for additional safeguards to put in place, while trying to improve the design of the original Safety Widget™.

This line of thinking does NOT seem to be true for autonomous vehicles. For some reason, many people seem to be expecting that these systems must be perfect before we are allowed to deploy them. Independent reviews (NOT by Tesla) have shown that, on a per driver-mile basis, Autopilot systems reduce accidents by 40% over normal driver accident rates. In the U.S., we experience about 30,000 fatalities each year due to driver error. Shouldn’t we be happy that, if everyone had an autonomous vehicle, we would be saving 12,000 lives every year? The answer to that, you would think, would be a resounding “YES!” But there seems to be much more emotional content in the answer than straight scientific data would suggest.
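For anyone who wants to check the numbers, here is the same back-of-the-envelope arithmetic as a small Python sketch.  The function name is made up for this example.  It assumes, as the paragraphs above do, that the 40% reduction applies evenly across all incidents and that everyone adopts the safeguard; real-world results would depend on both.

```python
def effect_of_safeguard(baseline_per_year, reduction_pct):
    """Return (prevented, remaining) incidents per year for a safeguard
    that cuts the baseline by reduction_pct percent."""
    if not 0 <= reduction_pct <= 100:
        raise ValueError("reduction_pct must be between 0 and 100")
    prevented = baseline_per_year * reduction_pct / 100
    remaining = baseline_per_year - prevented
    return prevented, remaining


# Safety Widget example: 100 incidents per year, 40% reduction.
widget_prevented, widget_remaining = effect_of_safeguard(100, 40)
print(f"Widget: {widget_prevented:.0f} prevented, {widget_remaining:.0f} remaining")

# Driver-error fatalities: about 30,000 per year, same assumed 40% reduction.
lives_saved, still_lost = effect_of_safeguard(30_000, 40)
print(f"Vehicles: {lives_saved:.0f} prevented, {still_lost:.0f} remaining")
```

Both halves of the result matter: the prevented incidents are the case for deploying the safeguard, and the remaining ones are the case for continuing to improve it.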

I think there may be several human factors in play as people respond to this question:

  1. Over- and under-trust in technology: I was talking to one of our human factors experts, and he mentioned this phenomenon. Some people under-trust technology in general and, therefore, will find reasons not to use it, even when proven to work. Others will over-trust the technology, as evidenced by the Tesla drivers who are watching movies, or not responding to system warnings to maintain manual control of the vehicle.
  2. “I’m better than other drivers. Everyone else is a bad driver; while they may need assistance, I drive better than any autonomous gadget.” I’ve heard this a lot. I’m a great driver; everyone else is terrible. It’s a proven fact that most people have an inflated opinion of their own capabilities compared to the “average” person. If you were to believe most people, each individual (when asked) is better than average. This would make it REALLY difficult to calculate an average, wouldn’t it?
  3. It’s difficult to calculate the unseen successes. How many incidents were avoided by the system? It’s hard to see the positives, but VERY easy to see the negatives.
  4. Money. Obviously, there will be some people put out of work as autonomous vehicles become more prevalent. Long-haul truckers will be replaced by autopilot systems. Cab drivers, delivery vehicle drivers, Uber drivers, and train engineers are all worried about their jobs, so they are more likely to latch onto any negative that would help them maintain their relevancy. Sometimes this is done subconsciously, and sometimes it is a conscious decision.

Of course, we DO have to monitor and control how these systems are rolled out. We can’t have companies roll out inferior systems that can cause harm due to negligence and improper testing. That is one of the main purposes of regulation and oversight.

However, how safe is “safe enough?” Can we use a system that isn’t perfect, but still better than the status quo? Seat belts don’t save everyone, and in some (rare) cases, they can make a crash worse (think of Dale Earnhardt, or a crash into a lake with a stuck seat belt). Yet, we still use seat belts. Numerous lives are saved every year by restraint systems, even though they aren’t perfect. How “safe” must an autonomous system be in order to be accepted as a viable safety device? Are we there yet? What do you think?

Construction’s Fatal Four – A Better Approach to Prevention

Posted: March 26th, 2018 in Accidents, Performance Improvement, Root Causes

In 2016, 21% of fatal injuries in the private sector were in the Construction industry as classified by the Department of Labor. That was 991 people killed in this industry (almost 3 people every day). Among these were the following types of fatality:

Falls – 384 (38.7%)
Struck by Object – 93 (9.4%)
Electrocutions – 82 (8.3%)
Caught-in/between – 72 (7.3%)

Imagine that. Eliminating just these 4 categories of fatalities would have saved over 630 workers in 2016.
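Here is a quick Python check of those figures, using only the counts listed above.  The variable names are mine, and the rounding to one decimal place is an assumption about how the percentages were produced.

```python
# 2016 construction fatality counts as listed above.
total_construction_fatalities = 991
fatal_four = {
    "Falls": 384,
    "Struck by Object": 93,
    "Electrocutions": 82,
    "Caught-in/between": 72,
}

# Share of all construction fatalities for each category.
for category, count in fatal_four.items():
    share = count / total_construction_fatalities * 100
    print(f"{category}: {count} ({share:.1f}%)")

# Combined effect of eliminating all four categories.
combined = sum(fatal_four.values())
combined_share = combined / total_construction_fatalities * 100
per_day = total_construction_fatalities / 365

print(f"Fatal Four combined: {combined} ({combined_share:.1f}%)")
print(f"Construction fatalities per day: {per_day:.1f}")
```

Taken together, the four categories account for well over half of the construction fatalities that year.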

Now, I’m not naive enough to think we can suddenly eliminate an entire category of injury or fatality in the U.S. However, I am ABSOLUTELY CERTAIN that, at each of our companies, we can take a close look at these types of issues and make a serious reduction in these rates. Simply telling our workers to “Be careful out there!” or “Follow the procedures and policies we give you” just won’t cut it.

NOTE: In the following discussion, when I’m talking about our workers and teammates, I am talking about ALL of us! We ALL violate policies and procedures every day. Don’t believe me? Take a look at the speedometer on your car on the way home from work tonight and honestly tell me you followed the speed limit all the way home.

As an example, take a look at your last few incident investigations. When there is an incident, one of the questions always asked is, “Did you know that you weren’t supposed to do that?” The answer is almost always, “Yes.” Yet, our teammates did it anyway.

Unfortunately, too many companies stop here. “Worker knew he should not have put his hand into a pinch point. Corrective action: Counseled the employee on the importance of following policy and remaining clear of pinch points.” What a completely useless corrective action! I’m pretty sure that the worker who just lost the end of his finger knows he should not have put his hand into that pinch point. Telling him to pay attention and be more careful next time will probably NOT be very effective.

If we really want to get a handle on these types of injuries, we must adopt a more structured, scientific strategy. I’d propose the following as a simple start:

1. Get out there and look! Almost every accident investigation finds that this has happened before, or that the workers often make this same mistake. If that is true, we should be getting out there and finding these daily mistakes.

2. To correct these mistakes, you must do a solid root cause analysis. Just yelling at our employees will probably not be effective. Remember, they are not bad people; they are just people. This is what people do. They try to do the best job they can, in the most efficient manner, and try to meet management’s expectations. We need to understand what, at the human performance level, allowed these great employees to do things wrong. THAT is what a good root cause analysis can do for you.

3. As in #2, when something bad DOES happen, you must do a solid RCA on those incidents, too. If your corrective actions are always:

  • Write a new policy or procedure
  • Discipline the employee
  • Conduct even MORE training

then your RCA methodology is not digging deep enough.

There is really no reason that we can’t get these types of injuries and fatalities under control. Start by doing a good root cause analysis to understand what really happened, and recognize and acknowledge why your team made mistakes. Only then can we apply effective corrective actions to eliminate those root causes. Let’s work together to keep our team safe.

Mechanical Seal Efficiency – Potential for huge savings

Posted: March 20th, 2018 in Equipment/Equifactor®

Here’s a great article by AESSEAL about potential energy savings by selecting the correct mechanical seal flush plan. It’s a pretty interesting read, and possibly an eye-opener for a company with many mechanical seals, especially in high-temperature processes.

 

Are you a Proficient TapRooT® Investigator?

Posted: March 19th, 2018 in Career Development, Investigations, Performance Improvement, TapRooT

I teach a lot of TapRooT® courses all over the world, to many different industries and departments.  I often get the same questions from students during these courses.  One of the common questions is, “How do I maintain my proficiency as a TapRooT® investigator?”

This is a terrific question, and one that you should think carefully about.  To get a good answer, let’s look at a different example:

Let’s say you’ve been tasked with putting together an Excel spreadsheet for your boss.  It doesn’t have to be anything too fancy, but she did ask that you include pivot tables in order to easily sort the data in multiple ways.  You decide to do a quick on-line course on Excel to brush up on the newest techniques, and you put together a great spreadsheet.

Now, if your boss asked you to produce another spreadsheet 8 months from now, what would happen?  You’d probably remember that you can use pivot tables, but you’ve probably forgotten exactly how it works.  You’ll most likely have to relearn the technique again, looking back over your last one, or maybe hitting YouTube as a refresher.  It would have been nice if you had worked on a few spreadsheets in the meantime to maintain the skills you learned from your first Excel course.  And what happens if Microsoft comes out with a new version of Excel?

Performing TapRooT® investigations is very similar.  The techniques are not difficult; they can be used by pretty much anyone, once they’ve been trained.  However, you have to practice these skills to get good at them and maintain your proficiency.  When you leave your TapRooT® course, you are ready to conduct your first investigation, and those techniques are still fresh.  If you wait 8 months before you actually use TapRooT®, you’ll probably need to refresh your skills.

In order to remain proficient, we recommend the following:

  • Obviously, you need to attend an initial TapRooT® training session.  We would not recommend trying to learn a technique by reading a book.  You need practice and guidance to properly use any advanced technique.
  • After your class, we recommend you IMMEDIATELY go perform an investigation, probably within the next 2 weeks or so.  You need to quickly use TapRooT® in your own work environment.  You need to practice it in your own conference room, know where your materials will be kept, know who you’re going to contact, etc.  Get the techniques ingrained into your normal office routine right away.
  • We then recommend that you use TapRooT® at least every month.  That doesn’t necessarily mean that you must perform a full incident investigation monthly, but maybe just use a few of the techniques.  For example, you could perform an audit and run those results through the Root Cause Tree®.  Anything to keep proficient using the techniques.
  • Refresher training is also a wonderful idea.  We would recommend attending a refresher course every 2 years to make sure you are up to speed on the latest software and techniques.  If you’ve attended a 2-Day TapRooT® course, maybe a 5-Day Advanced Team Leader Course would be a good choice.
  • Finally, attending the Annual Global TapRooT® Summit is a great way to keep up to speed on your TapRooT® techniques.  You can attend a specialized Pre-Summit course (Advanced Trending Techniques, or Equifactor® Equipment Troubleshooting, or maybe an Evidence Collection course), and then attend a Summit track of your choosing.

There is no magic here.  The saying, “Use it, or Lose it” definitely applies!

Miami Bridge Collapse – Is Blame Part of Your Investigation Policy?

Posted: March 16th, 2018 in Accidents, Investigations

I was listening to a news report on the radio this morning about the pedestrian bridge collapse in Miami. At one point, they were interviewing Florida Governor Rick Scott.  Here is what he said:

“There will clearly be an investigation to find out exactly what happened and why this happened…”

My ears perked up, and I thought, “That sounds like a good start to a root cause investigation!”

And then he continued:

“… and we will hold anybody accountable if anybody has done anything wrong,”

Bummer.  His statement had started out so well, and then went directly to blame in the same breath.  He had just arrived on the scene.  Before we have a good feel for what the actual circumstances were, we are already assuming our corrective actions are going to pinpoint blame and dish out the required discipline.

This is pretty standard for government and public figures, so I wasn’t too surprised.  However, it got me thinking about our own investigations at our companies.  Do we start out our investigations with the same expectations?  Do we begin with the good intentions of understanding what happened and finding true root causes, but then have this expectation that we need to find someone to blame?

We as companies owe it to ourselves and our employees to do solid, unbiased incident investigations.  Once we get to reliable root causes, our next step should be to put fixes in place that answer the question, “How do we prevent these root causes from occurring in the future?  Will these corrective actions be effective in preventing the mistakes from happening again?”  In my experience, firing the employee / supervisor / official in charge rarely leads to changes that will prevent the tragedy from happening again.

 

Hire a Professional

Posted: March 12th, 2018 in Accidents, Career Development Tips, Performance Improvement, Training

I know every company is trying to do the best they can with the resources that are available. We ask a lot of our employees and managers, trying to be as efficient as we can.

However, sometimes we need to recognize when we need additional expertise to solve a particular problem. Or, alternatively, we need to ensure that our people have the tools they need to properly perform their job functions.  Companies do this for many job descriptions:

  • Oil analyst
  • Design engineer
  • Nurse
  • Aircraft Mechanic

I don’t think we would ask our Safety Manager to repair a jet engine.  THAT would be silly!

However, for some reason, many companies think that it is OK to ask their aircraft mechanics to perform a root cause analysis without giving them any additional training.  “Looks like we had a problem with that 737 yesterday.  Joe, go investigate that and let me know what you find.”  Why would we expect Joe, who is an excellent mechanic, to be able to perform a professional root cause analysis without being properly trained?  Would we send our Safety Manager out to repair a jet engine?

It might be tempting to assume that performing an RCA is “easy,” and therefore does not require professional training.  This is somewhat true.  It is easy to perform bad RCAs without professional training.  While performing effective investigations does not require years of training, there is a certain minimum competency you should expect from your team, and it is not fair to them to throw them into a situation that they are not trained to handle.

Ensure you are giving your team the support they need by giving them the training required to perform excellent investigations.  A 2-Day TapRooT® Essential Techniques Course is probably all your people will need to perform investigations with terrific results.

 

Equifactor and FMEA

Posted: March 12th, 2018 in Equipment/Equifactor®

For those of you that have met me, you know that I am a huge fan of proactive improvement processes. Why wait until something bad happens to fix your issues? Wouldn’t it be nice if we could fix problems before we have an incident that actually hurts someone, or damages our equipment?  I’ve spoken numerous times about using TapRooT® proactively for HSEQ problems, but I wanted to give you a tool to help you with your proactive equipment troubleshooting.

Design and process engineers are usually familiar with Failure Modes and Effects Analysis, or FMEA.  This is a generic tool that can be used to look at a piece of equipment or a process, identify what can go wrong, and determine whether more stringent controls should be put in place to prevent that failure.  There are actually quite a few ways to do this, but most FMEAs are based on a fairly standard format.  For this discussion, I’m going to focus on equipment failures.  Generally, the system walks you through several distinct steps:

  1. Identify the piece of equipment you wish to analyze.
  2. Look at all realistic potential failure modes that can occur with that equipment.
  3. Assign a Severity, Occurrence, and Detectability score to each failure.
  4. Multiply these scores together to calculate a Risk Priority Number (RPN).
  5. Determine the controls that are currently in place to prevent this issue.
  6. Decide if additional controls are required, based on the RPN.
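Steps 3 through 6 are simple arithmetic, so here is a minimal sketch of them in Python. Everything specific in it is an assumption for illustration: the `FailureMode` class, the `rank_by_rpn` helper, the three example failure modes and their scores, and the action threshold of 100 are all hypothetical. It also assumes the common convention that a higher Detectability score means the failure is harder to detect. Your own company's scoring matrix should drive the real numbers.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One row of an FMEA worksheet (hypothetical structure)."""
    description: str
    severity: int       # 1 (negligible) to 10 (catastrophic)
    occurrence: int     # 1 (rare) to 10 (frequent)
    detectability: int  # assumed: 1 (easily detected) to 10 (very hard to detect)

    def rpn(self) -> int:
        """Risk Priority Number: Severity x Occurrence x Detectability."""
        for score in (self.severity, self.occurrence, self.detectability):
            if not 1 <= score <= 10:
                raise ValueError(f"score {score} is outside the 1-10 scale")
        return self.severity * self.occurrence * self.detectability


def rank_by_rpn(modes, threshold):
    """Return (mode, rpn, needs_more_controls) tuples, highest RPN first."""
    ranked = sorted(modes, key=lambda m: m.rpn(), reverse=True)
    return [(m, m.rpn(), m.rpn() >= threshold) for m in ranked]


# Hypothetical failure modes and scores, for illustration only.
modes = [
    FailureMode("Bearing overheats", severity=7, occurrence=4, detectability=3),
    FailureMode("Seal leaks", severity=5, occurrence=6, detectability=2),
    FailureMode("Surge during startup", severity=8, occurrence=3, detectability=6),
]

for mode, rpn, needs_more_controls in rank_by_rpn(modes, threshold=100):
    action = "review controls" if needs_more_controls else "current controls OK"
    print(f"{mode.description}: RPN {rpn} ({action})")
```

Sorting by RPN puts the failure modes that most need attention at the top of the list, and the threshold simply flags which ones should get a closer look at their existing controls.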

Now, looking at these steps, it occurs to me that many of them are somewhat subjective.  For example, on a scale of 1-10, what is the Severity of the failure?  Most companies have put a matrix in place to help quantify these numbers and make it easier to come up with consistent results.  This guidance is really important if you want to have any kind of meaningful, systematic way of determining that RPN.  While not perfect, these matrices do a pretty good job of keeping everyone focused and getting consistent answers.

However, the one step that is still VERY subjective is step #2.  Somehow, you need to come up with a list of all the potential failure modes that your piece of equipment can experience.  This is the very basis of the entire analysis, and it is probably the most difficult.  Imagine telling your maintenance manager or design engineer, “Tell me all the ways this compressor can fail.”  While I’m sure your team is pretty sharp, this is a daunting task.  Ideally, they will need to list every possible failure mode to ensure we don’t miss anything.  Imagine how many “unknown unknowns” are floating around in our FMEAs!

Wouldn’t it be nice if there were some compendium of possible failures that we could use to initially populate our FMEA list?  This is where I would recommend pulling up your Equifactor® tables.  Take a look at (for example) the Centrifugal Compressor troubleshooting tables.  Just in this category alone, we have nearly 50 possible failure modes, spread across 7 symptoms.  Imagine if you could start your FMEA with all of these items.  You’d be well on your way to conducting a detailed FMEA on your centrifugal compressors, with the ability to add a few more failure modes that may be unique to your situation.

We normally think about Equifactor® as a reactive troubleshooting tool.  While it excels in that mode, try using the Equifactor® tables more proactively.  Use those tables as the baseline for your FMEA, and limit the number of unknown issues that may be lurking in your equipment.

 

Keeping Your Equipment Reliability Team Sharp

Posted: March 7th, 2018 in Career Development, Equipment/Equifactor®, Summit


We have just completed our annual Global TapRooT® Summit, and we all walked away with some terrific ideas to bring back to our companies. Many people think the Summit is only for our customers to improve their processes, but I ALWAYS come away with new ideas for myself.

Heinz Bloch was one of our speakers this year.  He had 2 excellent sessions on how equipment reliability is tied directly to your company’s bottom line.  As always, he had some great insights into how a company can integrate reliability techniques into their business model for real, measurable savings.

One of his observations is that, as technology progresses, it is imperative that your reliability and maintenance team  keep up-to-date on the current best practices and technologies.  It is too easy to assume your excellent reliability and maintenance engineers will just magically remain top-notch.  His suggestion (almost a demand!) was to ensure we give our team the time and motivation to actually READ about their craft.  Your team should be allocating some amount of time EVERY DAY to reading professional journals and articles to see what is happening outside their own company boundaries.

  • Are you using the very best lubricant?
  • What new bearing materials are available for your applications?
  • How much can we save by investing in slightly more expensive, but much more efficient technology?
  • What are our competitors using for condition-based maintenance?

As managers, we should be giving our team both the time and the incentive to read these journals and articles.  Trust me, your competition is doing this; don’t be left behind!

‘Equipment Failure’ is the cause?

Posted: February 22nd, 2018 in Accidents, Equipment/Equifactor®, Investigations

Drone view of tank farm fire. Photo: West Fargo Fire Department

 

 

 

 

 

 

 

On Sunday, there was a diesel fuel oil fire at a tank farm in West Fargo, ND. About 1200 barrels of diesel leaked from the tank.  The fire appears to have burned for about 9 hours.  They had help from fire departments from the local airport and local railway company, and drone support from the National Guard.  There were evacuations of nearby residents.  Soil remediation is in progress, and operations at the facility have resumed.  Read more about the story here.

The fire chief said it looks like there was a failure of the piping and pumping system for the tank. He said that the owners of the tank are investigating. However, one item caught my attention. He said, “In the world of petroleum fires, it wasn’t very big at all. It might not get a full investigation.”

This is a troublesome statement.  The implication is that, since it wasn’t a big, major fire, and no one was seriously hurt, it doesn’t warrant an investigation.  However, just think of all the terrific lessons that could be discovered and learned from.  How major a fire must it be in order to get a “full investigation?”

I often see people minimize issues that were just “equipment failures.”  There isn’t anyone to blame, no bad people to fire, it was just bad equipment.  We’ll just chalk this one up to “equipment failure” and move on.  In this case, that mindset can cause people to ignore the entire accident and to conclude that determining it was equipment failure is as deep as we need to go.

Don’t get caught in this trap.  While I’m sure the tank owner is going to go deeper, I encourage the response teams to do their own root cause analyses to determine if their response was adequate, if notifications were correct, if they had reliable lines of communication with external agencies, etc.  It’s a great opportunity to improve, even if it was only “equipment failure,” and even if you are “only” the response team.

Tips on Preventing Vibration in Rotating Equipment

Posted: February 13th, 2018 in Equipment/Equifactor®

Improper installation of rotating equipment, either for the first time or after maintenance, can quickly lead to excessive vibration and premature failure. Here are some tips on the proper use of shims and alignment equipment.

Equipment Troubleshooting in the Future

Posted: January 5th, 2018 in Equipment/Equifactor®, Investigations

By Natalie Tabler and Ken Reed

If you haven’t read the article by Udo Gollub on the Fourth Industrial Revolution, take some time to open this link. This article can actually be found at many links on the internet, so attribution is not 100% certain, but Mr. Gollub appears to be the probable author.

The article is interesting. It discusses a viewpoint that, in the current stage of our technological development, disruptive technologies are able to very quickly change our everyday technological expectations into “yesterday’s news.” What we consider normal today can be quickly overtaken and supplanted by new technology and paradigms. While this is an interesting viewpoint, one of the things I don’t see discussed is one of the most common problems with automating our society: equipment failure. If our world will largely depend on software controlling machinery, then we need to take a long hard look at avoiding failure not only in the manufacturing process, but also in the software development process.

The industrial revolution that brought us from an agricultural society to an industrial one also brought numerous problems along with the benefits. Changing how the work is done (computerization vs. manual labor) does not change human nature. The rush to be first to come out with a product (whether it be new software or a physical product) will remain inherent in the business equation, and with it the danger of not adequately testing, or overly optimistic expectations of benefit and refusal to admit weaknesses.

If we are talking about gaming software – no big deal. So, getting to the next level of The Legend of Zelda – Breath of the Wild had some glitches; that can be changed with the next update. But what if we are talking about self-driving cars or medical diagnostic equipment? With no human interaction with the machine (or software running it) the results could be catastrophic. And what about companies tempted to cut some corners in order to bolster profits (remember the Ford Pinto, Takata airbags, and the thousands of other recalls that cost lives)? Even ethical companies can produce defective products because of lack of knowledge or foresight. Imagine if there were few or no controls in production or end use.

Additionally, as the systems get more complex, the probability of unexpected or unrecognized error modes will also increase at a rapid rate. The Air France Flight 447 crash is a great example of this.

So what can be done to minimize these errors that will undoubtedly occur? There are really 2 options:

1. The first option is preventative, proactive analysis. Safety and equipment failure prevention training will be essential as these new technologies evolve. This must also be extended to software development, since it will be the driving force in producing new technologies. If you wonder how much failure prevention training is being used in this industry, just count the number of updates your computer and phone software sends out each year. And yes, failure prevention should include vigilance on security breaches. A firm understanding of human error, especially in the software and equipment design phase, is essential to understanding why an error might be introduced, and what systems we will need in place to catch or mitigate the consequences of these errors.  This obviously requires effective root cause analysis early in the process.

2. The second option is to fully analyze the results of any errors after they crop up. Since failures are harder to detect as stated in #1, it becomes even more critical that, when an error does cause a problem, we dig deep enough to fix the root cause of the failure. It will not be enough to say, “Yes, that line of code caused this issue. Corrective action: Update the line of code.” We must look more deeply into how we allowed the errant line of code to exist, and then do a rigorous generic cause analysis to see if we have this same issue elsewhere in our system.

With the potential for rapidly-evolving hardware and software systems causing errors, it will be incumbent on companies to have rigorous, effective failure analysis to prevent or minimize the effects of these errors.

Want to learn more about equipment troubleshooting? Attend our Special 2-Day Equifactor® Equipment Troubleshooting and Root Cause Analysis training February 26 and 27, 2018 in Knoxville, Tennessee and plan to stay for the 2018 Global TapRooT® Summit, February 28 to March 2, 2018.

Root Cause Tip: Causal Factor Development

Posted: January 4th, 2018 in Root Cause Analysis Tips, Summit

Human Error?

 

Hi, everyone.

I thought I’d do a quick discussion on some ideas to help you when developing Causal Factors on your SnapCharT®.

Let me start out by stressing the importance of using the definition of a Causal Factor (CF) when you are looking at your SnapCharT®. Remember, a Causal Factor is a mistake, error, or failure that, if corrected, would have prevented the incident, or mitigated its consequences.  The most important part of the definition is the first few words:  mistake, error, or (equipment) failure.  As you are looking for CFs, you should be looking for human error or mistakes that led directly to the incident.  Remember, we aren’t blaming anyone.  However, it is important to realize that almost all incidents are “caused” by someone not doing what they were supposed to do, or doing something they shouldn’t.  This isn’t blame; this is just a recognition that humans make mistakes, and our root cause analysis must identify these mistakes in order to find the root causes of those mistakes.

With this definition in mind, let’s talk about what is NOT a CF.  Here are some examples:

  • “The operator did not follow the procedure.”  While this may seem like a CF, this did not lead directly to the incident.  We should ask ourselves, “What mistake was made because someone did not follow the procedure?”  Maybe, the operator did not open the correct valve.  Ah, that sounds like a mistake that, if it had not occurred, I probably would not have had the incident.  Therefore, “Operator did not open valve VO-1” is probably the CF.  Not following the procedure is just a problem that will go under this CF and describe the actual error.
  • “Pre-job brief did not cover pinch points.”  Again, we should ask ourselves, “What mistake was made because we did not cover pinch points in our pre-job brief?”  Maybe the answer is, “The iron worker put his hand on the end of the moving I-beam.”  Again, this is the mistake that led directly to the incident.  The pre-job brief will be a piece of information that describes why the iron worker put his hand in the pinch point.
  • “It was snowing outside.”  I see this type of problem mis-identified as a CF quite often.  Remember, a CF is a mistake, error, or equipment failure.  “Snowing” is not a mistake; it is just a fact.  The mistake that was made because it was snowing (“The employee slipped on the sidewalk”) might be the CF in this case, again with the snowy conditions listed under that CF as a relevant piece of data.

Hopefully, this makes it a little easier to identify what is and is not a CF.  Ask yourself, “Is my Causal Factor a mistake, and did that mistake lead directly to the incident?”  If not, you can then identify what actually led to the incident.  This is your CF.

Want to learn more? Attend our 2-day Advanced Causal Factor Development course February 26 and 27, 2018 in Knoxville, Tennessee and plan to stay for the 2018 Global TapRooT® Summit, February 28 to March 2, 2018.

Asset Optimization Track at the Summit – Are you in?

Posted: January 2nd, 2018 in Equipment/Equifactor®, Summit

If you are a typical procrastinator, you probably haven’t signed up for the Global TapRooT® Summit yet. Well, it’s now 2018, and procrastination is over! Don’t miss the chance to get in on one of the best opportunities to improve your company.

I wanted to give you a little more information about one of the most exciting tracks at the Summit: the Asset Optimization Track (doesn’t that sound so much better than “Equipment Fixing Track”?) We’ll be providing some great sessions to get your reliability and maintenance processes moving in the right direction in 2018. Besides the Summit Keynote addresses common to all tracks, here are a few of the sessions that make the Asset Optimization track right for you:

  • Multiple Failures Without Learning:  Tired of repeat failures?  Chris Vallee will give you some examples of why the “break it – Fix it” mentality is no longer the way to go.
  • Improved Maintenance Troubleshooting using Equifactor®:  I’ll be leading a session on how the Equifactor® Equipment Troubleshooting module of the TapRooT® VI software service works, including some of the new features available in the software.  For some reason, I think this is the best session of all!
  • The Business End of Equipment Reliability:  OK, maybe my session is NOT the best!  Heinz Bloch will be with us again this year, giving us his unique insight into the business case behind reliably operating your equipment.
  • The Psychology of Failing Fixes:  Kevin McManus will be sharing his unique view on how root cause analysis is the key to a truly reliable operation.  Don’t miss the opportunity to soak up some of Kevin’s enthusiasm!

We’re going to have a great Summit this year, in our hometown for the first time ever.  Please plan on signing up for the Summit.  Don’t tell yourself, “Oh, yeah, I need to do that sometime.”  The time is now; the Summit is next month! REGISTER NOW for the Asset Optimization Track!

Can Regulators Use TapRooT® Investigation Tools?

Posted: August 15th, 2017 in Investigations, Performance Improvement

I had a question recently from one of our friends who works as a regulator in his country. He was wondering about the advantages of using TapRooT® as a regulator as opposed to an industry user. I think this is a great question.  We often think about doing incident investigations for ourselves, but how do you help those you oversee as a regulating body?

As a government agency, you have great potential to affect the safety and health of both your employees and those you oversee.

  • Just attending the TapRooT® training will give your staff the basic understanding of true, human-performance based root causes.  It gives your team a new perspective on why people make poor decisions, and just as importantly, why people make good decisions.  This understanding will guide your thinking as to why problems occur.   Once this perspective is clear, your team will no longer be tempted to just blame the individual for problems.  They will think more deeply about the organizational issues that are causing people to make bad decisions.
  • The training will give you the tools to perform accurate, consistent investigations.  You can have confidence in knowing that your team has discovered not just one or two issues, but all the problems that led to an incident.
  • Your investigations and investigation report reviews using TapRooT® will be based on human performance expertise, helping to eliminate your team’s biases.  EVERYONE has biases, and using TapRooT® helps keep you focused on the true reasons people make mistakes.
  • You will also have the tools to be able to more accurately assess the adequacy of the investigations and corrective actions that are submitted to you by those you oversee.  You can see where they are doing good investigations, and where they probably need to improve.  The corrective actions that are suggested by those you oversee are often poorly written and do not address the real reasons for the incident.  The TapRooT® training will ensure you are seeing effective corrective actions.
  • If your agency conducts trending of its results, you’ll be able to produce consistent, trendable data from your investigations.  If you ensure your industry constituents are also using TapRooT®, the data you receive from them will also allow for more accurate trending results.
  • Finally, you can use the TapRooT® tools learned during the course to perform proactive audits of your industry partners.  When you perform onsite inspections, you can ensure you are looking for the right problems, and assigning effective corrective actions for the problems encountered.  Instead of just looking for the same problems, the tools allow you to look deeper at the processes you are inspecting to find and correct potential issues before they become incidents.

TapRooT® gives you confidence that your investigations, and the investigations of those you oversee, result in fixable root causes and effective corrective actions.

Root Cause Analysis Tip: 3 Tips for Drawing a Better SnapCharT®

Posted: March 15th, 2017 in Root Cause Analysis Tips

 

Visualize each step of an incident with a SnapCharT®.

It’s nearly impossible to conduct a useful root cause analysis unless you actually have some data to analyze. Many systems seem to think that you can dive right into an analysis before you have a full understanding of what actually happened. During the development of the TapRooT® System, one of the first items of business was to develop an easy way to visualize the problem and document the gathered facts. Thus, SnapCharT® was born.

SnapCharT®s are pretty easy to build. With just three shapes to worry about, and a few simple rules, the SnapCharT® gets you moving in the right direction right from the get-go.

Here are a few tips to help make the SnapCharT® even easier and more useful.

1. Avoid the word “and” in your Events. Each Event is meant to show a single action that occurred in the course of the incident. Some people have an aversion to having a bunch of Events, and therefore put several actions in each one.  For example, if I wanted to document that the driver stopped at the stop sign, looked both ways, and then pulled out into the intersection, I would not want to write this as a single Event.  This should be 3 separate (short) Events, one after the other.

The reason this is important is because we want to see if any mistakes are made during each step in the sequence of events.  If we put several actions into a single Event, we find it is easy to miss one of these mistakes.  On the other hand, with 3 separate Events, I can ask, “Did the driver make a mistake while stopping?  Did she make a mistake while looking both ways?  Did she make a mistake by pulling forward?”  Having separate Events makes it much easier to catch individual problems.

Keep in mind that, later in the investigation, you may find that there were no mistakes made in any of these Events.  When you complete your SnapCharT®, it might then make sense to combine some Events to make the final SnapCharT® easier to read.  It is OK to combine Events later on; just leave them separate during your initial data-gathering phase.

2. Leave lots of space.  Many people tend to cram all their Events close together, I suppose to conserve real estate.  Don’t worry about it; leave lots of room between your individual Events.  Spread everything out.  You’ll be adding Conditions underneath each of these Events, and you’ll almost certainly end up moving everything to make room for these Conditions anyway.  Give yourself plenty of room to work at the beginning.  If using the software, I usually only put 2 or 3 Events on each page to start out.  Later on, once you have all of your Conditions documented and grouped, you can compress everything down a bit and get rid of extra spaces.  But even then, don’t try to squeeze everything tightly together.  It can make it hard to read, even after everything is set.  And you might also find new Conditions that need to be added once you start the root cause analysis.

3. Draw your lines at the very end.  It is tempting to start drawing lines early in the process.  You want to see those arrows showing your progression from one Event to the next.  And you want to arrange your Conditions into neat groups right from the start.  Unfortunately, this can cause problems later on.  There is a good chance you’ll be adding new Events, changing the order of the Events you have, or regrouping your Conditions into Causal Factor groups.  If you have already drawn your lines, you’ll just have to delete them, make your changes, and then draw them back in.  And then probably do it again later on.

I normally don’t draw any lines between Events or Conditions until after I’ve identified my Causal Factor groups.  My SnapCharT® is probably pretty close to being complete by that point, so I’m reasonably confident that I won’t be making a lot of changes.  This can be a tough lesson for those that are REALLY detail oriented (you know who you are!), and just have to have those lines drawn in early in the process.  Resist the temptation; it’ll save you some time (and frustration!) later on.

Let me know what you think about these tips.  If you have other tips that you’ve found that make it easier and quicker to produce your SnapCharT®s, share the best practices you’ve learned in the comments below.

We hope that you will also consider coming to the 2016 Global TapRooT® Summit, San Antonio, Texas, August 1-5 to share best practices.  Click here to learn more about the Summit.

 

Carnival Pride NTSB Allision Report – Causal Factor Challenge

Posted: March 7th, 2017 in Accidents, Investigations

The NTSB released their report on the allision of the Carnival Pride cruise ship with the pier in Baltimore last May. It caused over $2 million in damages to the pier and the ship, and crushed several vehicles when the passenger access gangway collapsed onto them. Luckily, no one was under or on the walkway when it fell.  You can read the report here.

The report found that the second in command was conning the ship at the time.  He had too much speed and was at the wrong angle when he was approaching the pier.  The report states that the accident occurred because the captain misjudged the power available when shifting to an alternate method of control to stop the ship.  It states there may have been a problem with the controls, or maybe just human error.  It also concluded that the passenger gangway was extended into the path of the ship, and that it did not have to be extended until ready for passengers to debark.

Gangway collapse after allision

While I’m sure these findings are true, I wonder what the actual root causes would be?  If the findings are read as written, we are really only looking at Causal Factors, and only a few of those to boot.  Based on only this information, I’m not sure what corrective actions could be implemented that would really prevent this in the future.  As I’m reading through the report, I actually see quite a few additional potential Causal Factors that would need to be researched and analyzed in order to find real root causes.

YOUR CHALLENGES:

  1. Identify the Causal Factors you see in this report.  I know you only have this limited information, but try to find the mistakes, errors, or equipment failures that led directly to this incident (assuming no other information is available).
  2. What additional information would you need to find root causes for the Causal Factors you have identified?
  3. What additional information would you like in order to identify additional Causal Factors?

Reading through this incident, it is apparent to me that there is a lot of missing information.  The problems identified are not related to human performance-based root causes; there are only a few Causal Factors identified.  Unfortunately, I’m also pretty sure that the corrective actions will probably be pretty basic (Train the officer, update procedure, etc.).

BONUS QUESTION:

For those that think I spelled “collision” wrong, what is the meaning of the word “allision”?  How many knew that without using Google?

Avoid the Danger of New Hires

Posted: March 1st, 2017 in Accidents, Current Events, Performance Improvement

 

Is your safety program ready?

There is a feeling of cautious optimism in the oil sector, as the price of oil seems to have stabilized above $50/barrel. Rig count in the Permian has more than doubled since last spring. US EIA and JPMorgan are forecasting US production at near record levels of over 9.5 million barrels per day by the end of next year. US exports are up, with China ramping up oil purchases from the US, while OPEC production cuts are holding.

This all sounds good for the US oil sector. It is expected that hiring will start picking up, and in fact Jeff Bush, president of oil and gas recruiting firm CSI Recruiting, has said, “When things come back online, there’s going to be an enormous talent shortage of epic proportions.”

So, once you start hiring, who will you hire? Unfortunately, many of the 170,000 oil workers laid off over the past couple of years are no longer available. That experience gap is going to be keenly felt as you try to bring on new people. In fact, you’re probably going to be hiring many people with little to no experience in safe operation of your systems.

Are you prepared for this? How will you ensure your HSE, Quality, and Equipment Reliability programs are set up to handle this young, eager, inexperienced workforce? What you certainly do NOT want to see are your new hires getting hurt, breaking equipment, or causing environmental releases. Here are some things you should think about:

– Review old incidents and look for recurring mistakes (Causal Factors). Analyze for generic root causes. Conduct a TapRooT® analysis of any recurring issues to help eliminate those root causes.
– Update on-boarding processes to ensure your new hires are receiving the proper training.
– Ensure your HSE staff are prepared to perform more frequent audits and subsequent root cause analysis.
– Ensure your HSE staff are fully trained to investigate problems as they arise.
– Train your supervisors to conduct audits and detailed RCA.
– Conduct human factors audits of your processes. You can use the TapRooT® Root Cause Tree® to help you look for potential issues.
– Take a look at your corrective action program. Are you closing out actions? Are you satisfied with the types of actions that are in there?
– Your HSE team may also be new. Make sure they’ve attended a recent TapRooT® course to make sure they are proficient in using TapRooT®.

Don’t wait until you have these new hires on board before you start thinking about these items. Your team is going to be excited and enthusiastic, trying to do their best to meet your goals. You need to be ready to give them the support and tools they need to be successful for themselves and for your company.

TapRooT® training may be part of your preparation.  You can see a list of upcoming courses HERE.

Simple Root Cause Analysis (Don’t Settle!)

Posted: February 23rd, 2017 in Root Cause Analysis Tips, TapRooT, Training, Uncategorized

 

OK, show of hands:

How many companies are using TapRooT® for their “hard,” “high-risk” incident analyses and using something like 5-Whys for the “simple” stuff?  Yep, I thought so.  A lot of companies are doing this for various reasons. I’ll get into that more in a minute.

Now, another poll:

How many of you are performing effective root cause analyses on your “important,” “high-consequence” investigations, and performing nearly useless analyses on the “easy” stuff?  Of course, you know this is really exactly the same question, but you’re not as comfortable raising your hand the second time, are you?

Those of you that follow this blog have already read why inferior RCA methods don’t work well, but let me recap.  I’m going to talk about 5-Whys specifically, but you can probably insert any of your other, less-robust analysis techniques here:

5-Whys

  • It does not use an expert system.  It relies on the investigator to know what questions to ask.
  • Because of this, it allows for investigator bias.  If you are a training person, you will (amazingly enough) end up with “training” root causes.
  • The process does not rely on human performance expertise.  Again, it relies on the skill of the investigator.  Yes, I know, we’re all EXCELLENT investigators!
  • It does not produce consistent results.  If I give the same investigation to 3 different teams, I always get 3 different sets of answers.
  • There is no assistance in developing effective corrective actions.  When 80% of your corrective actions fall into the “Training,” “Procedures,” and “Discipline” categories, you are not really expecting any new results, are you?

So, knowing this to be true, why are we doing this?  Why are we allowing ourselves to knowingly get poor results?

  • These are low-risk problems, anyway.  It doesn’t matter if we get good answers.  (Why bother, then?)
  • It’s quick.  (Of course, quickly getting poor results just doesn’t seem to be an effective use of your time.)
  • It’s easy (to get poor results).
  • TapRooT® takes too long.  Finally, an answer that, while not true, at least makes sense.

So what you’re really telling me is that if TapRooT® were just easier to use, you would be able to ditch those other less robust methods, and use TapRooT® for the “easy” stuff, too.

Guess what?  We’ve now made TapRooT® even easier to use!  The 7-step TapRooT® process can now be shortened for those “easy” investigations, and still get the excellent results you’re used to getting.

We now teach the normal 7-Step method for major incidents, where you need the optional data-collection tools.  We are also showing you how to use TapRooT® in low- to medium-risk investigations.  You are still using the tools that make TapRooT® a great root cause analysis tool; we simply show you how to shorten the time it takes to perform these less-complex analyses.

The 2-Day TapRooT® Incident Investigation Course concentrates on these low- to medium-risk investigations.  The 5-Day TapRooT® Advanced Team Leader Course teaches both the simple method and the full suite of TapRooT® tools.

Don’t settle for poor investigations, knowing the results are not what you need.  Take a look at the new TapRooT® courses and see how to use the system for all of your investigations.  You can register for one of these courses here.

Starting Your Investigations: The Power of the SnapCharT®

Posted: November 7th, 2016 in Investigations, Root Cause Analysis Tips

Beginning your investigation can sometimes be quite a challenge. Deciding on who to talk to, what documents you need, what questions you need to ask, etc. can lead to feeling slightly overwhelmed. As General Creighton Abrams said,

When eating an elephant, take one bite at a time.

In other words, you just need to get started with the first step, and then methodically work your way through to the end.

In TapRooT®, that first bite is the SnapCharT®. The rest of your investigation is going to depend on the data you gather in that SnapCharT®, so it is critical that you begin in a simple, methodical manner.

Let’s say you get that initial notification phone call (usually at 3:00 am). You don’t get much information. Maybe all you know is, “Ken, we had a pipe rupture this morning during a hydrostatic test. Looks like the mechanics didn’t know what they were doing.  They had hooked up a test pump to the piping, started the pump, and almost immediately ruptured the piping.  We’ve cleaned up the water, and no one was hurt.  We need you to investigate this.”  This is a pretty common initial report.  Not a lot of data, some opinions thrown in, and a request for answers.  Without a structured process, most investigations would now start off with some interviews, asking pretty generic questions.  It would be really nice if we could start off with some detailed, intelligent questions.

This is where the SnapCharT® comes in.  Once you receive that initial phone call, just build your SnapCharT® with the information you have.  It honestly won’t have much data, but that’s OK; it’s only your starting point:

Initial SnapCharT®


However, with this initial SnapCharT®, it is now easier to visualize what you already know, and what you still need to know.  For example, I’d have a lot of questions about the pump, the mechanics themselves, recovery actions, etc.  I’d use the Root Cause Tree® to help me figure out what questions to ask.  I’d take each Event and ask, “What do I already know about this Event, and what questions do I have about it?”  These would all be added to the SnapCharT®.  It might look more like this:

Questions to ask


Keep in mind that these questions were developed before I even went to the scene or questioned anybody about the facts.  I still need to interview people, but I now have a much better set of questions to begin my investigations.  Many more questions will arise as I ask this initial set of questions, but I’ll feel much better prepared to start talking to people about the issue.
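For readers who think in data structures, here is a minimal Python sketch of the idea above: lay out the Events from the initial phone call in order, attach what you already know, and attach the questions you still need answered. This is only an illustration, not the TapRooT® software or the actual SnapCharT® format. The `Event` class, the `interview_plan` helper, and the specific questions are all hypothetical, made up here to show the shape of the exercise.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """One step in the incident timeline (hypothetical structure)."""
    description: str
    known_facts: list = field(default_factory=list)
    open_questions: list = field(default_factory=list)


def build_initial_chart():
    """Events from the initial phone call, in the order they were reported.

    The questions are illustrative examples only.
    """
    return [
        Event("Mechanics hooked up test pump to piping",
              open_questions=[
                  "Which pump was used, and what is its rated pressure?",
                  "What training had the mechanics had on hydrostatic tests?",
              ]),
        Event("Mechanics started the pump",
              open_questions=["What test pressure did the procedure call for?"]),
        Event("Piping ruptured",
              known_facts=["Rupture occurred almost immediately",
                           "No one was hurt"],
              open_questions=["Where did the piping fail?"]),
        Event("Water cleaned up",
              open_questions=["Was any evidence moved or lost during cleanup?"]),
    ]


def interview_plan(chart):
    """Flatten the chart into (event, question) pairs, in timeline order."""
    return [(event.description, question)
            for event in chart
            for question in event.open_questions]


chart = build_initial_chart()
plan = interview_plan(chart)
for event_description, question in plan:
    print(f"{event_description}: {question}")
```

Notice that the caller's opinion ("the mechanics didn't know what they were doing") is not stored as a fact. In this sketch it becomes a question about training, which is something you can actually verify in an interview.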

The SnapCharT® is a simple yet effective tool to help the investigator get started with the investigation.  It may seem like an inconsequential step, easy to dismiss.  However, using the SnapCharT® as your very first tool, before you start gathering data, can greatly speed up the investigation.  It allows you to start on the right path, with a set of intelligent questions to ask.  Once you have this moving, you’ll find the rest of the investigation falls into place in a logical, easy-to-follow format.  ALWAYS START WITH A SNAPCHART®!

LEARN MORE about TapRooT® essentials in our 2-day course (View schedule and register!)

 

Equipment Reliability: What Happens as Pumps Wear Out?

Posted: October 11th, 2016 in Equipment/Equifactor®


When we are faced with the prospect of installing a new pump, we have to take a look at several factors to decide what the best course of action will be. For example, we have to look at:
– Fit for purpose
– Initial cost
– Life-cycle maintenance costs
– Electrical efficiency
– Ease of maintenance
– etc.

An additional consideration is how the characteristics of the pump vary over time.  It is fairly straightforward to calculate flow rates and pressures using the specs of a new pump.  However, how do these specs vary over time?  As the pump wears, how will the characteristics of the pump change, and how will this affect the overall fitness of the pump for the service environment?

Here is a nice article that describes how pump nameplate characteristics will change as the pump wears, and what to expect as the components wear.

Equipment Failure: Mechanical Seal Basics

Posted: October 3rd, 2016 in Equipment/Equifactor®


 

Modern pump systems are moving away from traditional pump packing and toward mechanical seals.  There are many advantages to using a mechanical seal instead of pump packing.  However, using these seals brings along some additional potential problems.

Before we can look at these additional issues, we first need to make sure we understand exactly what we mean by a “mechanical seal.”  Here is a quick refresher on how these seals work.  Next week, we’ll look a little more deeply into the advantages and disadvantages of these systems.

Infection Control: Corrective Actions Much More Expensive than Proactive Improvement

Posted: October 3rd, 2016 in Medical/Healthcare


Here’s a story about a healthcare facility that has agreed to hire an infection control consultant as part of an agreement to fix problems found by regulators.

What I found interesting is that the original inspection found “11 years of misconduct that led to the contamination of surgical instruments, among other issues.” What this really tells me is that no one was looking at normal day-to-day practices at the center. If there had been a robust audit and observation program, they probably would have been able to do their own internal improvements at much lower cost and without the attendant loss of confidence in their facility.

Learn about using TapRooT® proactively in our 5-Day TapRooT® Advanced Root Cause Analysis Team Leader Training.
