Equipment Troubleshooting in the Future
By Natalie Tabler and Ken Reed

If you haven’t read the article by Udo Gollub on the Fourth Industrial Revolution, take some time to open this link. This article can actually be found at many links on the internet, so attribution is not 100% certain, but Mr. Gollub appears to be the probable author.

The article is interesting. It discusses a viewpoint that, in the current stage of our technological development, disruptive technologies are able to very quickly change our everyday technological expectations into “yesterday’s news.” What we consider normal today can be quickly overtaken and supplanted by new technology and paradigms. While this is an interesting viewpoint, one of the things I don’t see discussed is one of the most common problems with automating our society: equipment failure. If our world will largely depend on software controlling machinery, then we need to take a long hard look at avoiding failure not only in the manufacturing process, but also in the software development process.

The industrial revolution that brought us from an agricultural society to an industrial one also brought numerous problems along with the benefits. Changing how the work is done (computerization vs. manual labor) does not change human nature. The rush to be first to come out with a product (whether it be new software or a physical product) will remain inherent in the business equation, and with it the danger of inadequate testing, overly optimistic expectations of benefit, and refusal to admit weaknesses.

If we are talking about gaming software – no big deal. So, getting to the next level of The Legend of Zelda: Breath of the Wild had some glitches; that can be changed with the next update. But what if we are talking about self-driving cars or medical diagnostic equipment? With no human interaction with the machine (or the software running it), the results could be catastrophic. And what about companies tempted to cut some corners in order to bolster profits (remember the Ford Pinto, Takata airbags, and the thousands of other recalls that cost lives)? Even ethical companies can produce defective products because of lack of knowledge or foresight. Imagine if there were few or no controls in production or end use.

Additionally, as the systems get more complex, the probability of unexpected or unrecognized error modes will also increase at a rapid rate. The Air France Flight 447 crash is a great example of this.

So what can be done to minimize these errors that will undoubtedly occur? There are really two options:

1. The first option is preventive, proactive analysis. Training in safety and equipment failure prevention will be essential as these new technologies evolve. This must also be extended to software development, since software will be the driving force in producing new technologies. If you wonder how much failure prevention training is being used in this industry, just count the number of updates your computer and phone software sends out each year. And yes, failure prevention should include vigilance against security breaches. A firm understanding of human error, especially in the software and equipment design phase, is essential to understanding why an error might be introduced, and what systems we will need in place to catch or mitigate the consequences of these errors. This obviously requires effective root cause analysis early in the process.

2. The second option is to fully analyze the results of any errors after they crop up. Since failures in complex systems are harder to detect, as noted above, it becomes even more critical that, when an error does cause a problem, we dig deep enough to fix the root cause of the failure. It will not be enough to say, “Yes, that line of code caused this issue. Corrective action: Update the line of code.” We must look more deeply into how we allowed the errant line of code to exist, and then do a rigorous generic cause analysis to see if we have this same issue elsewhere in our system.
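As a rough illustration of the "same issue elsewhere" check in option 2, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the file names, the sample source text, the helper name find_similar_defects, and the choice of defect (an error handler that silently swallows every failure). It covers only the mechanical part of a generic cause analysis, looking for the same pattern in other places. It says nothing about why the pattern was allowed in, which is the harder question.

```python
import re
from typing import Dict, List, Tuple


def find_similar_defects(
    sources: Dict[str, str], pattern: str
) -> List[Tuple[str, int, str]]:
    """Return (file name, line number, line text) for each line matching the pattern."""
    regex = re.compile(pattern)
    hits = []
    for name, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((name, lineno, line.strip()))
    return hits


# Hypothetical code base: assume the investigation traced a failure to a bare
# "except:" in pump_control.py that hid a sensor fault.
SOURCES = {
    "pump_control.py": "try:\n    read_sensor()\nexcept:\n    pass\n",
    "valve_control.py": "try:\n    open_valve()\nexcept ValueError:\n    log()\n",
    "alarm.py": "try:\n    notify()\nexcept:\n    pass\n",
}

# Look for the same pattern everywhere, not only in the file that failed.
hits = find_similar_defects(SOURCES, r"^\s*except\s*:")
for name, lineno, line in hits:
    print(f"{name}:{lineno}: {line}")
```

In this made-up case the search flags alarm.py as well as the file that failed, which is the point: a corrective action that only updates the one errant line would leave the same weakness in place elsewhere.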

With the potential for rapidly evolving hardware and software systems to cause errors, it will be incumbent on companies to have rigorous, effective failure analysis to prevent or minimize the effects of these errors.

Want to learn more about equipment troubleshooting? Attend our Special 2-Day Equifactor® Equipment Troubleshooting and Root Cause Analysis training February 26 and 27, 2018 in Knoxville, Tennessee and plan to stay for the 2018 Global TapRooT® Summit, February 28 to March 2, 2018.