Category: Equipment/Equifactor®

Root Cause Analysis Tip: Best Practice Sharing #2 – TapRooT® Summit

April 25th, 2012 by

In today’s Root Cause Analysis Tip, Phil Goodman shares his TapRooT® best practice at our 2012 Global TapRooT® Summit.

Today is Part 2 of 12. Click here for Part 1.

Next week, hear Jeff Cooper of Boart Longyear share his TapRooT® best practice.

Time for Equifactor®? Maybe Past Time!

January 25th, 2012 by

Image001-4

Here is the text that came with the picture … don’t know if it is true …

Here are some photos of what happens when bearings overheat
in the transmissions of these monster windmills.

To date no gear oil has been invented to withstand the pressures produced within these transmissions.

Most recently, the government gave Dow-Corning a big grant to work on it.

Previously, many others had tried and failed.

As they age there will be many more bearing failures.

2-1

Image002-2

Hard to believe that every wind turbine will fail due to inadequate gear lubrication.

I had heard that many wind turbines are not getting proper maintenance.

Wonder what Equifactor® has to say about this?

Investigation of Fatal Elevator Accident in New York Continues – Maintenance Work May Be the "Cause"

January 24th, 2012 by

The New York Times reported that Robert LiMandri, the Commissioner of the Buildings Department in New York City, said:

We know that there was work being done right before the unfortunate event, and we do believe that is a contributing cause, or the cause.

He also said:

We know for sure that those events directly before this unfortunate accident clearly are part of our investigation.

Suzanne Hart was killed when the elevator suddenly shot upwards as she boarded.

The story also says that the roughly 60,000 elevators in New York produced 53 accidents in the previous year.
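To put those two numbers in perspective, here is a quick back-of-the-envelope sketch. It uses only the figures quoted in the story; "about 60,000" is approximate, and the variable names are just for illustration.

```python
# Figures quoted in the story above; the elevator count is approximate.
elevators = 60_000
accidents_last_year = 53

# Accidents per 1,000 elevators per year.
rate_per_1000 = accidents_last_year / elevators * 1000

# Equivalent "one accident per N elevators" figure.
one_in = elevators / accidents_last_year

print(f"{rate_per_1000:.2f} accidents per 1,000 elevators per year")
print(f"roughly 1 accident per {one_in:.0f} elevators per year")
```

That works out to a bit under one accident per 1,000 elevators per year. It says nothing about severity or cause, which is where the investigation comes in.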

Great Human Factors: Wrong Tools, Bad Access by Design, Per “Ingenuity” or All of the Above?

January 19th, 2012 by

As an ex-aircraft mechanic and a “sometimes gotta work on my own car” mechanic, I have in the past borrowed or made some of the tools pictured below. The questions remain:

Wrong Tool?

Bad Access by Design?

Mechanic’s Ingenuity?

Or a little bit of them all?

Finally, ever have one of your modified tools bite you back?  Share your stories in the comment section.

cone-wrench-mod

DSC08955

Oil Cooler Line Wrench #2 009 (Medium)

Drinking Water Emergency at Port Hope Caused by Pump Impeller Problems

December 27th, 2011 by

How can bad equipment reliability cause a crisis? Imagine losing the water supply at your house or business for an extended period.

It seems that the impellers on all five pumps at the Port Hope, Ontario, water plant failed due to corrosion.

The previous impellers lasted 67 years without failure, but the new pumps at a new plant commissioned in 2005 only made it until 2011. The first impeller inspection wasn’t even scheduled until 2012.

For complete details, see these stories:

http://www.northumberlandtoday.com/ArticleDisplay.aspx?e=3414004

http://www.northumberlandnews.com/news/article/1269311–upkeep-issues-ruled-out-as-port-hope-water-emergency-cause

And if you want to learn more about troubleshooting pump problems, attend the TapRooT®/Equifactor® Equipment Troubleshooting and Root Cause Analysis Course. CLICK HERE to see the public course schedule for 2012.

Monday Accident & Lessons Learned: Make Sure You Remove the Grounding Strap Before You Energize the Switchgear!

November 21st, 2011 by

Pictures sent to me by a TapRooT® User of an unfortunate accident …

Screen Shot 2011-11-04 At 6.24.47 Pm

Screen Shot 2011-11-04 At 6.25.46 Pm

Screen Shot 2011-11-04 At 6.26.19 Pm

Screen Shot 2011-11-04 At 6.27.00 Pm

Screen Shot 2011-11-04 At 6.27.35 Pm

Screen Shot 2011-11-04 At 6.28.14 Pm

Screen Shot 2011-11-04 At 6.28.58 Pm

Monday Accident & Lessons Learned: Bad Maintenance Practices Lead to Failed Train Wheel Set and Derailment

October 31st, 2011 by

Do your maintenance folks “make it work”?

Screen Shot 2011-10-06 At 3.20.41 Pm

Looks like “just make it work” was a cause of this accident.

See the accident report from the UK Rail Accident Investigation Branch:

http://www.raib.gov.uk/cms_resources.cfm?file=/Bulletin%20(Bure%20Valley%20Railway)%2004-2011.pdf

Blackberry Outage – Is a Three Day Outage on a High Reliability Business Application OK?

October 13th, 2011 by

Many people count on their Blackberries to run their business. They get concerned about even a one-hour outage. But the most recent outage has been going on for three days.

Here’s a quote from a recent Forbes story about the unexpected outage:

In a Wednesday afternoon conference call for reporters, RIM’s Chief Technology Officer for software, David Yach, said the company is working “around the clock” to fix the service issues. Though RIM says it is still investigating the root cause of the problem, Yach expressed certainty that the global outage stemmed from the failure of a single “core switch” in Europe and was not the result of a network breach or hack. Since RIM provides back-end service support for all BlackBerrys, the company operates multiple nodes and switches around the world for routing data.

This failure caused a backlog that overwhelmed the system.

Does this sound like reliability issues you face?

Could they have avoided this issue with some proactive application of root cause analysis?

We’ll watch what comes out in future press reports.

Lightning NOT the Root Cause of Amazon Data Center Outage

August 17th, 2011 by

The Inquirer published this article:

Lightning did not cause Amazon datacentre outage

Interesting to see the root cause analysis of a computer reliability problem being discussed.

First, we could argue about whether “lightning” could be a root cause. But let’s save that argument for some other time.

But what I found interesting in this article was that they were eliminating a potential cause and then going on to look further.

Looks like it is a power supply reliability root cause analysis. The first step in this process is evidence collection and troubleshooting of the “cause” of the failure.

Since they don’t know the reason that the transformer exploded, finding a root cause is going to be difficult.

It would be interesting to see the process used in this engineering analysis, which sits at the start of the evidence collection and evaluation that feeds the root cause analysis.

Next, the article goes on to discuss problems with the load transferring to backup diesel generators. This would be a second causal factor that needs to be analyzed (troubleshooting and root cause analysis).

The approach for corrective action was mentioned in the article:

– more redundancy and more isolation to its PLCs, in order to prevent failures from spreading,
– a new “environmentally friendly” backup PLC
– improved load balancing
– drastically shorter recovery times

All this will be accomplished “… as soon as possible.”

Of course these corrective actions aren’t very specific (they would not meet the SMARTER criteria in TapRooT®) but they are just a list out of an article. Perhaps the company’s corrective actions are more detailed.

Also, it is interesting to see additional safeguards being suggested before the failures of the current safeguards are understood.

For cloud computer users, let’s hope a successful root cause analysis with effective corrective action is completed so that future outages can be minimized.

37 Bodies Recovered from Two Mine Accidents in the Ukraine

August 1st, 2011 by

The Associated Press reported that the bodies of all the miners killed in two mine accidents in the Ukraine had been recovered.

One accident was related to an explosion of methane gas (26 killed) and the other was related to the failure of an elevator (11 killed).

Press Release from the UK Rail Accident Investigation Branch: Investigation into the derailment of a Bure Valley Railway passenger train near Brampton, Aylsham, Norfolk, 30 May 2011

June 16th, 2011 by

 Cms Resources Bure-Valley
Image of incursion of the derailed bogie into the passenger compartment
(by courtesy of the Bure Valley Railway)

The RAIB is investigating an accident that occurred when the 14:40 hrs passenger train from Wroxham to Aylsham derailed close to the village of Brampton, near Aylsham, in Norfolk.  The train was running on the Bure Valley Railway, a tourist railway, with a track gauge of 381 mm (15 inches).  The train consisted of seven coaches and a brake van and was hauled by a steam locomotive.  It was staffed by two crew members and was carrying 61 passengers.

The train is believed to have been travelling at about 16 mph (26 km/h) when the end of an axle under the second coach fractured, derailing its leading bogie.  During the derailment the other, undamaged, wheelset, fitted to the derailed bogie, forced its way through the wooden floor of the coach into a passenger compartment.

There were no reported injuries as a result of the accident and most of the passengers walked through to Aylsham to complete their journeys.  The remainder were transported by road.

The RAIB’s preliminary examination of the site confirmed that axle failure was the cause of the derailment.  There was no evidence that the maintenance of the track, or the operation of the train were factors contributing to the accident.

The RAIB’s further investigation activities will focus on the failure of the axle and will be independent of any investigation by the safety authority (the Office of Rail Regulation).

Unexpected Power Failure Costs RackSpace $3.5 Million in Refunds

June 15th, 2011 by

When the reliability of your “cloud” depends on a server farm’s power, a power outage can be a major incident.

If you are a web hosting service, unreliability can cost you customers. To try to keep your customers, you give refunds when a service outage happens. RackSpace announced in an SEC filing that it will pay $3.5 million in refunds (service credits) due to a recent loss of service after a power problem and failure of back-up power.

So even in the “cloud”, equipment and power reliability are important.

What can we learn? That root cause analysis is important in all sorts of industries. Repeat problems (this isn’t the first power reliability issue) cause unhappy customers. Better to solve reliability problems the right way by addressing their root causes.

How Much Does an Accident Cost? The Fine Was £150,000 …

May 24th, 2011 by

The UK Health & Safety Executive posted a press release about a chemical plant in Rye, East Sussex, UK, that was fined £150,000 after a spill of waste solvents.

The initial tank failure that started the release was caused by internal corrosion of the tank.

For more information, see:

http://www.hse.gov.uk/press/2011/coi-se-2005.htm

Wind Turbine Accident: Installation? Maintenance? One of a Kind Accident?

April 12th, 2011 by

Saw an interesting AP article about a wind turbine accident in North Dakota. The rotor and blades of a wind turbine had crashed to the ground.

Scott Winneguth, Director of Engineering for Iberdrola Renewables, said the accident was “very out of the ordinary … a singular event.”

He also said:

I can assure you, for the near term, that we will check for bolt integrity and misalignment on a much more frequent basis than our normal activity would entail.

The first statement and the second statement make no sense when taken together. If this was a one-off event, why change their standard practices? Also, why change the standard practices for just a short period of time?

The article also said:

Winneguth said the 70 turbines in the Rugby project were subsequently inspected and each of their 3,360 bolts checked. Seven bolts on four of the turbines were replaced as a precaution.

Seven more bolts replaced???
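For a sense of scale, here is a rough sketch of the arithmetic behind those figures. It assumes the 3,360 bolts quoted in the article is the total across all 70 turbines (the wording is ambiguous), and the variable names are only illustrative.

```python
# Figures quoted in the article; assumes 3,360 is the total bolt count
# across all 70 turbines, which the article's wording does not make certain.
turbines = 70
bolts_checked = 3_360
bolts_replaced = 7
turbines_with_replacements = 4

bolts_per_turbine = bolts_checked / turbines
share_of_bolts = bolts_replaced / bolts_checked
share_of_turbines = turbines_with_replacements / turbines

print(f"{bolts_per_turbine:.0f} bolts per turbine")
print(f"{share_of_bolts:.2%} of bolts replaced")
print(f"{share_of_turbines:.1%} of turbines had at least one bolt replaced")
```

Under that assumption, only a small fraction of the bolts were replaced, but they were spread over four other turbines, which is hard to square with "a singular event."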

Duncan Koerbel, an executive for the turbine’s manufacturer, Suzlon Wind Energy Corp, said that the cause of the misalignment was “not known” and he “was not sure” how long the problem took to develop.

Does this sound like they need better troubleshooting and root cause analysis? It does to me!

If you need better equipment troubleshooting and root cause analysis, consider attending the 3-Day TapRooT®/Equifactor® Equipment Troubleshooting and Root Cause Analysis Course. We have courses coming up in:

Doha, Qatar
New Orleans
Edmonton, AB
Birmingham, UK
Knoxville, TN
Brisbane, Australia
Midland, TX

For more information and course dates, see:

http://www.taproot.com/courses.php?d=3

Animal Causes Equipment Failure

February 9th, 2011 by

I ran across this today, concerning a power outage:  Link

One line in particular caught my eye:
“An animal tampered with a switch which caused an equipment failure and power outage in the City Center area of Newport News.”

Not a lot of info, but it shows how easy it is to lay blame on anything that can’t answer back.  I doubt an animal actually “tampered” (implies intent) with anything.  And after we blame the animal, we shift to blaming the equipment.

Once you start learning to ask the right questions, you start seeing that there are a lot of other directions this investigation could go:
– Accessibility of wildlife to critical equipment
– Fault tolerance of critical systems
– Recovery efforts

CSB Releases Video On the Dangers of Purging with Flammable Gasses

February 4th, 2011 by

The CSB has put together a great video documenting the dangers of venting flammable gasses into enclosed spaces.  This practice is widespread throughout many industries for cleaning natural gas piping systems or purging them of air.  Take a look at it here.

As you watch the video, you may think (like I did), “I’m sure that this only happens in very specific industries.  I’m sure my guys don’t do that.”  And yet, think about how your company does handle these types of circumstances:
– When you install a new gas water heater, how do you expect your workers are venting the system? 
– When a new gas stove is installed in your cafeteria, what provisions have been made to purge the piping?
– When you change over from an oil-fired boiler to a gas-fired boiler, what special tools and procedures do you have in place to restore piping cleanliness?

Although the recommendations of the CSB seem common-sense, it becomes a little more involved when you actually have to implement these recommendations.  Make sure you have made the proper preparations for any maintenance when flammable gasses are involved.  “We’ve always done it this way” is not necessarily the right answer!

Mechanical Failure as the Root Cause of a Helicopter Crash?

February 2nd, 2011 by

I saw this entry in a law office blog Link.

All kinds of assumptions and corrective actions already being bandied about.  Since the pilot had many years of experience, they are going to focus on mechanical failure.

“At this time, the circumstances of the crash remain mysterious as investigators analyze the crash scene and other information. Because the pilot was so experienced, the investigation may focus not on pilot error but rather for evidence of a mechanical failure. An equipment problem is often the culprit in Nevada airplane crashes and helicopter accidents, especially when the pilot has vast flying experience. Failure to continue regular maintenance is a frequent cause of a crash for low-budget skydiving or aerial tour companies.”

Already, this article starts looking at who to blame, who is at fault, etc.  No facts have been established, beyond the fact that an experienced pilot crashed.  It is always important to look at the purpose of an article when you start reading it.  This blog article is on a personal injury law firm’s website, so the purpose of the article is pretty clear.  Some website articles may force you to dig a little to discover why the article was written.  Take the time to put the story into the context of the writer to ensure you are getting accurate, unbiased information.

How much should you spend on equipment reliability?

January 17th, 2011 by


Equipment reliability has several facets that need to be blended to make the business case for your required level of equipment reliability.  One of those items is the environment under which the equipment will be operated.  Another is the accessibility of that equipment.  Here’s an example of determining the correct amount of monitoring in order to ensure the minimum required reliability:
Link

Trans-Alaska Oil Pipeline Shut Down Due To Oil Leak

January 17th, 2011 by


The Trans-Alaska Pipeline was shut down to almost a trickle on January 8th due to a leak at Pumping Station #1 on the North Slope. 
Read more here
Initial estimates indicate that approximately 10 barrels of oil leaked into the basement of the pumping station, all of which has been cleaned up.  They hope to resume pumping shortly.  No reports yet as to root causes of the failure.

This is news for several reasons:

– Shut down of a major oil supply source to the U.S.
– Possibility of environmental problems
– Possible rise in oil prices

And yet, most of the articles concentrate on the fact that this is a BP pipeline, even though the pipeline is jointly owned by BP, ConocoPhillips, and ExxonMobil.  Can your company survive this kind of publicity?  BP survives because it is so HUGE, and their product is a necessity.  Equipment failures that lead to major public issues can become a permanent black mark against your company name.  I don’t think most companies would be able to weather these types of continuous problems.

 

Low equipment reliability plagues the V-22 Osprey

January 14th, 2011 by

In a budget-conscious environment, you don’t want poor reliability to be the hallmark of your program.  Link


