With the Road to Reliability™ Framework, you can reduce your downtime by 90%. And although you need all 4 Essential Elements to succeed, the majority of your downtime reduction will come through defect elimination. In this article, I will explain in detail why you need to apply root cause analysis and establish a defect elimination culture to succeed in your journey to a reliable plant.
Research done in the 1990s across a large group of manufacturing sites around the world found that reactive plants typically achieved uptimes of around 83.5%.
That same research found that the best-performing manufacturing sites achieved uptimes in excess of 98% by focusing on planning and scheduling, preventive and predictive maintenance, and defect elimination (DE).
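To see what those two uptime figures mean in terms of downtime, here is a minimal sketch in Python. The function name is my own and purely illustrative; the only inputs are the 83.5% and 98% uptime figures quoted above.

```python
def downtime_reduction(uptime_before: float, uptime_after: float) -> float:
    """Fractional reduction in downtime when uptime (in percent) rises from before to after."""
    downtime_before = 100.0 - uptime_before
    downtime_after = 100.0 - uptime_after
    return (downtime_before - downtime_after) / downtime_before


# Reactive plants (83.5% uptime) versus best performers (98% uptime)
reduction = downtime_reduction(83.5, 98.0)
print(f"Downtime falls from 16.5% to 2.0%, a reduction of {reduction:.1%}")
```

Going from 83.5% to 98% uptime cuts downtime from 16.5% to 2%, which is a reduction of roughly 88%. Because the best performers achieved uptimes in excess of 98%, the reduction approaches 90%.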
This research, spearheaded by Winston Ledet, led to the development of the Manufacturing Game® and multiple publications. The table below summarises the findings of this research:
Somehow, these results are not widely known in the industry—to our detriment—but worse, the table above is often misunderstood and therefore dismissed as incorrect.
The biggest finding from the research was the impact of defect elimination, which in essence means making sure you ‘fix forever’ rather than ‘forever fixing’.
So when something fails, you make sure it does not reoccur, and over time you reduce the number of failures and increase your uptime.
What may be surprising is the big impact defect elimination has on overall uptime. This tells us that our plants are full of (hidden) defects that result in failures.
You see, we introduce defects at every stage of a plant’s life cycle: during the design, construction, and commissioning of our plants, but also during the operation and maintenance phases. If you don’t tackle these defects, they eventually lead to failures.
Maintenance doesn’t address defects, and good maintenance can only help you to achieve your plant’s inherent reliability. So, you need a defect elimination program to remove defects to achieve high reliability.
To fix forever rather than forever fix.
What are defects and where do they come from?
Winston Ledet refers to defects as “Anything that erodes value, reduces production, compromises HSE (health, safety, and environment), or creates waste.”
Defects aren’t just physical problems or equipment failures.
Defects come from equipment design issues, installation problems, workmanship issues, and even human error.
It’s also important to remember that no amount of maintenance can ever improve the inherent reliability of a piece of equipment. Those changes have to come from design modifications.
Let’s look more closely at the three main sources of defects.
You may recognise the title of this slide, “The ABC’s of Failure”, from Winston’s work in other sources online. Or you may have come across this concept in his book Don’t Just Fix It, Improve It, which, by the way, is a great book. It’s an easy-to-read fictionalised account of a plant manager trying to turn around performance. If you don’t have a copy yet, I would strongly recommend it, and at just under USD10 for an eBook edition, it is a steal. Worth every cent.
Back to “The ABC’s of Failure”.
What Winston Ledet tells us, based on his extensive research following his time at DuPont, and later at the Manufacturing Game®, is that all failures of equipment and processes can be traced back to defects.
According to Winston Ledet, defects are the basic cause of all of our failures, and Ledet classified defects into three sources: A, B, and C.
“A” stands for Aging, which typically generates about 4% of the defects that eventually become failures over long periods of time (25–50 years).
Aging is easiest to see in the major structures and infrastructure in your plant (i.e., things like steel and concrete).
These aging defects occur even if the equipment is not operated at all.
“B” stands for Basic Wear and Tear of the equipment when we operate it.
Basic wear and tear typically generates about 12% of the defects.
These defects become failures over shorter periods of time, say from one month to seven years, depending on the quality and suitability of the design.
According to Ledet, these kinds of defects are usually the easiest to observe in the critical equipment in a plant.
This equipment usually has a long mean time between repairs and is typically well-monitored and looked after.
The last category of defects is “C” which stands for Careless Work Habits.
Careless work habits contribute to the remaining 84% of the defects, and these defects become failures over random periods of time.
Careless is not the same as irresponsible.
While irresponsible habits would be included in the careless category, they are only a very small part of the total.
By careless Ledet means “not providing the care” that the equipment needs to run perfectly.
These careless habits include all our work, not just the physical operator or maintenance tasks in the field.
Careless work habits can come from the design engineer who specifies the wrong material, or the maintenance engineer who omits a critical failure mode when developing the preventive maintenance (PM) program for a plant, or the manager who does not act on deteriorating performance trends.
When you consider that these are all careless work habits, you can see why 84% of all defects come from them.
Now, the problem is that most people don’t realise that the majority of defects are created by these careless work habits, and they underestimate the significant impact that defects can have.
And worse, people refuse to accept that they have careless work habits.
I think Ledet’s choice of words is a bit unfortunate in that respect.
People are quickly offended by the word careless because they equate carelessness with irresponsibility.
Instead of focusing on improving work habits and removing defects, most organisations waste much of their time and energy on trying to prioritise a long list of failures that require repair.
Failures that should never have happened in the first place.
And too often we simply fix the issue and move on, rather than fixing it and making sure it never happens again.
And so we end up forever fixing, rather than fixing forever.
Now, you may or may not believe the results that came from Winston Ledet’s research,
but other industry sources have come to similar conclusions about the importance of quality in maintenance.
For example, studies of the US nuclear power industry and the Japanese power industry concluded that over 50% of performance problems were associated with maintenance [Reason, 1997].
Similarly, an independent Boeing study concluded that maintenance activities were a contributing factor in 80% of inflight engine shutdowns [Boeing, 1994].
Why a structured approach?
So, we have shown you that you need to eliminate defects, but why bother with a formalised, structured approach?
Aren’t most problems pretty straightforward and easily resolved?
Didn’t you say you wanted to keep things simple? Why make this complicated?
Well, simply put, we don’t do it well enough without a structured process.
It’s that simple.
Let’s look at some of the common things that go wrong.
Problems are often poorly understood
It’s easy to look at a problem in terms of its symptoms. Something sounds bad. That bearing is hot. The compressor is vibrating.
And while all of these might be true, we have to keep in mind that the way we frame a problem influences the solution we end up finding.
Is the problem that…
…the compressor is vibrating?
…the compressor is vibrating above its high-level alarm?
…the compressor is vibrating above its high-high trip point, resulting in a machine shutdown and plant upset?
…the compressor has experienced three unplanned downtime events over the last two months, resulting in a production loss of 5%?
When a problem arises, our natural tendency is to jump to a solution.
But it’s important to spend some time really understanding the problem first.
We tend to concentrate on technical causes
The maintenance and reliability field is—for obvious reasons—filled with technical people.
And technical people like technical solutions.
Plenty of evidence shows that at least 50% of failures have human-related causes, often because people are doing what they think is correct (training or operating philosophy),
or because they are following instructions that are wrong,
or because the instructions are inaccurate or not clear,
or because the system they are working in is flawed,
or because they are under pressure and take a shortcut.
It’s important that we consider the human elements that could contribute to a failure.
Going far enough (or too far) is often a problem
Once we’ve properly framed our problem, we then want to find the ‘fixed forever’ solution.
And if we truly want to eliminate a defect, we usually have to get down to a systemic level.
So, what does that mean?
It means we need to understand if there’s an underlying system or process in place that keeps causing an issue (or if one is missing).
We don’t need to go so far that we’re dealing with things beyond our control. We need to stay within the variables we can influence.
How deep you go is going to vary from issue to issue.
But in each case, the goal is the same: we want to make sure a particular defect won’t happen again (i.e., it’s truly eliminated).
We can’t just address symptoms and expect our problems to go away.
People tend to jump to conclusions
Without a structured approach to defect elimination, people tend to skip ahead and jump to conclusions.
This can cause a few issues.
- If we fixate on what we think the solution is, we’ll have a hard time staying open-minded to what the root cause really could be. This can get in the way of problem-solving and brainstorming efforts.
- Our own internal biases can lead us to ‘pet’ solutions, which can make the process personal and cause people to feel defensive. Keep it objective and stick to the facts.
- We gloss over or ignore valuable information. If we bypass the analysis altogether, there’s a good chance we won’t find that forever fix.
- People will be focused on different stages of the analysis. This will only lead to frustration and ineffective problem-solving.
Often not enough structure
Simple problems can often be solved with an unstructured approach (symptom = cause).
But more complex problems require a more structured approach. (What is the problem? Sometimes we don’t know!)
A structured approach gives us the time and space to really analyse what we’ve observed. It allows us to clearly define the problem.
And this is important because it’s often difficult to differentiate between cause and effect.
Don’t eliminate the symptoms…
…go after the defect that produces the symptoms!
Is defect elimination the same as root cause analysis?
We often hear people say, “But we already do root cause analysis (RCA), why do we need defect elimination? Isn’t that just more of the same?”
Root cause analysis is all about preventing problems from reoccurring. In that respect, it is the same as defect elimination.
But in practice, RCA is about getting rid of the 20% of issues that cause 80% of your breakdowns, downtime, corrective maintenance, and repair costs.
When you encounter a significant breakdown, you analyse the failure, determine the root cause and resolve it. You fix it once and for all. Rinse and repeat.
So in practice root cause analysis is all about removing the big-ticket items.
It is all about dealing with your Bad Actors.
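As a rough illustration of how you might pick out those Bad Actors, here is a minimal Pareto sketch in Python. The asset names, the downtime hours, and the `bad_actors` function are all hypothetical and only there to show the idea; in practice the numbers would come from your own downtime records.

```python
def bad_actors(downtime_hours: dict[str, float], share: float = 0.8) -> list[str]:
    """Return the fewest assets, largest first, that together cover at least `share` of total downtime."""
    total = sum(downtime_hours.values())
    ranked = sorted(downtime_hours.items(), key=lambda item: item[1], reverse=True)
    selected: list[str] = []
    cumulative = 0.0
    for asset, hours in ranked:
        if cumulative >= share * total:
            break  # the assets selected so far already cover the target share
        selected.append(asset)
        cumulative += hours
    return selected


# Hypothetical annual downtime per asset, in hours (made-up numbers)
example = {
    "Compressor K-101": 130,
    "Pump P-201": 40,
    "Conveyor C-3": 12,
    "Fan F-12": 8,
    "Pump P-305": 5,
    "Mixer M-7": 3,
    "Valve V-88": 2,
}
print(bad_actors(example))
```

In this made-up example, two of the seven assets account for 170 of the 200 downtime hours, so those two are where a formal root cause analysis would be aimed first.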
But, you also need a process for eliminating the smaller, niggly little things.
Because in just about every plant around the world, the little issues add up to a lot.
This is where defect elimination comes in. Defect elimination aims to empower your frontline and the wider support teams to independently tackle the many small issues that cause failures.
The beauty of defect elimination—when it’s done well—is that it drives you towards a reliability culture in several ways. It removes defects and makes your plant more reliable. But it is also the vehicle to engage a large part of your organisation in reliability.
To make reliability everybody’s responsibility – just like safety.
Defect elimination is a process
Defect elimination is all about eliminating small, repetitive reactive work before it happens, and in such a way that it won’t happen again.
Remember, ‘fixed forever’ instead of ‘forever fixing’.
But for DE to work, you need to have a structured process in place.
Now, this doesn’t mean you need a really formal system that is cumbersome and painful.
Quite the opposite.
You need a system that encourages your frontline employees to find and remove defects at the source.
An effective defect elimination process will include the following:
- Multidiscipline action teams: Operators and maintenance technicians as a minimum. Engineers, inspectors, planners, schedulers, and HSE reps might also be on the team.
- Training: Teams need to understand what failure modes are, and they need to be familiar with the six failure patterns. They should also be taught the concept of mistake-proofing. Your operators should know basic condition monitoring techniques, and your maintenance techs should be trained in precision maintenance if they haven’t been already.
- Identification: Defects can be found in three main ways. First, you can use the expertise of your frontline workers to tell you what defects are out there. Second, you can analyse the failure data in your computerised maintenance management system (CMMS). Third, you can conduct more formal reliability-centered maintenance (RCM) analyses.
- Tracking: Whatever defects you find should be captured somewhere. This could be in your CMMS or a different system.
- Selection: If you want teams to take ownership of defects, let them choose which defects to work on. There can still be some rules or guidelines in place, but people who have a passion for something will be much more likely to follow it through.
- Resolution: Teams should work through a root cause analysis of some sort (e.g., five whys, fault tree) to ensure they’re getting to the bottom of the defect.
- Authority: It’s critical that the elimination of a defect does not violate any HSE policy, so you’ll want to have some rules in place around design changes. After all, we don’t want people to introduce new HSE hazards! On the other hand, teams need to feel like they have the authority to fix issues on their own. One way to do this is by setting up a budget they can tap into without a cumbersome approval process. That said, you’ll still want some limits in place for larger sums of money (this could be in the form of a management approval).
- Reporting: It’s important to capture program successes and lessons learned. This doesn’t just mean counting the number of defects eliminated. You should also look at savings, and trend the changes in all other areas of the plant over time (reactive work, planned work as a percentage of total maintenance, safety incidents, etc.).
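The second identification route above, analysing failure data in your CMMS, can start very simply. The sketch below assumes you can export work orders as records with an `asset` and a `failure_code` field. Those field names, the sample data, and the `repeat_failures` function are my own assumptions for illustration and not the layout of any particular CMMS.

```python
from collections import Counter


def repeat_failures(work_orders: list[dict], min_count: int = 3) -> list[tuple[tuple[str, str], int]]:
    """Count work orders per (asset, failure code) and keep the combinations that keep coming back."""
    counts = Counter((wo["asset"], wo["failure_code"]) for wo in work_orders)
    return [(key, n) for key, n in counts.most_common() if n >= min_count]


# Hypothetical CMMS export (made-up records)
work_orders = (
    [{"asset": "Pump P-201", "failure_code": "seal leak"}] * 3
    + [{"asset": "Fan F-12", "failure_code": "belt slip"}] * 2
    + [{"asset": "Conveyor C-3", "failure_code": "belt tracking"}] * 4
)

for (asset, code), n in repeat_failures(work_orders):
    print(f"{asset}: {code} occurred {n} times")
```

A list like this is only a starting point for the team. It shows which small issues keep coming back, and the team can then choose which ones to work on.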
If you’re still not sure where to start, try the 1% rule, as coined by Winston Ledet.
Winston and his team ran a computer model, which showed that if you turn 1 out of every 100 work orders into a defect elimination order, you can reduce your work order count by 37.5% over three years. If you keep this up for eight years, the work orders are reduced by 70%. These numbers were verified using data from a refinery in Lima, Ohio (one of the plants that inspired Don’t Just Fix It, Improve It).
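To put those two figures on the same footing, you can work out the average yearly reduction each one implies. The sketch below is my own back-of-the-envelope arithmetic and assumes a constant compounding rate, which is a simplification; it is not the computer model Winston and his team used.

```python
def implied_annual_reduction(total_reduction: float, years: float) -> float:
    """Average yearly reduction that compounds to `total_reduction` after `years` years."""
    remaining = 1.0 - total_reduction  # fraction of work orders still left at the end
    return 1.0 - remaining ** (1.0 / years)


three_year = implied_annual_reduction(0.375, 3)  # 37.5% fewer work orders after three years
eight_year = implied_annual_reduction(0.70, 8)   # 70% fewer work orders after eight years
print(f"Three-year figure implies about {three_year:.1%} per year")
print(f"Eight-year figure implies about {eight_year:.1%} per year")
```

Under that assumption, both figures work out to roughly 14% fewer work orders each year, so the three-year and eight-year numbers are consistent with each other.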
And remember, DE does not have to focus on just physical improvements.
Defects mostly come from careless work habits, so improving work practices can be hugely beneficial and give you big leverage (i.e., an improved practice applies to lots of equipment).
Defect elimination is a culture
If an organisation wants to successfully start and sustain a defect elimination program, it has to get buy-in from everyone.
And I mean everyone.
From the C-suite to the frontline maintenance technicians to procurement, and everyone in between.
Defect elimination needs to be thought of as everyone’s responsibility, much like safety and reliability.
A culture that supports defect elimination is also one that supports defect prevention.
It encourages employees at all levels and departments to be mindful of where and how defects could be introduced throughout the life cycle of a piece of equipment.
And it gives frontline workers the authority and confidence to tackle the small issues that not only matter to them but also to the company’s bottom line.
Defect elimination is far more than a fancy phrase or flavour of the month.
It’s a structured approach to getting at the root cause of all the niggly little problems that take up valuable plant resources. Most of these little problems stem from careless work habits and contribute to random failures.
Effective defect elimination requires a preoccupation with failures and a prevention mindset.
When used in concert with preventive and predictive maintenance, defect elimination can have a profound impact on your plant’s reliability and performance.
- Winston Ledet – Don’t Just Fix It, Improve It – Published by Reliabilityweb.com, 2013, https://www.amazon.com/Improve-Journey-Precision-Domain-Heroic-ebook
- Winston Ledet – The ABC’s of Failure – Getting Rid of the Noise in Your System, online article at: https://reliabilityweb.com/articles/entry/the_abcs_of_failure_getting_rid_of_the_noise_in_your_system
- Reason J – Managing the Risks of Organizational Accidents – Ashgate Publishing, 1997
- Boeing – Maintenance Error Decision Aid, Seattle: Boeing Commercial Airplane Group, 1994
- The Manufacturing Game at http://manufacturinggame.com/