By the early 1960s, a substantial body of failure data had accumulated. Worldwide, the crash rate was greater than 60 crashes per million takeoffs, and two-thirds of these crashes were due to equipment failure. To put this crash rate into perspective, the same rate applied to 1985 traffic levels would have been the equivalent of two Boeing 737s crashing somewhere in the world every day.
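As a rough cross-check of that comparison (the traffic figure of roughly 12 million commercial takeoffs per year in 1985 is an assumption used only for illustration, not a number from the source):

$$60 \times 10^{-6}\ \tfrac{\text{crashes}}{\text{takeoff}} \times 12 \times 10^{6}\ \tfrac{\text{takeoffs}}{\text{year}} \approx 720\ \tfrac{\text{crashes}}{\text{year}} \approx 2\ \tfrac{\text{crashes}}{\text{day}}$$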
The high crash rate became an issue for operators, management, government, and regulators, so action was taken in an attempt to increase equipment reliability. Consistent with the philosophy of the time, that failure was directly related to operating age, the overhaul and replacement intervals were shortened, thereby increasing the amount of maintenance performed and the associated maintenance downtime. An example of a shortened overhaul interval is depicted in Figure 1.6.
Figure 1.6 Example of a shortened overhaul interval
The new maintenance plans were put into service, and after a period of time three things were observed.
1. In very few cases, things got better.
2. In very few cases, things stayed the same.
3. But for the most part, things got worse.
The Federal Aviation Administration (FAA) and industry were frustrated by their inability to control the failure rate by changing the scheduled overhaul and replacement intervals. As a result, a task force was formed in the early 1960s. This team of pioneers was charged with the responsibility of obtaining a better understanding of the relationship between operating reliability and policy for overhaul and replacement.
They identified that two assumptions were embedded in the current maintenance philosophy.
Assumption 1: The likelihood of failure increases as operating age increases.
Assumption 2: We know when those failures will occur.
The team recognized that the second assumption had already been challenged: in an attempt to decrease the failure rate, the overhaul and replacement intervals had been shortened, as depicted in Figure 1.6, yet the failure rate increased. It was the first assumption, that the likelihood of failure increases as operating age increases, that now needed to be challenged.
As a result, an enormous amount of research was performed. Electronics, hydraulics, pneumatics, engines, and structures were analyzed. What was discovered rocked the world of maintenance at the time: there was no single failure pattern that described how Failure Modes behave. In fact, there are six failure patterns, as seen in Figure 1.7.
Failure patterns A, B, and C all have something in common. They exhibit an age-related failure phenomenon. Likewise, failure patterns D, E, and F have something in common. They exhibit randomness.
Figure 1.7 Six patterns of failure
What was especially shocking was the percentage of Failure Modes that conformed to each failure pattern, summarized in Figure 1.8.
Figure 1.8 Percentages of Failure Modes that conformed to each failure pattern
Collectively, only 11 percent of aircraft system Failure Modes behaved according to failure patterns A, B, and C, where the likelihood of failure rises with increased operating age. Failure patterns A and B have a well-defined wearout zone, so Failure Modes conforming to these patterns can be managed effectively with a fixed-interval overhaul or replacement. Failure patterns A, B, and C are typically associated with simple items that are subject to fatigue or wear, such as tires, brake pads, and aircraft structure.
However, the remaining 89 percent of aircraft system Failure Modes occur randomly; they correspond to failure patterns D, E, and F. Once past the brief initial increase in the conditional probability of failure in pattern D, and the infant mortality period in pattern F, a Failure Mode is equally likely to occur at any point in the equipment's expected service life. Therefore, for 89 percent of Failure Modes it makes no sense to perform a fixed-interval overhaul or replacement, because the conditional probability of failure is constant. These failure patterns are typically associated with complex equipment such as electronics, hydraulics, and pneumatics.
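In probability terms, a constant conditional probability of failure corresponds to an exponentially distributed time to failure T with some rate λ. The memoryless property of that distribution (a standard textbook result, not something taken from the airline study itself) makes the point directly:

$$P(T > s + t \mid T > s) = \frac{e^{-\lambda (s+t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(T > t)$$

An item that has already survived to age s has exactly the same chance of lasting another t hours as a brand-new item, so renewing it at a fixed age does not change its likelihood of failing.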
Two most notable issues
1. Only two percent of the Failure Modes conformed to failure pattern B, as shown in Figure 1.9, yet this was the failure pattern that defined the way the industry believed equipment failure behaved!
Figure 1.9 Percentage of Failure Modes that conformed to Failure Pattern B
2. Beyond the short initial increase in the conditional probability of failure in pattern D and the infant mortality period in failure pattern F, 89 percent of Failure Modes occur randomly, as depicted in Figure 1.10.
What was astonishing was that the maintenance plans in use had been derived assuming nearly all Failure Modes behaved according to failure pattern B, yet only two percent of the Failure Modes actually behaved that way. Furthermore, the research showed that most Failure Modes occur randomly, so fixed-interval overhaul or replacement made no technical sense: if an item is replaced today, it has the same chance of failing tomorrow as it does one year later.
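That last claim can be illustrated with a minimal simulation sketch, assuming exponentially distributed lifetimes (a constant conditional probability of failure, as in pattern E); all numbers below are illustrative, not drawn from the study:

```python
import random

random.seed(42)

MTBF = 1000.0            # assumed mean time between failures, hours (illustrative)
HORIZON = 1_000_000.0    # total operating hours simulated
REPLACE_EVERY = 200.0    # assumed fixed replacement interval, hours (illustrative)


def failures(replace_interval=None):
    """Count in-service failures over HORIZON operating hours.

    Lifetimes are exponential (constant conditional probability of failure).
    If replace_interval is given, the item is also renewed whenever it
    reaches that age, as a scheduled replacement would do.
    """
    t, count = 0.0, 0
    while t < HORIZON:
        life = random.expovariate(1.0 / MTBF)      # lifetime of a fresh item
        if replace_interval is not None and life > replace_interval:
            t += replace_interval                  # replaced on schedule, no failure
        else:
            t += life
            count += 1                             # failed in service
    return count


print("run to failure:         ", failures())
print("replace every 200 hours:", failures(REPLACE_EVERY))
```

Both policies produce essentially the same number of in-service failures (about HORIZON divided by MTBF); with a constant conditional probability of failure, the scheduled replacement accomplishes nothing except discarding serviceable items.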
Figure 1.10 Percentage of Failure Modes that conformed to Failure Patterns D, E, and F
Figure 1.11 Percentage of Failure Modes that conformed to Failure Pattern F
Figure 1.12 Reintroducing infant mortality
More important, not only were the vast majority of scheduled overhauls and replacements senseless, the efforts to control the failure rate with fixed-interval overhaul and replacement were counterproductive. The study showed that 68 percent of Failure Modes behaved according to failure pattern F, as depicted in Figure 1.11. Every intrusive overhaul or replacement returns an item to the start of that curve, reintroducing infant mortality (e.g., a component installed backwards, a tool left behind, poor operating procedures), as depicted in Figure 1.12.
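A minimal sketch of why this is counterproductive, using an assumed pattern F hazard (all numbers illustrative, not taken from the study): the per-hour probability of failure is elevated while the item is "young", i.e. newly installed or newly overhauled, and low thereafter, so every scheduled overhaul re-exposes the item to the infant-mortality region.

```python
import random

random.seed(7)

# Assumed, illustrative pattern F hazard: elevated failure probability for the
# first 50 hours after the item enters service (or after an overhaul), then a
# low constant level thereafter.
BURN_IN_HOURS = 50
P_FAIL_INFANT = 0.004    # per-hour failure probability while "young"
P_FAIL_MATURE = 0.0005   # per-hour failure probability afterwards

HORIZON = 1_000_000      # operating hours simulated
OVERHAUL_EVERY = 500     # assumed fixed overhaul interval, hours (illustrative)


def failures(overhaul_interval=None):
    """Count in-service failures, resetting the item's age after each
    failure and (optionally) at every scheduled overhaul."""
    age, count = 0, 0
    for _ in range(HORIZON):
        p = P_FAIL_INFANT if age < BURN_IN_HOURS else P_FAIL_MATURE
        if random.random() < p:
            count += 1
            age = 0       # failed item replaced: back into infant mortality
        elif overhaul_interval is not None and age >= overhaul_interval:
            age = 0       # scheduled overhaul: also back into infant mortality
        else:
            age += 1
    return count


print("no scheduled overhaul:  ", failures())
print("overhaul every 500 hrs: ", failures(OVERHAUL_EVERY))
```

With these assumed numbers, the policy with scheduled overhauls produces roughly half again as many in-service failures as simply leaving the item in service, because each overhaul re-exposes the item to the elevated early failure probability. This is exactly the effect the task force observed when the intervals were shortened.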