# ACCELERATED LIFE TESTING

Lloyd W. Condra, Woodinville, WA 98072

#### ABSTRACT

This is the second in a series of reports dealing with accelerated environmental testing. It describes accelerated life testing, why and how it is conducted, how it fits into the product design and development cycle, and how it is related to other types of accelerated tests. Accelerated life testing is one of the three basic types of accelerated tests: 1) accelerated life tests, 2) reliability enhancement tests, and 3) environmental stress screening. Later reports in this series describe reliability enhancement tests and environmental stress screening; the methods and equipment used to conduct them; and methods and models used to collect data and interpret results.

#### INTRODUCTION

Accelerated life testing (ALT) is one of the three basic types of accelerated reliability testing in common use today. The other two are reliability enhancement testing (RET) and environmental stress screening (ESS)\*. ALT is used to determine the lifetime capability of components, materials, and processes used to manufacture products. Its purpose is not to expose defects, but to identify and quantify the failures and failure mechanisms which cause products to wear out at the end of their useful life. The relationship of ALT to the bathtub reliability curve, and to the product design and production cycle, is shown in the first report in this series [1].

Accelerated life tests must be conducted for times long enough to represent the product lifetime, which in some cases can be quite long. To shorten the test time, the stresses (either functional or environmental) are accelerated. Accelerated life testing is the most costly and the most time-consuming of the three basic accelerated tests. It must therefore be planned and executed carefully, and the results must be applied across as wide a range of product and process applications as possible.
This report discusses accelerated life tests, how to conduct them, and how to interpret and apply the results. Included in this report are discussions of the types of failure mechanisms which can be evaluated with ALT, some of the acceleration models commonly used with ALT, the information required for successful ALT, and some examples of the use of ALT.

<sup>\*</sup> These tests are known variously by other names, but these three designations are used in this report series. For more complete definitions, the reader is referred to the first report in this series, HETR-94-001.

#### FOUR TYPES OF FAILURE MECHANISMS

Accelerated life testing is not applicable to all types of failure mechanisms. Dasgupta and Pecht [2] classified failure mechanisms into the four categories described below.

1. Stress-strength. Stress-strength failures result when the stress placed on an item (either operating or environmental) exceeds the strength of the item. As long as the applied stresses are below its strength, there is no damage to the item, and it remains as good as it was when it was new. Examples of stress-strength failures are the yielding of a shaft due to mechanical forces, or electrical overstress of an electronic component. This type of failure usually results from random application of high stresses, or from poor design.
2. Damage-endurance. Some stresses applied to an item in service are below the yield strength of the item, but still cause a small amount of irreversible damage. Repeated or continuous application of these stresses causes the damage to accumulate to the point where failure results. Examples of damage-endurance failures are mechanical fatigue of metals, time-dependent dielectric breakdown, and formation of weak material phases due to solid state diffusion.
3. Challenge-response. Some flaws can exist in a product for a long time without being noticed. It is only when the product is used in a certain way (the flaw is "challenged") that they become evident. An example of a challenge-response failure is a software bug. This type of failure is quite frustrating.
4. Tolerance-requirement. The above three failure types manifest themselves suddenly. Most of the time, the product shows no symptoms before it fails completely. By contrast, tolerance-requirement failures are characterized by a gradual degradation of performance, and the definition of failure may depend on the user's tolerance for the degree of degradation. (Anyone who has owned an old car is personally familiar with tolerance-requirement failures.)

# THE FUNDAMENTAL PRINCIPLE OF ACCELERATED LIFE TESTING

Accelerated life testing is based on the proposition that the device under test will exhibit the same behavior in a short time at a high stress that it would exhibit in a longer time at a lower stress. This is illustrated by Figure 1.

Most failures occur according to a distribution in time at a given stress level. A variety of distributions may be used to describe failures, but some common ones are the Weibull, lognormal, Rayleigh, and extreme value. For a comprehensive description of statistical distributions, the reader is referred to any of a number of texts on basic reliability engineering and accelerated testing. Three of the best are those by O'Connor [3], Kececioglu [4], and Nelson [5].

To obtain results of the type shown in Figure 1, it is necessary to monitor the devices under test throughout the test procedure. While this is undoubtedly the best way to conduct an accelerated life test, it is often impractical to monitor them continuously. In many cases, the items are removed from the test setup at periodic intervals for functional testing, and the number of failures is recorded. This type of testing is called interval testing, and an obvious limitation is that the precise time of failure cannot be determined; it is known only to lie between two successive inspections.

Figure 1. Plot of time-to-failure vs. stress.
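
The bookkeeping behind interval testing, as described above, can be illustrated with a short Python sketch. It is not part of the original report: the inspection schedule, the failure counts, the sample size of ten, and all names in the code are hypothetical and serve only to show how each failure is bracketed between two inspections.

```python
from dataclasses import dataclass


@dataclass
class IntervalFailure:
    """A failure known only to lie between two inspections (hours)."""
    lower_h: float  # last inspection at which the unit still worked
    upper_h: float  # first inspection at which the unit was found failed


def bracket_failures(inspection_hours, new_failures):
    """Turn per-inspection failure counts into (lower, upper) brackets."""
    if len(inspection_hours) != len(new_failures):
        raise ValueError("one failure count is needed per inspection")
    brackets = []
    previous = 0.0  # the test starts at time zero
    for t, count in zip(inspection_hours, new_failures):
        brackets.extend(IntervalFailure(previous, t) for _ in range(count))
        previous = t
    return brackets


def cumulative_fraction_failed(new_failures, sample_size):
    """Cumulative fraction of the sample found failed at each inspection."""
    total = 0
    fractions = []
    for count in new_failures:
        total += count
        fractions.append(total / sample_size)
    return fractions


# Hypothetical data: 10 units inspected every 100 hours.
inspection_hours = [100.0, 200.0, 300.0, 400.0]
new_failures = [0, 1, 3, 2]

brackets = bracket_failures(inspection_hours, new_failures)
fractions = cumulative_fraction_failed(new_failures, sample_size=10)

for b in brackets:
    print(f"failure between {b.lower_h:.0f} h and {b.upper_h:.0f} h")
print("cumulative fraction failed:", fractions)
```

The cumulative fractions are the quantities that would be plotted on the chosen distribution (Weibull, lognormal, and so on); shorter inspection intervals narrow the brackets at the cost of more handling of the samples.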
ALT works best when applied to damage-endurance failures and to some tolerance-requirement failures. It is generally not applicable to stress-strength failures or challenge-response failures. Because the purpose of ALT is to evaluate lifetime in a given application, it must employ accelerated levels of the actual operating and functional stresses which the product will see in service. Some examples of such stresses are shown in Tables 1 and 2.

ALT can be used as a qualification test for materials, components, manufacturing processes, subassemblies, and some end item products. Since these items may be used in a variety of applications, however, it is generally not correct to consider an item "qualified" for a particular application until the results of ALT have been extrapolated to that application. Fortunately, it is possible to set up and conduct a single ALT which provides data for a wide range of applications.

# THREE BASIC ACCELERATION MODELS

The purpose of accelerated life testing is to use the results of tests conducted for short times at high stress levels to predict the lifetimes of products at lower stress levels. This is done with mathematical acceleration models. The choice of an acceleration model is made on the basis of the expected failure mechanism, and on the knowledge of the stress, or combination of stresses, which will cause the failure (or failures) to occur. If more than one failure mechanism or stress condition is likely, then all of them must be evaluated.

Table 1. Typical environmental conditions for automotive products. From reference [6].

| Location | Temperature range, °C | Rel. hum. (% at 40 °C) | Salt spray | Vibration and shock |
|---|---|---|---|---|
| Under hood: | | | | |
| Above exhaust | -40 to +650 | 80 | Yes | 50 g to 1 kHz |
| Intake manifold | -40 to +125 | 95 | Yes | Over 100 g |
| Fireproof wall | -40 to +140 | 80 | Yes | 1 g to 600 Hz |
| Frontal zone | -40 to +85 | 98 | Yes | 1 g to 600 Hz |
| On chassis: | | | | |
| Inside a wing | -40 to +85 | 98 | Yes | 2 g to 2 kHz |
| Near exhaust | -40 to +125 | 98 | Yes | 2 g to 2 kHz |
| Extreme cond. | -40 to +175 | 98 | Yes | Over 100 g |
| Car interior: | | | | |
| Dashboard | -40 to +120 | 98 | No | 1 g to 20 kHz |
| Rear window | -40 to +100 | 98 | No | 1 g to 20 kHz |

Table 2. Worst-case environments for electronic products. From reference [7].

| Use category | $T_{min}$, °C | $T_{max}$, °C | ΔT, °C | Dwell, hrs. | Cyc./year | Yrs. of service |
|---|---|---|---|---|---|---|
| Consumer | 0 | +60 | 35 | 12 | 365 | 1-3 |
| Computers | +15 | +60 | 20 | 2 | 1460 | ≈5 |
| Telecommunications | -40 | +85 | 35 | 12 | 365 | 7-20 |
| Commercial aircraft | -55 | +95 | 20 | 2 | 3000 | ≈10 |
| Industrial & automotive, passenger compartment | -55 | +65 | 20 | 12 | 185 | ≈10 |
| | | | & 40 | 12 | 100 | |
| | | | & 60 | 12 | 60 | |
| | | | & 80 | 12 | 20 | |
| Military ground & ship | -55 | +95 | 40 | 12 | 100 | ≈5 |
| | | | & 60 | 12 | 265 | |
| Space: low earth | -40 | +85 | 35 | 1 | 8760 | 5-20 |
| Space: geosynchronous | -40 | +85 | 35 | 12 | 365 | |
| Military avionics | -55 | +95 | 40 | 2 | 500 | ≈5 |
| | -55 | +95 | 60 | 2 | 500 | |
| | -55 | +95 | 80 | 2 | 500 | |
| | | | & 20 | 1 | 1000 | |
| Automotive, under hood | -55 | +125 | 60 | 1 | 1000 | ≈5 |
| | | | & 100 | 1 | 3000 | |
| | | | & 140 | 2 | 40 | |

There is an almost infinite number of acceleration models available, but many of them are variations of three basic models [8]: 1) the Arrhenius equation, 2) the inverse power law, and 3) the Eyring model.

The Arrhenius Equation. The Arrhenius equation [9] is used to describe thermally-activated failure mechanisms, such as intermetallic diffusion, chemical reactions, and some failure mechanisms in microelectronic circuits. It can be expressed as a rate equation:

$$r = r_0 e^{-\frac{E_a}{kT}} \tag{1}$$

where $r$ is the reaction rate, $r_0$ is a constant, $E_a$ is the activation energy, usually expressed in electron-volts (1 eV ≈ 23,000 cal/mole), $k$ is Boltzmann's constant ($8.617 \times 10^{-5}$ eV/K), and $T$ is the temperature in K.

The reaction rate increases exponentially with temperature, and it is thus possible to obtain extremely high acceleration factors for failure mechanisms governed by the Arrhenius equation. Consider the example of an accelerated life test when the known activation energy of the failure mechanism is 0.8 eV.
If the test temperature, $T_t$, is 125 °C (398 K), the acceleration factor for a use temperature, $T_u$, of 30 °C (303 K) is

$$AF = \exp\left[\frac{E_a}{k} \left(\frac{1}{T_u} - \frac{1}{T_t}\right)\right] = \exp\left[\frac{0.8}{8.617 \times 10^{-5}} \left(\frac{1}{303} - \frac{1}{398}\right)\right] \approx 1{,}500. \tag{2}$$

If the median life of the test samples is 200 hours, the predicted median life of the product in service is $200 \times 1{,}500 = 300{,}000$ hours, or approximately 34 years.

If the activation energy of the observed failure mechanism is not known, tests may be conducted at several temperatures, and the logarithms of the times to failure plotted against 1/T. The slope of the plotted line is proportional to the activation energy (it equals $E_a/k$ when natural logarithms are used), and the predicted service life may be extrapolated graphically or calculated as shown above. If the plotted line is not straight, the failure is not governed by a single thermally activated mechanism, and a different acceleration model must be used. (Special Arrhenius plotting paper is available from Technology Associates, Portola Valley, CA.)

The Inverse Power Law. The inverse power law applies to failure mechanisms in which the time to failure is inversely proportional to a power of the strain. Its general form is

$$\tau = \frac{A}{S^n} \tag{3}$$

where $\tau$ is the time required for an event (such as failure) to occur, $A$ is a constant, $S$ is a strain term, and $n$ is an exponent characteristic of the product or material being tested. The inverse power law is used to model failure mechanisms caused by stresses such as mechanical fatigue, mechanical vibration, fatigue due to thermal coefficient of expansion mismatches, and certain electrical power and voltage stresses.

Under certain conditions, the strain is proportional to the applied stress, and a common method of using the inverse power law is to plot the log of stress vs. the log of time to failure, as shown in Figure 2. This is the familiar S-N curve, with the slope of the line determined by the exponent $n$.

Figure 2. The S-N curve for inverse power law failures (log stress vs. log cycles to failure).

If the rate of cycling of the applied stress is low, as in temperature cycling, the slope of the S-N curve is generally quite steep. If the stress cycles are rapid, such as those of mechanical fatigue or vibration, the slope is generally flatter. For many systems, there exists a stress level, called the endurance limit, below which no damage occurs, and the life of the product is essentially infinite.

Application of the inverse power law can be quite complicated, especially for mechanical vibration and fatigue tests. The classical work in this area was done by Coffin and Manson [10-12]. Applications to temperature cycling for electronic equipment are relatively straightforward [13,14], but applications to vibration and high cycle mechanical fatigue are changing rapidly. Traditional applications to vibration have been based on single-axis mechanical shaker equipment, and the acceleration models were simple forms of the inverse power law such as [15]

$$
\begin{aligned}
t_1 G_1^B &= t_2 G_2^B \\
n_1 Z_1^B &= n_2 Z_2^B \\
n_1 G_1^B &= n_2 G_2^B
\end{aligned} \tag{4}
$$

where $n$ is the number of cycles, $t$ is the total elapsed test time, $G$ is the acceleration level, $Z$ is the displacement amplitude, and $B$ is the slope of the S-N curve. Lambert [16] published a series of acceleration models for single-degree-of-freedom fatigue testing.

Recent advances in vibration testing equipment and data analysis methods have resulted in more accurate reliability predictions from testing. Instead of the traditional autospectral density (ASD), measured in $G_{rms}$, other measures such as shock response spectrum (SRS) and peak probability density function (PPDF) are being used to quantify the accumulated damage in service and in test [17,18].
By using the PPDF in combination with Miner's rule [19], it is possible to calculate an accumulated fatigue damage factor (AFDF) which considers all peaks of the fatigue spectrum, and not just an rms value. The AFDF is expressed as

$$AFDF = \sum_{i=1}^{x} n_i \sigma_i^B = n_1 \sigma_1^B + \dots + n_x \sigma_x^B \tag{5}$$

where $n_i$ is the number of stress events at the $i$-th stress level, $\sigma_i$ is that stress level, $x$ is the number of stress levels, and $B$ is the inverse power law exponent.

Instead of the traditional uniaxial shaker equipment, new equipment based on random and repetitive shock to produce vibration in six degrees of freedom is now in use. These test chambers are also capable of combining shock and vibration with other environmental stresses, such as rapid temperature cycling and humidity.

The Eyring Model. The Eyring model is used to model the effects of two different stresses, one of which is temperature. Its general form is

$$\tau = \frac{A}{S^n} B e^{\frac{E_a}{kT}}. \tag{6}$$

It may be noted that this is just the product of the Arrhenius model and the inverse power law. The most common form of the Eyring model in electronics is Peck's equation for the corrosion of microelectronic circuit metallization [20]. The acceleration factor is

$$AF = \left(\frac{RH_t}{RH_u}\right)^B \exp\left[\frac{E_a}{k}\left(\frac{1}{T_u} - \frac{1}{T_t}\right)\right] \tag{7}$$

where $RH$ is the relative humidity, $T$ is the temperature in K, $k$ is Boltzmann's constant, $E_a$ is the activation energy, equal to 0.9 eV, and $B$ is the inverse power law exponent, equal to 3. The subscripts $t$ and $u$ denote the test and use conditions. Because this model is the product of two other models, it can produce enormous acceleration factors. For example, if a test is conducted at 85 °C and 85% RH (a standard test condition), and the results are used to estimate times to failure at a use condition of 40 °C and 60% RH, the acceleration factor is approximately 182.
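
The acceleration factors of Eqs. (2) and (7) are easy to evaluate numerically. The following is a minimal Python sketch, not part of the original report; the function and variable names are illustrative assumptions, and temperatures are converted to kelvin by adding 273, as in the text. Depending on how the constants and temperatures are rounded, the values it prints may differ somewhat from the figures quoted above.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann's constant in eV/K, as quoted in the text


def arrhenius_af(ea_ev: float, t_use_k: float, t_test_k: float) -> float:
    """Acceleration factor of Eq. (2): exp[(Ea/k) * (1/T_use - 1/T_test)]."""
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use_k - 1.0 / t_test_k))


def peck_af(rh_test: float, rh_use: float, t_use_k: float, t_test_k: float,
            ea_ev: float = 0.9, b: float = 3.0) -> float:
    """Acceleration factor of Eq. (7): humidity power-law term times Arrhenius term."""
    humidity_term = (rh_test / rh_use) ** b
    return humidity_term * arrhenius_af(ea_ev, t_use_k, t_test_k)


# Arrhenius example: Ea = 0.8 eV, test at 398 K (125 C), use at 303 K (30 C).
af_thermal = arrhenius_af(0.8, t_use_k=303.0, t_test_k=398.0)

# Peck example: test at 85 C / 85% RH, use at 40 C / 60% RH.
# Temperatures are converted with +273, the rounding used in the text.
af_humidity = peck_af(85.0, 60.0, t_use_k=40.0 + 273.0, t_test_k=85.0 + 273.0)

print(f"Arrhenius AF: {af_thermal:.0f}")
print(f"Peck AF:      {af_humidity:.0f}")
```

Keeping the Arrhenius term in its own function mirrors the structure of Eq. (7), in which the humidity term simply multiplies the thermal term. Both factors are very sensitive to the assumed activation energy, so the value of $E_a$ deserves at least as much care as the arithmetic.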
# AN ACCELERATED LIFE TEST EXAMPLE

Of the many excellent examples of accelerated life tests in the literature, the one chosen to illustrate the method here is that of estimating the time-to-breakdown of an electrical insulating fluid, with the applied stress being the operating voltage [21]. It was conducted to estimate the lifetime capability of the fluid at an operating voltage of 20 kV. Groups of ten samples were tested at voltages of 26, 28, 30, 32, 34, 36, and 38 kV, and Weibull distribution plots of the times-to-breakdown were prepared. The distributions were normalized to the same Weibull slope, and the results are shown in Figure 3.

The times-to-failure for 5% cumulative failure (a reliability of 0.95) are plotted vs. the voltage in Figure 4. It is noted in Figure 4 that the time to failure at an operating voltage of 20 kV is approximately 1600 minutes. Thus the acceleration factor between a test condition of 36 kV and a use condition of 20 kV is about 16,000. The plot in Figure 4 is a log-log plot, indicating that the acceleration model is the inverse power law, with an exponent $B$ of approximately 16.4 (for which $(36/20)^{16.4} \approx 15{,}000$, consistent with the acceleration factor read from the plot).

In the insulating fluid example, the time to failure was defined as the time required for 5% of the samples to break down. This is the point at which the reliability of the fluid is 0.95. This choice, of course, was entirely up to the experimenters. They could have defined failure as the point at which the reliability was 0.99, or 0.90, etc. Plotting the times-to-failure on the appropriate distribution provides the opportunity to understand the results of accelerated life testing on a statistical level.

#### **SUMMARY**

Accelerated life testing provides valuable data about the expected wearout mechanisms of a product. This is critical in today's marketplace, since more and more customers are placing useful life requirements on products they purchase. The ability to estimate useful life is only one of the benefits of ALT.
It also allows those who design and manufacture products to understand them better, so that the critical components, materials, and processes can be identified, improved, and controlled; and it produces data which give both the producer and the user confidence that the product will serve its intended purpose.

In this report, only the basics of accelerated life testing have been covered. Its application to specific products and stresses can employ acceleration models quite different from the basic ones presented here. Care must be taken to avoid misinterpreting results. Obtaining and interpreting results can be quite challenging, but it is not beyond the ability of those who truly wish to understand and improve their products.

Because of the cost and time required for accelerated life testing, it must be understood and planned carefully. Key components of the product, from a reliability point of view, must be identified; potential failure mechanisms must be known; and the stress environment of the product must be understood. Specific acceleration models must be available, or obtainable, for each failure mechanism; and the results must be interpreted properly. Many manufacturers do not have this level of understanding, or are not willing to obtain it. Those who do have been well rewarded, in the form of more reliable products and a stronger competitive market position.

# REFERENCES

- [1] Condra, L.W., Accelerated Testing for Product Reliability Assurance, Hanse Environmental Test Report HETR 94-001, Hanse Environmental, Inc., Hopkins, MI (May, 1994).
- [2] Dasgupta, A., and Pecht, M., "Material Failure Mechanisms and Damage Models," *IEEE Transactions on Reliability*, vol. 40, no. 5 (1991). Pp. 531-536.
- [3] O'Connor, P.D.T., Practical Reliability Engineering, John Wiley & Sons, Chichester, UK, third edition (1991).
- [4] Kececioglu, D.B., Reliability Engineering Handbook, Prentice Hall, Englewood Cliffs, NJ (1991).
- [5] Nelson, W., Accelerated Testing, John Wiley & Sons, New York (1990).
- [6] Priore, M.G., and Farrell, J.P., Plastic Microcircuit Packages: A Technology Review, Reliability Analysis Center, 210 Mill St., Rome, NY (1992).
- [7] IPC-SM-785, Guidelines for Accelerated Reliability Testing of Surface Mounted Solder Attachments, Institute for Interconnecting and Packaging Electronic Circuits (1992).
- [8] Kececioglu, D., and Jacks, J., "The Eyring, Arrhenius, and Inverse Power Law and Combination Models in Accelerated Testing," *Reliability Engineering*, vol. 8 (1984). Pp. 1-9.
- [9] Arrhenius, S., *Z. Physik. Chem.*, vol. 4 (1889).
- [10] Coffin, L.F., Jr., "A Study of the Effects of Cyclic Thermal Stresses on a Ductile Metal," *Transactions of the ASME*, vol. 76 (1954). Pp. 931-950.
- [11] Manson, S.S., "Fatigue: A Complex Subject-Some Simple Approximations," *Experimental Mechanics*, vol. 5, no. 7 (1965). Pp. 193-226.
- [12] Coffin, L.F., Jr., "The Effect of Frequency on the Cyclic Strain and Low Cycle Fatigue Behavior of Cast Udimet 500 at Elevated Temperature," *Metallurgical Transactions*, vol. 2 (1971). Pp. 3105-3113.
- [13] Dunn, C.F., and McPherson, J.W., "Temperature Cycling Acceleration Factors for Aluminum Metallization Failure in VLSI Applications," *Proceedings of the 28th International Reliability Physics Symposium*, IEEE (1990). Pp. 252-258.
- [14] Engelmaier, W., "Surface Mount Solder Joint Long Term Reliability: Design, Testing, Prediction," *Soldering and Surface Mount Technology*, no. 1 (1989). Pp. 14-22.
- [15] Steinberg, D.S., Vibration Analysis for Electronic Equipment, John Wiley & Sons, New York (1988).
- [16] Lambert, R.G., "Fatigue Life Prediction for Various Random Loading Stress Peak Distributions," *Shock and Vibration Bulletin*, no. 52, part 4 (1982). Pp. 1-10.
- [17] Smithson, S.A., "Shock Response Spectrum Analysis for ESS and STRIFE/HALT Measurement," *Proceedings of the Institute of Environmental Sciences* (1991).
- [18] Henderson, G.R., "Dynamic Characteristics of Repetitive Shock Machines," *Proceedings of the Institute of Environmental Sciences* (1993).
- [19] Miner, M.A., "Cumulative Damage in Fatigue," *Transactions of the American Society of Mechanical Engineers*, vol. 67 (September, 1945).
- [20] Peck, D.S., and Thorpe, W.R., "Highly Accelerated Stress Test History, Some Problems and Solutions," *Tutorial Notes*, 28th International Reliability Physics Symposium, IEEE (1990). Pp. 4.1-4.27.
- [21] Kececioglu, D., and Li, D., "Accelerated Testing of Mechanical Equipment," *Proceedings of the SAE Aerospace Technology Conference*, Paper no. 861667 (October, 1986).

# RELIABILITY ENHANCEMENT TESTING

Lloyd W. Condra, Woodinville, WA 98072

#### ABSTRACT

This is the third in a series of reports dealing with accelerated environmental testing. It describes reliability enhancement testing; why and how it is conducted; how it fits into the product design and development cycle; and how it is related to other types of accelerated tests. Reliability enhancement testing is one of the three basic types of accelerated tests: 1) accelerated life tests, 2) reliability enhancement tests, and 3) environmental stress screening. Other reports in this series describe accelerated life tests and environmental stress screening; the methods and equipment used to conduct them; and methods and models used to collect data and interpret results.

# INTRODUCTION

Reliability enhancement testing (RET) is one of the three basic types of accelerated reliability testing in common use today. The other two are accelerated life testing (ALT) and environmental stress screening (ESS)\*. The uses and purposes of reliability enhancement testing are not widely agreed upon; in fact, there is not even general agreement on what to call it. It is variously known as step-stress, stress-life (STRIFE) [1], highly accelerated life testing (HALT) [2], and a host of other acronyms.
Just about every organization involved in reliability enhancement testing has its own acronym. The term reliability enhancement testing is suggested here as a generic designation. (The term was first used in this context by the Boeing Company.)

The purpose of RET is to evaluate the reliability of a product design by the systematic application of environmental and operating stresses at progressively higher levels to precipitate failures and expose weaknesses in the design. It should thus be conducted early enough in the product design and development cycle to allow design changes to be implemented. An earlier report in this series describes the relationship of RET to the product design and development schedule, and to the bathtub reliability curve [3].

To be effective, RET must be conducted on samples which represent the design, components, materials, and manufacturing processes to be used in production. Since its purpose is to evaluate design reliability, failures due to manufacturing defects are considered irrelevant.

<sup>\*</sup> These tests are known variously by other names, but these three designations are used in this report series. For more complete definitions, the reader is referred to the first report in this series, HETR-94-001.

One of the main benefits of RET is its ability to provide relevant reliability data in short times. Because progressively higher levels of stress are applied, failures occur orders of magnitude sooner than they would in service, or at constant accelerated test stress levels.

RET is similar to, but distinct from, design verification testing and qualification testing. A major distinction is that RET is conducted to cause failures, while failures are not desired with the other two types of test. Design verification testing (DVT) often is conducted in parallel with RET, but its purpose is to verify that the product will operate successfully over the specified range of environmental and operating conditions.
DVT is usually conducted within the specification range, and yields little or no information about the long term reliability of the design.

The definition of qualification testing depends on the industry in which it is used. Formal qualification testing for military products is conducted at the end of the design phase, before production can begin. Its purpose is to prove to the manufacturer and the customer that the product meets the design goals. It is usually conducted according to a predetermined test plan, including both operating and environmental stresses. When the product reaches the qualification test stage, there is little room in the schedule to make significant changes, and the goal of qualification testing is to pass with no failures. In contrast, the goal of RET is to cause failures, in order to make improvements. Because RET is a destructive test, the sample sizes are as small as possible.

# TYPES OF STRESS LIMITS

Many different terms have been used to describe the various stress limits of a product. Figure 1 shows some of these, and they are defined as follows:

Specification limits: The limits specified by the user or the manufacturer of the product, within which the product is expected to operate.

Design limits: The limits within which the product is designed to operate without failure. The difference between the specification limits and the design limits is called the design margin.

Operating limits: The limits within which the product is operated during accelerated testing to quantify the effects of relevant stresses on reliability. Usually, accelerated life tests are conducted within these limits.

Destruct limits: The stress limits within which the product can operate without failing irreversibly. Destruct limits can be discovered by reliability enhancement tests.

ESS limits: ESS testing is done as a screen before shipping final product. ESS limits can be defined by reliability enhancement testing, and are usually within the operating limits. Setting ESS limits is not always straightforward, and it can be a challenging task.

RET limits: In a sense, there are no RET limits, since they are determined as the test progresses. Obviously, however, they do not exceed the destruct limits.

Figure 1. Definitions of the various stress limits of a product.

# STEP STRESS TESTING

Perhaps the most widely-used form of reliability enhancement test is step stress testing [4-12]. Figure 2 shows a schematic diagram of a generalized form of a step stress test. Three general stresses, labeled $S_1$, $S_2$, and $S_3$, are shown orthogonal to each other. These stresses might be environmental, such as temperature or vibration, or operating, such as voltage. Although all stresses which may produce failures must be considered, only three are shown here.

Figure 2. Diagram showing the general approach to step stress testing.

The smaller box in Figure 2 represents the specification limits, and the outer corner of the box represents the most severe stress state within those limits. The large box represents a combination of stresses beyond the specification limits. The vector T, drawn from the outer corner of the specification limits box to the outer corner of the test limits box, describes the path through which the stress combinations are increased in a stepwise fashion. Figure 3 shows the vector T in more detail.

Figure 3. Detail of the vector T.

In Figure 3, the units of stress are increments of the combinations selected for testing. Usually, the steps are of equal time duration, which may be as short as a few minutes and is rarely longer than 24 hours. The different types of stresses, $S_1$, $S_2$, and $S_3$, may be applied simultaneously or sequentially, or they may be applied to separate samples.

The first level, or step, is usually at or below the specification limit. After its completion, the failed items are removed and analyzed.
At this point, design errors or other defects may be analyzed and corrected before proceeding further. This process is repeated at successively higher stress levels until one of three things occurs: 1) all items have failed; 2) stress levels have been reached which are well beyond those required to demonstrate robust product design; or 3) irrelevant failures begin to occur as new failure mechanisms are introduced at higher stress levels.

# **RET EXAMPLES**

Fornall [4] reported the results of step stress testing a military pulse synthesizer assembly, with the relevant stresses being temperature, vibration, regulated DC input voltage, and output transistor drive voltage. The nominal values, specification limits, and design limits for this product are shown in Table 1. Using these values, a step stress program with ten steps between the specification limits and the design limits was developed. Results of the tests are shown in Table 2. Eight failures were observed, all of which were similar to those which had previously been observed in service. It was estimated by the investigators that these failures, which all occurred within 100 minutes of testing, would have been spread over five years in service. Using the data produced in this test, along with a minimal amount of field experience, it was possible to estimate the reliability of the product in service. The design margin of the product was also determined accurately.

Table 1. Stress levels of the military pulse synthesizer assembly, from reference [4].

| Stress | Nominal | Specification limit | Design limit |
|------------------------------------------|----------------|-------------------|-------------------|
| Temperature, °C | 45.0 | 52.0 | 125.0 |
| Random vibration, G<sub>rms</sub> | 1.5g at 200 Hz | 3.0g at 80-350 Hz | 9.0g at 80-350 Hz |
| DC input voltage, V<sub>dc</sub> | 10.0 | 13.0 | 18.0 |
| Output transistor drive, V<sub>dc</sub> | 15.0 | 20.0 | 50.0 |

Table 2. Results of the step stress test on the military pulse synthesizer assembly [4].

| Failure no. | Step no. | Failure mode | Failure mechanism | Stress causing failure |
|-------------|----------|--------------------------|----------------------|------------------------|
| 1 | 5 | Inoperable | Pin connection | Vibration |
| 2 | 5 | Inoperable | Pin connection | Vibration |
| 3 | 7 | High voltage/overcurrent | Abnormal osc. output | Temperature, voltage |
| 4 | 7 | Degraded output | Abnormal osc. output | Temperature, voltage |
| 5 | 8 | High voltage/overcurrent | Abnormal osc. output | Temperature, voltage |
| 6 | 8 | High voltage/overcurrent | Abnormal osc. output | Temperature, voltage |
| 7 | 9 | Degraded output | Abnormal osc. output | Temperature, voltage |
| 8 | 10 | Degraded output | Abnormal osc. output | Temperature, voltage |

Another example is the use of reliability enhancement testing to evaluate the design of printed circuit card assemblies for aerospace electronic products [13]. The two relevant stresses were considered to be temperature and vibration. They were evaluated separately and in combination. The three test sequences were the temperature cycling sequence, the vibration sequence, and the combined temperature cycling and vibration sequence.

The temperature cycling sequence consisted of four steps:

- Step 1: -15°C to +70°C
- Step 2: -30°C to +85°C
- Step 3: -40°C to +100°C
- Step 4: -55°C to +115°C

The ramp rate between temperature extremes was between 15 and 30°C per minute, with the dwell times at the extremes varying from 1 to 15 minutes. During the dwells, the power to the unit under test was cycled on and off.
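
The ramp rates and dwell times quoted for the temperature cycling sequence bound the duration of a single cycle at each step. The following is a minimal sketch of that arithmetic, not a description of the test in reference [13]: it assumes that one cycle consists of a ramp up, a dwell at the upper extreme, a ramp down, and a dwell at the lower extreme, which the source does not state. The names `cycle_minutes` and `cycle_bounds` are hypothetical.

```python
# Temperature limits (deg C) of the four steps, as listed in the text.
STEPS = [(-15, 70), (-30, 85), (-40, 100), (-55, 115)]

RAMP_RATES = (15.0, 30.0)  # deg C per minute: slowest and fastest reported
DWELLS = (1.0, 15.0)       # minutes at each extreme: shortest and longest reported


def cycle_minutes(low, high, ramp_rate, dwell):
    """Duration of one cycle, ASSUMING ramp up + dwell + ramp down + dwell."""
    ramp_time = (high - low) / ramp_rate
    return 2 * ramp_time + 2 * dwell


def cycle_bounds(low, high):
    """Shortest and longest single-cycle duration (minutes) for one step."""
    shortest = cycle_minutes(low, high, max(RAMP_RATES), min(DWELLS))
    longest = cycle_minutes(low, high, min(RAMP_RATES), max(DWELLS))
    return shortest, longest


for number, (low, high) in enumerate(STEPS, start=1):
    shortest, longest = cycle_bounds(low, high)
    print(f"Step {number}: {low:+d} to {high:+d} C, range {high - low} C, "
          f"one cycle takes {shortest:.1f} to {longest:.1f} min")
```

Under this assumed cycle shape, the dwell time dominates the longest cycles and the ramp time dominates the shortest ones, so the choice of dwell has more influence on total test time than the choice of ramp rate.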
The vibration sequence consisted of applying random vibration simultaneously in three axes and three rotations (six degrees of freedom). The power spectral density was distributed over a range of 5 to 5000 Hz. The steps were at 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and 26 $G_{rms}$. The time durations of the steps varied from 1 to 15 minutes.

The combined temperature cycling and vibration sequence consisted of temperature cycling as described above, with the temperature limits at -60°C and +150°C throughout the sequence. While the samples were being temperature cycled, they were simultaneously exposed to vibration at levels which were increased incrementally from 2 to 26 $G_{rms}$. The vibration conditions were identical to the process used for the vibration sequence.

All failures for the avionics RET are listed in Table 3. No failures were observed during the temperature cycling sequence. In the vibration sequence, several component lead failures were observed at the higher vibration levels. Failure mechanisms in the combined sequence were identical to those observed in the vibration sequence, but in general they occurred at lower levels of vibration in the combined sequence.

Table 3. Failures for the avionics circuit card RET [13].

| Failure no. | Device type | Environment | Cause of failure |
|-------------|----------------|-----------------------|---------------------|
| 1 | Capacitor | 28 Grms | Broken leads |
| 2 | Diode | 28 Grms | Broken leads |
| 3 | Diode | 28 Grms | Broken leads |
| 4 | Diode | 28 Grms | Broken leads |
| 5 | Diode | 18 Grms | Broken leads |
| 6 | Diode | 16 Grms | Broken leads |
| 7 | Diode | 18 Grms | Broken leads |
| 8 | Diode | 28 Grms | Broken leads |
| 9 | Diode | -60 to 150°C, 14 Grms | Broken leads |
| 10 | Diode | 16 Grms | Broken leads |
| 11 | Diode | 16 Grms | Broken leads |
| 12 | Diode | 16 Grms | Broken leads |
| 13 | Diode | 16 Grms | Broken leads |
| 14 | Capacitor | 28 Grms | Broken leads |
| 15 | Capacitor | 12 Grms | Bent leads |
| 16 | Capacitor | -60 to 150°C, 14 Grms | Broken leads |
| 17 | Capacitor | 18 Grms | Broken leads |
| 18 | Capacitor | 24 Grms | Broken leads |
| 19 | Capacitor | 12 Grms | Broken leads |
| 20 | Capacitor | 28 Grms | Broken leads |
| 21 | Microprocessor | -60 to 150°C, 8 Grms | IC out of socket |
| 22 | ? | 22 Grms | Cracked solder pins |
| 23 | Transistor | 16 Grms | Broken leads |

Using the results of this test, the investigators were able to identify several design weaknesses in the circuit cards. As a result, they made design changes which improved the reliability of the cards. The investigators found no effect of ramp rate in precipitating failures; however, this effect has been observed elsewhere [14].

The two most common stresses in RET are temperature cycling and vibration. This study clearly shows the value of combining stresses, where possible, in reliability enhancement testing. These benefits have also been reported elsewhere [15].

Other examples of RET are described in references [6], [16], and [17], and in many internal company reports by users of RET.
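
For the pulse synthesizer example, a ten-step program between the specification limits and the design limits can be laid out by simple interpolation. The sketch below is illustrative only: it assumes equal increments, with step 1 at the specification limit and step 10 at the design limit, neither of which is stated in reference [4] as summarized here. The limit values come from the table of stress levels for that product; the dictionary keys and the function name `step_levels` are hypothetical.

```python
# Specification and design limits for the pulse synthesizer assembly.
# Vibration is reduced to its G_rms level; the frequency band is not modeled.
LIMITS = {
    "temperature_C": (52.0, 125.0),
    "vibration_Grms": (3.0, 9.0),
    "dc_input_V": (13.0, 18.0),
    "output_drive_V": (20.0, 50.0),
}
N_STEPS = 10


def step_levels(spec, design, n_steps=N_STEPS):
    """Stress level at each step, ASSUMING equal increments.

    Step 1 is taken to be the specification limit and the last step
    the design limit; the original program may have been laid out differently.
    """
    increment = (design - spec) / (n_steps - 1)
    return [spec + i * increment for i in range(n_steps)]


schedule = {name: step_levels(spec, design) for name, (spec, design) in LIMITS.items()}

for step in range(N_STEPS):
    row = ", ".join(f"{name}={levels[step]:.1f}" for name, levels in schedule.items())
    print(f"Step {step + 1}: {row}")
```

Stepping all four stresses together along one schedule corresponds to moving along a single vector from the specification corner toward the design-limit corner, rather than exploring each stress independently.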
#### SUMMARY

Reliability enhancement testing, whether known by the RET designation or any of its other names, is growing in popularity as a method of assuring product reliability. If it is properly planned and conducted in the product design and development cycle, it reduces qualification testing and reliability demonstration to mere formalities, since the manufacturers already know that the first item produced will have its "mature reliability." As a result, the customer is no longer the inspector for the first production run items.

RET has been proven effective in removing field failures. In a recent study by the Boeing Company, 29% of all confirmed in-service field failures were related to failure modes found in RET [18].

RET has found its widest acceptance in industries which traditionally have not been at the forefront of reliability development, such as computers and industrial electronics. This is because the real test of reliability in these industries is customer satisfaction. As a result of success in these industries, some manufacturers in industries using traditional compliance-based methods have begun to use RET with great success.

The primary benefit of reliability enhancement testing is the knowledge gained by those who design and build the product. RET is a structured way to learn about the reliability of the product, and it provides tangible information which can be used to improve it. The benefits of RET and other test-based reliability methods are best stated in a quote by David Packard:

"Reliability cannot be achieved by adhering to detailed specifications. Reliability cannot be achieved by formula or analysis. Some of these may help to some extent, but there is only one road to reliability. Build it, test it, and fix the things that go wrong. Repeat the process until the desired reliability is achieved. It is a feedback process and there is no other way."
### REFERENCES

- [1] Romanchik, D., "STRIFE Helps Zytec Win Baldrige," Test and Measurement World (April, 1992). Pp. 46-52.
- [2] Hobbs, G.K., "Highly-Accelerated Life Tests -- HALT," Screening Technology Seminar Notes, available from Hobbs Engineering, Westminster, CO (1988).
- [3] Condra, L.W., "Accelerated Testing for Product Reliability Assurance," Hanse Environmental Test Report Series, no. HETR 94-001, Hanse Environmental, Inc., Hopkins, MI (May 1994).
- [4] Fornall, G.E., "Life Testing Advances Prove High Reliability C<sup>3</sup>I Systems," Signal (April, 1991). Pp. 62-69.
- [5] Shaked, M., and Singpurwalla, N.D., "Inference for Step Stress Accelerated Life Tests," *Journal of Statistical Planning and Inference*, vol. 7 (1983). Pp. 295-306.
- [6] Bora, J.S., "Step Stress Accelerated Life Testing of Diodes," *Microelectronics Reliability*, vol. 19, Pergamon Press Ltd. (1979). Pp. 279-280.
- [7] Nelson, W., "Accelerated Life Testing--Step Stress Models and Data Analyses," *IEEE Transactions on Reliability*, vol. R-29, no. 2 (June, 1980). Pp. 103-108.
- [8] Miller, R., and Nelson, W., "Optimum Simple Step Stress Plans for Accelerated Life Testing," *IEEE Transactions on Reliability*, vol. R-32, no. 1 (April, 1983). Pp. 59-65.
- [9] Nelson, W., "Graphical Analysis of Accelerated Life Test Data with a Mix of Failure Modes," *IEEE Transactions on Reliability*, vol. R-24, no. 4 (October, 1975). Pp. 230-237.
- [10] Bai, D.S., Kim, M.S., and Lee, S.H., "Optimum Simple Step Stress Accelerated Life Tests with Censoring," *IEEE Transactions on Reliability*, vol. R-38, no. 5 (December, 1989). Pp. 528-532.
- [11] Iucalano, G., and Zanini, A., "Evaluation of Failure Models Through Step Stress Tests," *IEEE Transactions on Reliability*, vol. R-35, no. 4 (October, 1986). Pp. 409-413.
- [12] Allen, R.J., and Roesch, W.J., "Stringent Lifetesting Ensures Superior GaAs IC Reliability," Solid State Technology (September, 1990). Pp. 103-108.
- [13] Deppe, R.W., and Minor, E.O., "Reliability Enhancement Testing (RET)," Proceedings of the Annual Reliability and Maintainability Symposium, IEEE (1994). Pp. 91-98.
- [14] Smithson, S.A., "Effectiveness and Economics Yardsticks for ESS Effectiveness," Proceedings of the Institute of Environmental Sciences, IES (1990). Pp. 737-742.
- [15] Seager, J.D., and Fieselman, C.D., "A Case Study on the Benefits of Combining Reliability Stress Tests," *Quality and Reliability Engineering International*, vol. 7 (1991). Pp. 181-188.
- [16] Chesney, K.E., "Step Stress Analysis of a Printer," Proceedings of the Annual Reliability and Maintainability Symposium, IEEE (1986). Pp. 26-27.
- [17] Tersteeg, D.J., "Reliability Testing and STRIFE Testing Is There a Correlation?" Proceedings of the Power Electronics Conference (1990).
- [18] Hester, K.D., "Reliability Enhancement Testing," Notes from Boeing Reliability Enhancement Testing Symposium, Renton, WA (March 16, 1994).

# ENVIRONMENTAL STRESS SCREENING

Lloyd W. Condra, Woodinville, WA 98072

#### ABSTRACT

This is the fourth in a series of reports dealing with accelerated environmental testing. It describes environmental stress screening, why and how it is conducted; how it fits into the product design and development cycle; and how it is related to other types of accelerated tests. Environmental stress screening is one of the three basic types of accelerated tests: 1) accelerated life tests, 2) reliability enhancement tests, and 3) environmental stress screening. Other reports in this series describe accelerated life tests and reliability enhancement tests; the methods and equipment used to conduct them; and methods and models used to collect data and interpret results.
#### INTRODUCTION

Environmental stress screening (ESS) is one of the three basic types of accelerated reliability testing in common use today. The other two are accelerated life testing (ALT) and reliability enhancement testing (RET)\*. ESS is one of the most widely used of all reliability tests. Its purpose is to precipitate latent defects, which are detectable only with the application of stress. The defects are those introduced into the product during manufacturing, since design-related defects should have been detected and eliminated by reliability enhancement testing during the design phase.

<sup>\*</sup> These tests are known by various other names, but these three designations are used in this report series. For more complete definitions, the reader is referred to the first report in this series, HETR-94-001.

Figure 1 illustrates the ESS concept. ESS is effective only for a product with an infant mortality region, which is indicated by a decreasing initial failure rate in Figure 1. The optimum ESS time is $t_0$, since at that point all the infant mortality defects have been screened out. If ESS ends before $t_0$, the product still contains infant mortality defects which will be found by the user of the product. If ESS ends after $t_0$, useful life is consumed without improving the failure rate. Note that the failure rate may not be zero even after $t_0$. The failures occurring after $t_0$ are not infant mortality failures, though, and they must be dealt with in other ways than ESS.

There have been many attempts to prescribe standard ESS processes [1-4], but since ESS processes are product-specific, the most effective ones are based on a knowledge of the product, its potential defects, and the stresses that cause them. An effective ESS process generates valuable data which can be used to improve the product, as well as to screen out defects.
Unfortunately, many ESS users view it only as a requirement imposed by the customer or the market, and therefore do not obtain its full benefits.

Figure 1. The probability density function and bathtub reliability curve for an electronics product. ESS is conducted to precipitate the infant mortality failures.

The compliance-based approach to ESS treats it as a cookbook process, in which the product is exposed to a standard set of stresses, at standard levels, for standard lengths of time. Little attention is given to the failure mechanisms, or to how they are distributed in time, or to using failure data to improve the product. Compliance-based ESS provides few benefits other than that of satisfying a customer-imposed requirement.

Compliance-based ESS users can incur unnecessary expense. Tables 1 and 2 show typical ESS conditions used by a manufacturer of aerospace electronics equipment. Each of these conditions was imposed by a different customer for a different product. From a physics-of-failure point of view, these conditions are practically identical, and with minor modification, they could all be conducted in a single environmental test chamber. Since the compliance-based approach does not bring this level of understanding to the process, each condition was implemented as stated, and separate test chambers were required for each one.

The *physics-of-failure* approach to ESS [5-7] is based on an understanding of the potential types of latent defects in the product, the failure mechanisms, and the stresses that cause them. The ESS conditions are set up to precipitate those defects, and the data are used to determine their causes and distributions. Failure data are communicated to the appropriate design and manufacturing personnel, and used to make changes to improve the product. If it is properly set up and operated, a physics-of-failure ESS process can be extremely cost-effective.

Table 1. Customer-imposed ESS conditions for circuit card assemblies for aerospace electronics products.

| Part no. | Cycle time, hrs. | No. of cycles | Total hours | Lower limit, °C | Upper limit, °C | Rate, °C/min. | Power | Vibration |
|------|------|------|------|------|------|------|--------|------|
| 1 | 8 | 4 | 32 | -40 | +70 | - | - | None |
| 2 | 8 | 2 | 16 | -40 | +71 | 5 | On all | None |
| 3 | 8 | 12 | 96 | -55 | +85 | | Hot | None |
| 4 | 8 | 12 | 96 | -40 | +85 | 1 | Hot | None |
| 5 | 8 | 12 | 96 | -55 | +85 | | Hot | None |
| 6 | 8 | 4 | 32 | -40 | +70 | 5 | Hot | None |
| 7 | 8 | 4 | 32 | -40 | +70 | 5 | Hot | None |
| 8 | 1 | 25 | 25 | -54 | +85 | 5 | - | None |

Table 2. Customer-imposed ESS conditions for aerospace electronic equipment.

| Part no. | Cycle time, hrs. | No. of cycles | Total hours | Lower limit, °C | Upper limit, °C | Rate, °C/min. | Power | Vibration |
|------|------|------|------|------|------|------|--------|---------|
| 1 | 8 | 2 | 16 | -40 | +71 | 5 | Hot | - |
| 2 | 5 | 10 | 50 | -54 | +55 | 5 | Cycled | Random |
| 3 | 8 | 4 | 32 | -40 | +70 | 5 | - | - |
| 4 | 4 | 12 | 48 | -40 | +70 | 5 | Cycled | Sine-1g |
| 5 | 8 | 4 | 32 | -40 | +71 | 5 | On all | - |
| 6 | 4 | 8 | 32 | -40 | +80 | Max | On all | - |

#### SETTING UP THE ESS PROCESS

ESS is product-unique, since each product has its own set of potential defects, and since the applied ESS stresses affect each product differently. Even though the ESS process must be set up separately for each product, there are many common features of both products and stresses which cause many ESS processes to be similar.

The stresses applied in ESS are those which are expected to precipitate manufacturing defects.
They are not necessarily those which the product will see in service. The two most common ESS stresses for electronic products are temperature cycling and vibration. They may be applied sequentially or simultaneously.

ESS may be conducted anywhere in the manufacturing process flow. Table 3 shows some examples of the types of stresses used for ESS at the component, subassembly, assembly and system levels for electronic equipment [8]. Table 4 shows the types of defects which may be detected by temperature cycling and vibration [8].

Table 3. ESS applied at various points in the manufacturing flow, by stage of manufacturing process.

| Component | Subassembly | Assembly | System |
|------------------------------|----------------------------|------------------------|------------------|
| 1) Temperature, power | 1) Power-off temp. cycling | 1) Power-on | Random vibration |
| 2) Temperature cycling | 2) Random vibration | 2) Random vibration | |
| 3) Temperature-humidity-bias | | Power-on temp. cycling | |

Table 4. Types of failures detectable by various environments.

| Thermal cycling | Vibration | Thermal or vibration |
|---------------------------------|-------------------------------------------|------------------------|
| Component or parameter drift | Particle contamination | Defective solder joint |
| PWB opens/shorts | Chafed or pinched wires | Loose hardware |
| Component incorrectly installed | Defective components | Defective components |
| Wrong component | Adjacent boards rubbing | Fasteners |
| Hermetic seal failure | Two components shorting | Broken component |
| Chemical contamination | Loose wire | Etching defects |
| Defective wire termination | Poorly bonded component | |
| Improper crimp | Mechanical flaw | |
| | Inadequately secured high-mass components | |

The specific levels of ESS stresses are selected to precipitate the relevant defects in a relatively short time, and yet not consume a significant portion of the life of non-defective items. For electronic equipment, the lower end of the temperature cycling range is usually in the range of -40°C to -50°C, and the upper end is in the range of +75°C to +85°C. The rate of temperature change can also be important. Figure 2, from reference [7], illustrates the effects of temperature rate of change on surface mount transistor lifting.

Selecting the vibration level can be quite challenging, especially if the defects are susceptible to a range of frequencies. In general, multi-axis, repetitive shock vibration is much more effective and efficient than single-axis vibration. It has also been shown that simultaneous temperature cycling and vibration is much more efficient than either separate or sequential application of the two stresses.

Figure 2. Effect of temperature rate of change on surface mount transistor lifting.

It is critical that electronic equipment be monitored during ESS. This is the only way to detect failures under extreme conditions.
Even more important, though, is that the stresses used in ESS can induce reversible damage not detected in tests conducted at ambient conditions. This induced damage is itself a latent defect, and the ESS process can actually cause early field failures.

# REDUCING OR ELIMINATING ESS

Since ESS is an inspection step, it does not add value to the product, and therefore should be reduced or eliminated as quickly as possible. This cannot be done without proper justification, which requires relevant data. Therefore, the ESS process must be set up so that it provides data which can be used to reduce or eliminate it. Following is an eight-step process which can be used for this purpose.

- 1. Collect failure rate data during the ESS process. Failure data must be collected, not just at the beginning and the end, but during the ESS process. It is not enough to know that failures occurred; their time of occurrence must be recorded. Data from all ESS attempts, whether or not there was a failure, must be collected and recorded.
- 2. Prepare a plot of failure rate vs. time. This is the type of plot shown in Figure 1. If the failure rate decreases with time, there is an opportunity to reduce it if proper product improvements can be made. If the curve is constant, or if it increases with time, the ESS process cannot be effective because either 1) there are no infant mortality defects, or 2) the wrong stresses, or levels thereof, are being used. If this is the case, ESS should either be modified or discontinued, and some other means of product improvement must be used.
- 3. Analyze failures and separate them according to failure mechanism. All failures must be analyzed in order to take corrective action. Many ESS operations do not include any structured method to analyze the failures and to provide the results to those who can take the proper corrective action.
- 4. Prepare plots of failure rate vs. time for each failure mechanism. After this is done, the criteria of step 2 must be applied to each individual failure mechanism. Again, only those failure mechanisms with decreasing failure rates can be attacked with ESS.
- 5. Improve the product. Without using the data generated by ESS to improve the product, including design, components, materials, and processes, there is no hope of reducing or eliminating the ESS process. If those responsible for the ESS process are not those responsible for designing and manufacturing the product, it is important that good communication take place between them.
- 6. Collect and analyze ESS data for the improved product. If the proper steps have been taken to improve the product, then the area under the infant mortality region of the failure rate vs. time curve should be smaller. This may result from either a reduced slope of the curve, or from a shorter time in which it reaches a constant failure rate.
- 7. Modify ESS conditions to reflect the new failure rates. As failure mechanisms are eliminated, the stresses that precipitate them may be eliminated. If they occur in shorter times, then the duration of the ESS process may be shortened. In some cases, additional stresses or increased levels may have to be introduced to detect failure mechanisms which were not expected. If this is the case, care must be taken to avoid introducing irrelevant failures.
- 8. Reduce or eliminate ESS as warranted. If the ESS process has been set up properly, and if the proper data are collected and used effectively, it will result in a continuously improving product. Eventually, a point will be reached where the ESS process may be reduced significantly or eliminated entirely. It may also be possible to reduce the frequency of ESS by going from a 100% screen to a sample screen.

The justification for ESS is an economic one, and the effectiveness of ESS ultimately must be evaluated economically.
This analysis is based on the cost to conduct ESS, the cost of field failures, and the frequency of occurrence of field failures [7, 9-14]. ESS costs include the cost of capital equipment, the recurring cost of conducting the process, the cost of analyzing and repairing failures, and the risk of actually introducing new failures into the product. The benefit is in the reduced costs of field failures. Unless the manufacturer has been through a process similar to that described above, ESS is usually cost-effective.

#### EFFECTIVENESS OF ENVIRONMENTAL STRESS SCREENING

The literature contains many examples of the successful use of ESS. Parker and Harrison [15] report that AT&T calls their process Environmental Stress Testing (EST) to emphasize the fact that they use the results to make product improvements. The AT&T process consists of a combination of temperature step stress and temperature cycling between -20°C and +70°C for circuit card assemblies. Figure 3 shows results of final test over a period of three years, during which the EST process was implemented. Figure 4 shows a plot of failures vs. number of cycles in the EST process. From the data in Figure 4, the investigators concluded that the optimum number of temperature cycles was 16.

In addition to the improvement in outgoing quality, the investigators tracked field failure results. They reported a five-fold improvement in product which had been exposed to environmental stress testing, compared to product which was not exposed to EST. Although some ESS practitioners believe that the process should always be conducted on 100% of the product, the authors of reference [15] successfully implemented a sample EST process.

Figure 3. Results of environmental stress testing for circuit card assemblies by AT&T.

Figure 4. Plot of failures vs. temperature cycles in environmental stress testing.

Chik and Devenyi [16] implemented a two-stage ESS process for laser diodes.
The two stages were 1) a steady state burn-in at 165°C and 10 kA/cm² for 2 hours prior to assembly, and 2) a second steady state burn-in at 70°C for 150 hours after assembly. Results showed that unscreened lasers had a median lifetime of about 600 hours, compared to about 6000 hours for screened lasers.

In another study on laser diodes, Tang et al. [17] exposed AlGaAs laser diodes to an ESS process consisting of operation under power in inert atmospheres. Their results are shown in Table 5. Again, significant improvement in operating reliability was obtained for products which had been exposed to ESS.

Table 5. Laser diode lifetime improvement with ESS.

| Sample | ESS power, mW | ESS atmosphere | ESS time, hrs. | Operating power, mW | Operating lifetime, hrs. |
|--------|---------------|----------------|----------------|---------------------|--------------------------|
| 1<br>2<br>3<br>4<br>5<br>6<br>7<br>8 | No ESS<br>No ESS<br>No ESS<br>No ESS<br>46<br>58<br>65<br>70 | —<br>—<br>—<br>He<br>He<br>N<sub>2</sub><br>He | <br><br><br>122<br>122<br>120<br>60 | 46<br>46<br>46<br>46<br>46<br>46<br>46<br>46 | 5<br>18<br>8.5<br>20<br>240<br>13,500<br>>2,880<br>120<br>3,300 |

If a product has a very low failure rate, the design and operation of the ESS process can be quite complex. McClean [18] reports the use of a technique called highly accelerated stress audit (HASA) to screen printed circuit card assemblies. The screening stresses were temperature cycling and vibration, with power being applied during the process. As the name implies, the test was applied on a sample basis.

As noted from the above examples, the development and operation of an ESS process must be highly personalized to the product being screened.
Perhaps the greatest benefit of ESS is the hands-on knowledge and experience about the product, gained by those who design and manufacture it. For this reason, it is not a good idea to assign the ESS process to a reliability department or a third party screening organization with limited ability to change the design or manufacturing processes.

# **ALTERNATIVES TO ESS**

As mentioned earlier in this report, ESS is effective only when the product has an infant mortality region. If this is not the case, other methods must be used. Some other methods which also involve the application of stresses are ongoing reliability testing (ORT), ongoing accelerated life testing, and periodic requalification.

Ongoing reliability testing involves the selection of a small sample, e.g., less than 1% of production, on a regular basis, and exposing the items to stresses at or slightly above the operating range for periods ranging from a few days to a few weeks. All failures are analyzed, and the data are used to improve the product. At the conclusion of the test, the surviving samples are shipped as regular product.

Ongoing accelerated life testing is similar to ORT, except that the stresses are somewhat higher, and the test is continued until the samples fail. Since this is a destructive test, the sample sizes may also be somewhat smaller than those of ORT, especially if the product is an expensive one.

Periodic requalification involves the repetition of the qualification procedure, or an abbreviated version thereof, on a periodic basis (usually once or twice per year). This type of test had its beginning in some of the U.S. military standards. Since periodic requalification does not involve a wide range of sample lots, and since it is expensive, it is losing popularity.

#### **SUMMARY**

The overall purpose of environmental stress screening is to assure that, once a product is qualified, there will be no uncontrolled variations in the individual items during the production phase.
The application of stresses is necessary to detect some defects which cannot be observed by functional or visual observation. ESS must be implemented in the most effective and efficient manner possible.

The only realistic way to develop and operate an effective ESS process is to use the physics-of-failure approach. This requires an understanding of the product, a knowledge of the types of defects, and of the types of stresses which precipitate them. Almost by definition, a significant amount of trial and error is associated with developing efficient ESS processes, but once the basic knowledge is gained, it can be applied to a wide range of products.

In most cases where ESS has been implemented as discussed here, it has proven itself to be quite effective in reducing overall product costs.

# REFERENCES

- [1] MIL-STD 2164 (EC), Military Standard Environmental Stress Screening Process for Electronic Equipment.
- [2] DOD-HDBK-344 (USAF), Environmental Stress Screening (ESS) of Electronic Equipment.
- [3] Environmental Stress Screening (ESS) Guide, Technical Report no. AD-A206, U.S. Army, Ft. Belvoir, VA (January, 1989).
- [4] Environmental Stress Screening Guidelines for Assemblies. Institute of Environmental Sciences (March, 1990).
- [5] Pecht, M., and Lall, P., "A Physics of Failure Approach to Burn-in," Proceedings of the ASME Winter Annual Meeting (1993).
- [6] Lambert, R.G., "Case Histories of Selection Criteria for Random Vibration Screening," *The Journal of Environmental Sciences* (January-February, 1985). Pp. 19-24.
- [7] Smithson, S.A., "Effectiveness and Economics--Yardsticks for ESS Decisions," Proceedings of the Institute for Environmental Sciences (1990).
- [8] Mandel, C.E.N., Jr., "Environmental Stress Screening," in *Electronic Materials Handbook*, vol. 1, ASM International, Materials Park, OH (1989). Pp. 875-876.
- [9] Smith, W.B., and Khory, N., "Does the Burn-in of Integrated Circuits Continue to be a Meaningful Course to Pursue?" *Proceedings of the 38th Electronic Components Conference*, IEEE (1988). Pp. 507-510.
- [10] Pantic, D., "Benefits of Integrated-Circuit Burn-in to Obtain High Reliability Parts," *IEEE Transactions on Reliability*, vol. R-35, no. 1 (1986). Pp. 3-6.
- [11] Shaw, M., "Recognizing the Optimum Burn-in Period," Quality and Reliability Engineering International, vol. 3 (1987). Pp. 259-263.
- [12] Huston, H.H., Wood, M.H., and DePalma, V.M., "Burn-in Effectiveness Theory and Measurement," Proceedings of the International Reliability Physics Symposium, IEEE (1991). Pp. 271-276.
- [13] Suydo, A., and Sy, S., "Development of a Burn-in Time Reduction Algorithm using the Principles of Acceleration Factors," *Proceedings of the International Reliability Physics Symposium*, IEEE (1991). Pp. 264-270.
- [14] Trindade, D.C., "Can Burn-in Screen Wearout Mechanisms?: Reliability Modeling of Defective Sub populations A Case Study," *Proceedings of the International Reliability Physics Symposium*, IEEE (1991). Pp. 260-263.
- [15] Parker, P.T., and Harrison, G.L., "Quality Improvement Using Environmental Stress Screening," AT&T Technical Journal (July-August, 1992). Pp. 10-23.
- [16] Chik, K.D., and Devenyi, T.F., "The Effects of Screening on the Reliability of GaAlAs/GaAs Semiconductor Lasers," *IEEE Transactions on Electron Devices*, vol. 35, no. 7 (July, 1988). Pp. 966-969.
- [17] Tang, W.C., Altendorf, E.H., Rosen, H.J., Webb, D.J., and Vettiger, P., "Lifetime Extension of Uncoated AlGaAs Single Quantum Well Lasers by High Power Burn-in in Inert Atmospheres," *Electronics Letters*, vol. 30, no. 2 (January 20, 1994). Pp. 143-145.
- [18] McClean, H., "Highly Accelerated Stressing of Products with Very Low Failure Rates," Proceedings of the Institute of Environmental Sciences (1992).

# USING DESIGN OF EXPERIMENTS IN ACCELERATED TESTING

Lloyd W. Condra, Woodinville, WA 98072

#### ABSTRACT

This is the fifth in a series of reports dealing with accelerated environmental testing.
It describes the use of Design of Experiments for efficient accelerated environmental testing. Previous reports in the series describe the three basic types of accelerated tests: 1) accelerated life tests, 2) reliability enhancement tests, and 3) environmental stress screening.

#### INTRODUCTION

Reliability evaluations, assessments, and predictions must be based on relevant and credible data. Unfortunately for most applications, reliability data collection is expensive and time-consuming. As a result, many manufacturers simply design and build their products as well as they can, and hope they will be reliable in service. In some cases, concurrent engineering methods are used to develop product designs and manufacturing processes with good quality, and sometimes reliability tests are conducted on the resulting products; but it is rare for designers to use any structured methods to design reliability into their products. Design of Experiments is such a method.

Accelerated testing is a way to shorten the time required to collect reliability data, and reliability enhancement testing (RET) is a structured method to evaluate and improve designs. Even with these and other accelerated testing techniques, many manufacturers consider reliability testing to be prohibitively expensive. In this report, we consider Design of Experiments (DoE) as a cost-effective tool for concurrent product design, process development, and reliability engineering.

Design of Experiments is not a new discipline, even though it has only recently been reintroduced to manufacturing in the western world. Sir Ronald Fisher developed the basic concepts for DoE in the United Kingdom in the 1920's [1]. American and British scientists continued to develop experimental design methods and theory over the next several decades, but their application was mostly limited to agriculture. In the decades after 1950, Japanese manufacturers experienced phenomenal success with DoE.
The individual most responsible for this success was Dr. Genichi Taguchi [2]. Today, the two main branches of Design of Experiments are classical and Taguchi methods.

Design of Experiments can take many forms and be used for many purposes, but it has two major features: 1) the simultaneous variation and evaluation of several factors, and 2) the systematic elimination of some possible factor combinations to reduce experimental time and cost. When used in combination with accelerated testing, DoE is a powerful reliability tool.

The use of DoE in reliability engineering is presented in the form of examples in this report. The use of Design of Experiments in reliability work is developed comprehensively in reference [3]. The three examples in this report are the thermostat design experiment, the ESS process development experiment, and the glass-to-metal seal analysis experiment.

# THE THERMOSTAT DESIGN EXPERIMENT

Design of Experiments has been used in reliability engineering for at least 40 years [4-9]. Perhaps the most straightforward use of DoE for reliability is to select the combinations of factors, or controllable variables, which will result in the most reliable product. An example of this is the thermostat design experiment [10]. Eleven manufacturing factors were selected for evaluation at two levels each. An experimental array of factor level combinations was constructed, and samples were made according to the conditions specified by the array. The samples were then cycled on and off in a manner similar to that experienced in service, and the cycles-to-failure were recorded for each factor level combination. Statistical analyses were conducted to select the factor level combination which produced the most reliable product.

Table 1 shows the factors and levels selected for the thermostat design experiment. Most of these factors are related to materials selection and manufacturing process parameters.

Table 1. Factors and levels for the thermostat design experiment.

| Factor | Level 1 | Level 2 |
|------------------------------------|---------------------------|-----------------------|
| A - Diaphragm plating rinse | Clean | Contaminated |
| B - Current density | 5 minutes, 60 amps | 10 minutes, 15 amps |
| C - Acid cleaning | 3 seconds | 30 seconds |
| D - Diaphragm electrocleaning | 2 minutes | 12 minutes |
| E - Be-Cu grain size | 0.008" | 0.018" |
| F - Stress orientation | Perpendicular | Parallel |
| G - Humidity | Wet | Dry |
| H - Heat treatment | 45 minutes, 600°F | 4 hours, 600°F |
| I - Brazing | No cooling water, no flux | Excess water and flux |
| J - Power element electrocleaning | Short | Long |
| K - Power element plating rinse | Clean | Contaminated |

The twelve-run experimental array is shown in columns labeled A through K of Table 2. It is called a *fractional factorial* array, since not all possible combinations of factors are represented. Because of this, there is a risk of *confounding* some of the main effects being evaluated with interactions between them. This is an example of a screening experiment, and it is also known as a Plackett-Burman twelve-run array [11], or as a Taguchi L12 array [12].

Table 2. Experimental array for the thermostat design experiment.

| Run | A | B | C | D | E | F | G | H | I | J | K | Mean | Std. dev. |
|-----|---|---|---|---|---|---|---|---|---|---|---|------|-----------|
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 4.7 | 0.999 |
| 2 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 2.5 | 0.115 |
| 3 | 1 | 1 | 2 | 2 | 2 | 1 | 1 | 1 | 2 | 2 | 2 | 2.2 | 0.225 |
| 4 | 1 | 2 | 1 | 2 | 2 | 1 | 2 | 2 | 1 | 1 | 2 | 2.4 | 0.305 |
| 5 | 1 | 2 | 2 | 1 | 2 | 2 | 1 | 2 | 1 | 2 | 1 | 2.5 | 0.259 |
| 6 | 1 | 2 | 2 | 2 | 1 | 2 | 2 | 1 | 2 | 1 | 1 | 4.1 | 0.883 |
| 7 | 2 | 1 | 2 | 2 | 1 | 1 | 2 | 2 | 1 | 2 | 1 | 2.6 | 0.143 |
| 8 | 2 | 1 | 2 | 1 | 2 | 2 | 2 | 1 | 1 | 1 | 2 | 2.1 | 0.201 |
| 9 | 2 | 1 | 1 | 2 | 2 | 2 | 1 | 2 | 2 | 1 | 1 | 2.5 | 0.230 |
| 10 | 2 | 2 | 2 | 1 | 1 | 1 | 1 | 2 | 2 | 1 | 2 | 2.6 | 0.090 |
| 11 | 2 | 2 | 1 | 2 | 1 | 2 | 1 | 1 | 1 | 2 | 2 | 5.4 | 1.864 |
| 12 | 2 | 2 | 1 | 1 | 2 | 1 | 2 | 1 | 2 | 2 | 1 | 2.2 | 0.314 |

Ten sample thermostats were constructed for each of the twelve experimental runs according to the treatment combinations specified by Tables 1 and 2, and cycled on and off until failure. The test was terminated after 7,342,000 cycles, at which point all of the samples had failed in all of the runs except nos. 1, 6, and 11. The time-to-failure distributions were found to be lognormal, and the logarithms of the means and the standard deviations are shown in the rightmost two columns of Table 2.

Using these results a response table was constructed, and it is shown in Table 3. A response table is a means of comparing the levels of the factors. For example, factor A was at level 1 in runs 1-6, and at level 2 in runs 7-12. The average of the means of runs 1-6 was calculated as 3.1, and the average of the means of runs 7-12 was calculated as 2.9. Therefore, since the longer life was obtained with factor A set at level 1, it is the preferred level for factor A. It may be noted that, while the levels of factor A were constant, the other factors varied systematically according to the fractional factorial array.
This balanced comparison allows the selection of a combination of factor levels which results in a robust product design. A similar procedure was followed in selecting the optimum levels for the other factors. For factor C, the results of runs 1, 2, 4, 9, 11, and 12 (level 1) were compared with those of runs 3, 5, 6, 7, 8, and 10 (level 2). From Table 3, it may be seen that level 1 of factor C produced a longer mean life, and it was selected as the optimum level. This procedure was repeated for factors B, and D through K, and the preferred levels are indicated by asterisks in Table 3.

It may be seen in Table 3 that the two factors with the greatest effect on reliability are factor E, beryllium copper grain size, and factor H, heat treatment. This is apparent from the fact that the differences between levels one and two were greater for these two factors than for the other factors. They were set at their preferred levels, which in this array is level one. The levels for all other factors were chosen as those which would result in the lowest product cost. In all cases, this was also level one. This meant that, for factors B, D, and F, the preferred level from the experimental results was rejected in favor of the level which produced the lower cost. This is a legitimate use of product and business knowledge to override the statistical conclusions.

Table 3. Response table for the thermostat reliability experiment. Values shown are the logarithmic means of the data. The preferred levels for each of the factors are indicated by asterisks.

| Factor | Level 1 mean | Level 2 mean |
|-----------------------------------|--------------|--------------|
| A - Diaphragm plating rinse | 3.1* | 2.9 |
| B - Current density | 2.8 | 3.2* |
| C - Acid cleaning | 3.3* | 2.7 |
| D - Diaphragm electroclean | 2.8 | 3.2* |
| E - Grain size | 3.6* | 2.3 |
| F - Stress orientation | 2.8 | 3.2* |
| G - Humidity | 3.3* | 2.6 |
| H - Heat treatment | 3.4* | 2.5 |
| I - Brazing | 3.3* | 2.7 |
| J - Power element electroclean | 3.1* | 2.9 |
| K - Power element plating rinse | 3.1* | 2.9 |

A confirmation run was conducted with 20 samples. The factors were set at their chosen levels, and the results showed that the expected lifetime of the thermostats was over 7 million cycles, compared to only 500,000 cycles before the experiment was conducted. In addition to the lower failure rate, the product design was found to be more robust in some harsh environments, which allowed the product to penetrate new markets. The experiment also resulted in better communications between the design and manufacturing organizations, and between the beryllium copper supplier and the thermostat manufacturer. Also, since nine of the 11 factors were set at their lower-cost levels, the overall cost of the product was reduced.

# USING DOE TO SET UP THE ESS PROCESS

Pachuki [13] used Taguchi arrays to set up the process for environmental stress screening (ESS) for a computer CPU card. This is a double-sided, high-density circuit card with about 350 components, most of which are surface mounted. The sample size was approximately 1800 circuit cards. The goal of the experiment was to determine whether an ESS process could be developed that was just as effective as, but more efficient than, the process then in use. The prior process consisted of 4 temperature cycles at 10°C per minute, with power applied at specified intervals. The process to be investigated included temperature cycling at 40°C per minute, in combination with vibration and power cycling. The prior process required about 3.5 hours, and the proposed process required about 40 minutes.

Table 4 shows the experimental array chosen to evaluate the new ESS process.
It is a Taguchi L<sub>8</sub> array, which is capable of evaluating three main effects (at two levels each) and all interactions among them. The three main effects were vibration, temperature cycling, and power cycling, and they occupy three columns of the array. The three two-factor interactions occupy three additional columns. Instead of the three-factor interaction, a fourth main effect was chosen for the seventh column. This factor was the type of functional monitoring used during the ESS process. Because of this, some confounding is possible, but it is considered unlikely.

Table 4. Taguchi L<sub>8</sub> experimental array for the CPU card ESS experiment.

| Run no. | Vib | TC | Vib x TC | PC | Vib x PC | TC x PC | Monitor |
|---------|--------|-------|----------|-------|----------|---------|---------|
| 1 | vib | TC | yes | PC | yes | yes | funct. |
| 2 | vib | TC | yes | no PC | no | no | self |
| 3 | vib | no TC | no | PC | yes | no | self |
| 4 | vib | no TC | no | no PC | no | yes | funct. |
| 5 | no vib | TC | no | PC | no | yes | self |
| 6 | no vib | TC | no | no PC | yes | no | funct. |
| 7 | no vib | no TC | yes | PC | no | no | funct. |
| 8 | no vib | no TC | yes | no PC | yes | yes | self |

The two levels of vibration, temperature cycling, and power cycling were of the presence/absence type. In other words, one level of the factor was the presence of the stress, and the other level was its absence. The presence level for vibration was a ten-minute random vibration (20 Hz to 2 kHz) screen at approximately 22 $G_{rms}$. The presence level for temperature cycling was eight cycles between -20°C and +80°C, with five-minute dwells at the extremes. The temperature change rate was 60°C/minute.
The presence level for power cycling was turning a DC voltage on and off continuously at five-second intervals during the positive temperature ramps and the dwells.

Two types of diagnostic methods were used to monitor the cards during the experiment. One was a PROM-based self-diagnostic test, and the other was a functional test developed by Sun.

The responses of the experiment were the failures that occurred, both during the ESS process, and during some aging tests conducted after ESS to simulate life. An effective ESS process, then, would be one which precipitates failures in ESS, and thereby eliminates them from the aging tests. The results of the experiment are shown in Table 5.

Table 5. Results of the CPU card ESS experiment.

| Run | Stress combination | Component failures, ESS | Component failures, aging | Workmanship failures, ESS | Workmanship failures, aging |
|-----|-------------------------------------------------|----|----|---|---|
| 1 | Power, temperature, vibration, functional test | 3 | 5 | 2 | 0 |
| 2 | Temperature, vibration, self test | 6 | 4 | 4 | 0 |
| 3 | Power, vibration, self test | 0 | 4 | 0 | 0 |
| 4 | Vibration, functional test | 5 | 1 | 2 | 0 |
| 5 | Power, temperature, self test | 2 | 2 | 0 | 0 |
| 6 | Temperature, functional test | 1 | 0 | 0 | 0 |
| 7 | Power, functional test | 1 | 0 | 1 | 0 |
| 8 | No stress, self test | 0 | 3 | 0 | 0 |
| | Totals | 18 | 19 | 9 | 0 |

The effects of the various types of stress are shown in the response tables, Tables 6A-G. Since all sample sizes were equal (900 samples for each level), direct comparisons may be made easily.

Table 6A. Effects of vibration in the CPU card ESS experiment.

| Stress level | Component failures, ESS | Component failures, aging | Workmanship failures, ESS | Workmanship failures, aging |
|--------------|----|----|---|---|
| Vibration | 14 | 14 | 8 | 0 |
| No vibration | 4 | 5 | 1 | 0 |

Table 6B. Effects of temperature cycling in the CPU card ESS experiment.

| Stress level | Component failures, ESS | Component failures, aging | Workmanship failures, ESS | Workmanship failures, aging |
|------------------------|----|----|---|---|
| Temperature cycling | 12 | 11 | 7 | 0 |
| No temperature cycling | 6 | 8 | 2 | 0 |

Table 6C. Effects of the vibration-temperature cycling interaction in the CPU card ESS experiment.

| Stress level | Component failures, ESS | Component failures, aging | Workmanship failures, ESS | Workmanship failures, aging |
|------------------|----|----|---|---|
| Vib x temp cycle | 10 | 12 | 7 | 0 |
| No interaction | 8 | 7 | 2 | 0 |

Table 6D. Effects of power cycling in the CPU card ESS experiment.

| Stress level | Component failures, ESS | Component failures, aging | Workmanship failures, ESS | Workmanship failures, aging |
|------------------|----|----|---|---|
| Power cycling | 6 | 11 | 3 | 0 |
| No power cycling | 12 | 8 | 6 | 0 |

Table 6E. Effects of the vibration-power cycling interaction in the CPU card ESS experiment.

| Stress level | Component failures, ESS | Component failures, aging | Workmanship failures, ESS | Workmanship failures, aging |
|-------------------|----|----|---|---|
| Vib x power cycle | 4 | 12 | 2 | 0 |
| No interaction | 14 | 7 | 7 | 0 |

Table 6F. Effects of the temperature cycling-power cycling interaction in the CPU card ESS experiment.

| Stress level | Component failures, ESS | Component failures, aging | Workmanship failures, ESS | Workmanship failures, aging |
|--------------------------|----|----|---|---|
| Temp cycle x power cycle | 10 | 11 | 4 | 0 |
| No interaction | 8 | 8 | 5 | 0 |

Table 6G. Effects of the test method in the CPU card ESS experiment.

| Test method | Component failures, ESS | Component failures, aging | Workmanship failures, ESS | Workmanship failures, aging |
|-----------------|----|----|---|---|
| Self diagnostic | 10 | 6 | 5 | 0 |
| Functional | 8 | 13 | 4 | 0 |

Several conclusions may be noted from the above results:

- The most effective single stress was vibration, which precipitated 14 component failures and 8 workmanship failures during the ESS process.
- Temperature cycling was almost as effective as vibration, with twelve component failures and seven workmanship failures during the ESS process.
- The combination of temperature cycling and vibration was also effective in precipitating failures during the ESS process.
- Neither power cycling, nor its interactions with temperature cycling or vibration, appeared to have any effect in precipitating component or workmanship failures.
- There was no significant difference between the two types of system test.
- All of the screens were effective in eliminating workmanship failures, since none were observed in the lifetime aging test conducted after ESS.
- None of the stresses were as effective in precipitating component failures as the experimenters had wished, since a total of nineteen component failures occurred in the lifetime aging test. The experimenters attributed this to the fact that the vibration step followed the temperature cycling step.

The general conclusion from this experiment was that the combination of temperature cycling and vibration was more effective than the prior ESS process. Since it required less time than the prior process, its use represents a cost saving. It was also shown (although not reported here) that the more rigorous combination of stresses was not harmful to non-defective products.
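
The structure of the L<sub>8</sub> array in Table 4, and the reason the Monitor factor can be confounded, can be illustrated with a short Python sketch. It assumes the usual two-level coding (+1 for "present", -1 for "absent") and assumes that "funct." corresponds to +1 in the seventh column, which agrees with Table 4; the variable names are illustrative only.

```python
from itertools import combinations, product

# Three base factors in run order 1..8; +1 = stress present, -1 = stress absent.
base = list(product([1, -1], repeat=3))  # each entry is (vib, tc, pc)

columns = {
    "Vib":      [v for v, t, p in base],
    "TC":       [t for v, t, p in base],
    "Vib x TC": [v * t for v, t, p in base],
    "PC":       [p for v, t, p in base],
    "Vib x PC": [v * p for v, t, p in base],
    "TC x PC":  [t * p for v, t, p in base],
    # The seventh column is the product of all three base columns. Assigning
    # the Monitor factor to it aliases Monitor with the Vib x TC x PC interaction.
    "Monitor":  [v * t * p for v, t, p in base],
}


def is_orthogonal(a, b):
    """Two +/-1 columns are orthogonal when their element-wise products sum to zero."""
    return sum(x * y for x, y in zip(a, b)) == 0


all_orthogonal = all(
    is_orthogonal(columns[a], columns[b]) for a, b in combinations(columns, 2)
)

# Assumed coding: +1 = functional test, -1 = self-diagnostic test.
monitor = ["funct." if s == 1 else "self" for s in columns["Monitor"]]

print("All seven columns mutually orthogonal:", all_orthogonal)
print("Monitor column:", monitor)
```

Because every pair of columns is orthogonal, each effect in a response table is averaged evenly over the levels of the others. The Monitor effect, however, cannot be separated from a three-factor interaction, should one exist.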
# THE GLASS-TO-METAL SEAL STRESS ANALYSIS EXPERIMENT

In a somewhat more theoretical study, Mathieu and Dasgupta used DoE to design a product so that it would be robust with respect to the stresses it would experience in service. An example of this approach is the stress analysis of glass-to-metal lead seals for hermetic packages for microcircuits [14]. The factors, or independent variables, in this investigation were the lead design features. The five such factors selected for analysis were:

- S lead spacing
- W lead width
- H lead height
- R radius of the porthole in the package, and
- T wall thickness of the package.

Two different dimensions, or levels, were selected for each of these factors, and combinations of them were evaluated in a sixteen-run experimental array. The two levels of each factor are called the high (H), and low (L) levels. The values of the high and low levels for each factor are shown in Table 7.

Table 7. Factors and levels for the glass-to-metal seal experiment.

| Level | S | W | H | R | T |
|-------|-----------|-----------|-----------|-----------|-----------|
| H | 0.035 in. | 0.020 in. | 0.015 in. | 0.022 in. | 0.050 in. |
| L | 0.025 in. | 0.010 in. | 0.005 in. | 0.018 in. | 0.030 in. |

The experimental array is shown in Table 8, which shows only the five main effects columns. The complete array contains fifteen columns, which are used to evaluate both main effects and interactions.

Finite element analysis was used to calculate the stresses imposed on the seal when three types of forces were applied to the leads: tension, bending, and torsion. For each of the sixteen runs of the experiment, the maximum stress conditions were calculated for each of the three types of loading. Using this approach, it was possible to compare the maximum stresses for each level of the five design factors.
The experimenters combined this approach with a statistical technique called *residual analysis*, and were thus able to select the design factors which were most important in producing a robust design.

Table 8. Sixteen-run experimental array for the glass-to-metal lead seal experiment.

| Run no. | S | W | H | R | T |
|---------|---|---|---|---|---|
| 1 | L | L | L | L | L |
| 2 | L | L | L | H | H |
| 3 | L | L | H | L | H |
| 4 | L | L | H | H | L |
| 5 | L | H | L | L | H |
| 6 | L | H | L | H | L |
| 7 | L | H | H | L | L |
| 8 | L | H | H | H | H |
| 9 | H | L | L | L | H |
| 10 | H | L | L | H | L |
| 11 | H | L | H | L | L |
| 12 | H | L | H | H | H |
| 13 | H | H | L | L | L |
| 14 | H | H | L | H | H |
| 15 | H | H | H | L | H |
| 16 | H | H | H | H | L |

The significant factors depended on the type of applied loading. For tension, the significant factors were found to be lead height (H), lead width (W), wall thickness (T), porthole radius (R), and the H x W interaction. For bending, the significant factors are H, $H^2$, W, H x W, $H^2$ x W, $W^2$, T, and H x T. For torsion, the significant factors are H, $H^2$, W, H x W, $H^2$ x W, $W^2$, H x $W^2$, T, T x H, and $H^2$ x T. (The squared terms represent second order effects.)
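
The sixteen-run array in Table 8 has the pattern of a half fraction of the full 2<sup>5</sup> factorial. The Python sketch below reproduces it under an assumption inferred from the table itself, not stated in the report: with L coded as -1 and H as +1, the T column equals the negative product of the S, W, H, and R columns. It also builds the fifteen main-effect and two-factor-interaction columns mentioned in the text. All names are illustrative.

```python
from itertools import combinations, product

FACTORS = "SWHRT"

# Rows of Table 8, written as the levels of S, W, H, R, T for runs 1..16.
TABLE_8 = [
    "LLLLL", "LLLHH", "LLHLH", "LLHHL",
    "LHLLH", "LHLHL", "LHHLL", "LHHHH",
    "HLLLH", "HLLHL", "HLHLL", "HLHHH",
    "HHLLL", "HHLHH", "HHHLH", "HHHHL",
]


def half_fraction():
    """Full factorial in S, W, H, R; T generated as -S*W*H*R (assumed generator)."""
    runs = []
    for s, w, h, r in product([-1, 1], repeat=4):
        runs.append((s, w, h, r, -s * w * h * r))
    return runs


runs = half_fraction()
as_letters = ["".join("H" if x == 1 else "L" for x in run) for run in runs]
matches_table_8 = as_letters == TABLE_8

# Five main-effect columns plus ten two-factor interaction columns.
effects = {f: [run[i] for run in runs] for i, f in enumerate(FACTORS)}
for (i, a), (j, b) in combinations(enumerate(FACTORS), 2):
    effects[f"{a} x {b}"] = [run[i] * run[j] for run in runs]

mutually_orthogonal = all(
    sum(x * y for x, y in zip(effects[p], effects[q])) == 0
    for p, q in combinations(effects, 2)
)

print("Reproduces Table 8:", matches_table_8)
print("Number of effect columns:", len(effects))
print("All effect columns orthogonal:", mutually_orthogonal)
```

With this generator, no main effect or two-factor interaction shares a column with another main effect or two-factor interaction, which is why sixteen runs are enough to estimate all fifteen of them.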
The results of this designed experiment allowed the investigators to express the effects of the significant design factors in closed-form relationships of the generic form:

$$\frac{\sigma_1}{\bar{\sigma}_1} = A_0 + \sum A_i x_i + \sum A_{ij} x_i x_j + \sum A_{ijk} x_i x_j x_k$$

where the left-hand side of the equation is the maximum principal stress for a particular type of loading, normalized by the average value for that loading. The $x_i$ terms represent values for the significant geometric variables, and $A_0$, $A_i$, $A_{ij}$, and $A_{ijk}$ are constants obtained from regression analysis.

The use of DoE in the manner described in the glass-to-metal seal stress analysis is mathematically quite complicated. This type of usage is not common in most DoE or reliability work, but it does illustrate the range of application of DoE.

#### SUMMARY

Reliability data collection is expensive and time-consuming, but it is necessary in order to obtain useful information about the reliability of a product prior to use, especially if the product is different from previous designs. Design of Experiments has been shown to be effective in reducing both the cost and time required for reliability data collection. DoE is the only way to evaluate the effects of combined environments in reliability testing. Three examples, representing quite different applications of DoE to reliability work, have been presented here.

# REFERENCES

- [1] Fisher, R.A., *The Design of Experiments*, Oliver and Boyd, Edinburgh (1935; 7th edition, 1960).
- [2] Taguchi, G., *Introduction to Quality Engineering*, American Supplier Institute, Dearborn, MI (1986).
- [3] Condra, L.W., *Reliability Improvement with Design of Experiments*, Marcel Dekker, New York (1993).
- [4] Weibull, W., "Statistical Design of Fatigue Experiments," *Journal of Applied Mechanics*, vol. 19, no. 1 (March, 1952). Pp. 109-113.
- [5] Zelen, M., "Factorial Experiments in Life Testing," *Technometrics*, vol. 1, no. 3 (August, 1959). Pp. 269-289.
- [6] Derringer, G.C., "Considerations in Single and Multiple Stress Accelerated Life Testing," *Journal of Quality Technology*, vol. 14, no. 3 (July, 1982). Pp. 130-134.
- [7] Hannaman, D.J., Zamani, N., Dhiman, J., and Buehler, M.G., "Error Analysis for Optimal Design of Accelerated Tests," *Proceedings of the International Reliability Physics Symposium*, IEEE (1990). Pp. 55-60.
- [8] Luvalle, M.J., "A Note on Experiment Design for Accelerated Life Tests," *Microelectronics and Reliability*, vol. 30, no. 3 (1990). Pp. 591-603.
- [9] Specht, N., "Heat Exchanger Product Design via Taguchi Methods," *Proceedings of the Third Symposium on Taguchi Methods*, American Supplier Institute, Dearborn, MI (1985). Pp. 302-318.
- [10] Bullington, R.G., Lovin, S., Miller, D.M., and Woodall, W., "Improvement of an Industrial Thermostat Using Designed Experiments," *Journal of Quality Technology*, vol. 25, no. 4 (October, 1993). Pp. 262-269.
- [11] Plackett, R.L., and Burman, J.P., "The Design of Optimum Multifactorial Experiments," *Biometrika*, vol. 33 (1946). Pp. 305-325.
- [12] Taguchi, G., and Konishi, S., *Orthogonal Arrays and Linear Graphs*, American Supplier Institute, Dearborn, MI (1987).
- [13] Pachuki, D.E., "Environmental Stress Screening Experiment Using the Taguchi Method," *Proceedings of the Institute for Environmental Sciences* (1993). Pp. 211-219.
- [14] Mathieu, B., and Dasgupta, A., "Stress Analysis of Glass-to-Metal Seals," *Proceedings of the ASME Winter Annual Meeting*, paper no. 93-WA/EEP-20 (1993).