2012 Annual RELIABILITY <strong>and</strong> MAINTAINABILITY Symposium<br />

<strong>Electronic</strong> <strong>Part</strong> <strong>Failure</strong> <strong>Analysis</strong> <strong>Tools</strong> <strong>and</strong> <strong>Techniques</strong><br />

<strong>Walter</strong> <strong>Willing</strong>, Jonathan Fleisher & Michael Cascio<br />

Northrop Grumman Corporation<br />

7323 Aviation Blvd,<br />

Baltimore, MD, 21090, USA<br />

e-mail: walter.willing@ngc.com, jonathan.fleisher@ngc.com & michael.cascio@ngc.com<br />

Tutorial Notes © 2012 AR&MS


SUMMARY & PURPOSE<br />

The current emphasis on Physics of <strong>Failure</strong> (PoF) <strong>and</strong> accurate Root Cause <strong>Analysis</strong> (RCA) highlights the need for<br />

effective electronic part failure analysis processes <strong>and</strong> capabilities. <strong>Failure</strong> analysis can be as simple as visually inspecting a<br />

part or as extensive as performing sub-micron level cross-sectioning of silicon die using Focused Ion Beam (FIB) technology.<br />

This tutorial presents a “Process” as well as the tools <strong>and</strong> techniques required to perform effective failure analyses on electronic<br />

components. In addition, the common failure mechanisms found in electronic hardware are explained <strong>and</strong> emphasized with a<br />

case study.<br />

<strong>Walter</strong> <strong>Willing</strong><br />

Mr. <strong>Willing</strong> is a Senior Advisory Reliability Engineer within the Northrop Grumman Corporation <strong>Electronic</strong> Systems<br />

Sector, System Supportability Engineering Department. Mr. Willing has over 30 years of experience in space systems reliability.<br />

He received a BSEE from the University of Delaware and an MSEE from the Loyola College of Maryland. He is active in the<br />

IEEE (Sr. Member, Vice Chairman of the Baltimore Section) and IEST, and serves on the RAMS Management Committee. He has<br />

authored five peer-reviewed technical papers and one RADC publication.<br />

Jonathan Fleisher<br />

Mr. Fleisher is a Principal Reliability Engineer within the Northrop Grumman Corporation <strong>Electronic</strong> Systems Sector,<br />

System Supportability Engineering Department. Mr. Fleisher received a BSME <strong>and</strong> an MSIE from New Mexico State<br />

University. He has 16 years of engineering experience on a variety of defense related programs, with multiple Systems<br />

Engineering responsibilities, including Environmental Qualification Lead on several radar programs. During the last several<br />

years, he has focused on reliability engineering for NGC Space Programs.<br />

Michael Cascio<br />

Mr. Cascio is a <strong>Failure</strong> <strong>Analysis</strong> <strong>and</strong> Reliability Engineer within the Product Integrity Department of the Northrop<br />

Grumman Electronic Systems Sector in Baltimore, Maryland. Mr. Cascio received a BSEE from The Pennsylvania State<br />

University. He has over 20 years of electronics experience in Radar, Reliability and Failure Analysis. He spent eleven years in<br />

the United States Air Force where he managed operations, maintenance and support equipment for 20 two-million-dollar radars.<br />

He also directed the research <strong>and</strong> development upgrades on the enhancement of radar systems. At Northrop Grumman he has<br />

10 years of engineering experience in <strong>Failure</strong> <strong>Analysis</strong> <strong>and</strong> Reliability.<br />

Table of Contents<br />

1. Introduction<br />

2. Importance of Effective Failure Analysis<br />

3. Basic Failure Analysis Techniques<br />

4. Suggestions for Your Own Failure Analysis Capabilities<br />

5. Understanding Electronic Part Failure Mechanisms<br />

6. Failure Analysis Case Study<br />

7. Conclusions<br />

8. References<br />

9. Tutorial Visuals<br />

ii – <strong>Willing</strong>, Fleisher & Cascio 2012 AR&MS Tutorial Notes


1. INTRODUCTION<br />

Organizations that produce electronic hardware should<br />

have some level of electronic part failure analysis capability<br />

<strong>and</strong> knowledge of where to go for extended failure analysis.<br />

The failure analysis process is also important. First, it is<br />

important to verify <strong>and</strong> characterize the failure via electrical<br />

test. Subsequent steps should involve non-invasive<br />

examinations such as microscopic visual inspection, X-ray<br />

<strong>and</strong> hermetic seal tests. Finally, after all non-invasive tests are<br />

completed, devices can be de-lidded (or de-capsulated) <strong>and</strong><br />

silicon die inspections <strong>and</strong> evaluations can be performed.<br />

This tutorial discusses the fundamental electronic part<br />

failure analysis processes, methods, tools <strong>and</strong> techniques that<br />

can be utilized to accurately determine why devices fail. This<br />

tutorial is an expansion of the 1997 O.A. Plait award winning<br />

tutorial “Underst<strong>and</strong>ing <strong>Electronic</strong> <strong>Part</strong> <strong>Failure</strong> Mechanisms”,<br />

sections of which are repeated in this tutorial (refer to Section<br />

5). It is important to know what the common part failure<br />

modes are as well as the failure analysis techniques used to<br />

find them.<br />

Underst<strong>and</strong>ing the cause of the part failure allows for<br />

effective corrective action <strong>and</strong> the prevention of future<br />

occurrences. Suggestions for several levels of failure analyses<br />

capabilities will be presented (Basic, Moderate, Advanced) as<br />

well as some examples of actual failure analyses to illustrate<br />

what actually occurs in failed hardware.<br />

2. IMPORTANCE OF EFFECTIVE FAILURE ANALYSIS<br />

When electronic parts fail, it’s important to underst<strong>and</strong><br />

why they failed. Effective root cause analysis of part failures<br />

is required to assure proper corrective action can be<br />

implemented to prevent reoccurrence. Determination of root<br />

cause is also important for High Reliability systems such as<br />

implantable medical devices, space satellite systems, deep<br />

well drilling systems, etc., where failures are critical, as well as<br />

consumer products where the cost of a single failure mode can<br />

be replicated multiple times.<br />

The process of root cause<br />

determination and applying corrective action is commonly called<br />

FRACAS (Failure Reporting, Analysis and Corrective Action<br />

System). Failure Analysis is the crucial part of the FRACAS<br />

process.<br />

<strong>Failure</strong> <strong>Analysis</strong> must be performed correctly to assure<br />

the failure mechanism is preserved, not “Lost” due to<br />

carelessness, bypassing critical measurements or performing<br />

destructive analyses in an incorrect sequence. For example,<br />

once wirebonds are removed, the part may not be able to be<br />

electrically tested. Furthermore, parts removed for failure<br />

analysis may “Re-Test OK” (RTOK) because the wrong<br />

part was removed, because testing did not properly<br />

capture the part’s failure mode (such as a subtle parameter<br />

shift), or because a particular failure sensitivity (such as gain vs. temperature)<br />

exists.<br />

Since it is important to preserve <strong>and</strong> characterize the<br />

failure mode to the greatest extent possible, this tutorial<br />

presents a suggested failure analysis flow, starting with full<br />

part failure characterization, followed by non-invasive <strong>and</strong><br />

finally invasive failure analysis techniques.<br />

The following sections herein address basic failure<br />

analysis techniques. Additional information on failure<br />

analysis methods can be found in Mil-Std-883 <strong>and</strong> Mil-Std-<br />

1580. While these specifications define test <strong>and</strong> evaluation<br />

methods, the “requirements” <strong>and</strong> methods within these<br />

st<strong>and</strong>ards provide a good baseline for evaluating failed parts.<br />

For example, when evaluating the wirebonds on a failed part,<br />

the pull test limits in Mil-Std-883 (Method 2011) can provide<br />

insight as to whether the failed part has good wirebonds. The<br />

internal visual inspection criteria of Mil-Std-883 (Methods<br />

2010 <strong>and</strong> 2017) help determine whether any anomalies are<br />

actually defects or allowed process variations.<br />

For further investigation into advanced failure analysis<br />

techniques <strong>and</strong> component failure modes, the reader is<br />

encouraged to become familiar with the International<br />

Reliability Physics Symposium (IRPS) as well as other<br />

venues.<br />

The following are some top causes for component failures<br />

experienced on various types of electronic equipment:<br />

1) Electrical Overstress: During board level testing, it’s quite<br />

common to experience electrical overstress due to<br />

transients related to test setups. All power inputs to<br />

electronic assemblies should be properly controlled to<br />

protect against fault conditions and unintended transients.<br />

Inadvertent connections or rapid switching to full<br />

amplitude voltage levels can lead to inrush or high<br />

transient conditions that can damage components.<br />

Human body electrostatic discharge (ESD) overstress<br />

is also a well-known <strong>and</strong> documented mechanism that<br />

damages components. ESD sensitive integrated circuits<br />

(IC) are the most commonly affected. ICs rated below<br />

250V for ESD are easily damaged by human h<strong>and</strong>ling<br />

without adequate ESD controls.<br />

2) Contamination: One of the more common causes of latent<br />

failure is contamination. Contamination<br />

ultimately leads to failures stemming from corrosion or<br />

degradation related to active elements such as<br />

semiconductors. Contamination can also rapidly destroy<br />

wire bond interconnects <strong>and</strong> metallization. Sources of<br />

contamination can typically be traced to either human by-products<br />

(spittle) or chemicals used in the assembly<br />

process.<br />

3) Solder joint failure: Solder joint workmanship is the most<br />

common issue related to initial assembly or board<br />

fabrication. It is also commonly responsible for latent<br />

failures due to joint fatigue driven by thermal cycling.<br />

Non-compliant or leadless ceramic components larger than<br />

0.25 inch are the parts most susceptible to<br />

solder joint wear out failures. Examples of solder joint<br />

failures are shown in Figure 1.<br />

4) Cracked Ceramic Packages: Ceramics are used for the<br />

majority of high reliability military <strong>and</strong> space<br />

applications. However, the packages are very brittle <strong>and</strong><br />

susceptible to cracking due to stress risers from either<br />

surface anomalies or general mounting. Root cause for<br />

these issues can typically be traced to either design<br />

implementation or process control.<br />

5) Timing Issues: Inadequate timing margins are sometimes<br />

misdiagnosed as intermittent component behavior.<br />

Thorough timing analysis should be part of any design,<br />

particularly when asynchronous signals are present.<br />

Figure 1. Defective Solder Joints<br />

6) Power Sequencing Issues: Many of the IC technologies are<br />

susceptible to damage if bias voltages are not properly<br />

applied prior to control or data input voltages.<br />

7) Design Implementation: Often component failures are<br />

related to poor design implementation rather than r<strong>and</strong>om<br />

defects in the components themselves. Examples include<br />

inadequate derating (voltage, power, <strong>and</strong> thermal),<br />

floating CMOS inputs, improper reset sequencing, or<br />

applying low bias voltages. The most common of these is<br />

mismanaging component thermal conditions and<br />

operating parts outside their rated power dissipation<br />

limits.<br />

3. BASIC FAILURE ANALYSIS TECHNIQUES<br />

The basic flow for effective part failure analysis starts<br />

before the component is removed from the board. Upon<br />

completion of the board troubleshooting <strong>and</strong> fault isolation<br />

process, the cognizant <strong>Failure</strong> <strong>Analysis</strong> engineer should<br />

review the troubleshooting results while the part is still on the<br />

board, witnessing any in-situ part measurements (for later<br />

verification in the FA lab) <strong>and</strong> noting any anomalies that exist<br />

on the board which may potentially have contributed to the<br />

part failure. Prior to removing a part from the board, it is also<br />

recommended to photograph the part as installed for future<br />

reference. Photos should be taken from various angles to<br />

capture the details of the installation, such as the solder<br />

attachment. In addition, contacting the vendor before<br />

removing high value parts is advised. Reviewing the failure<br />

data with the vendor can often identify external interfaces as<br />

the culprit rather than the suspected part. As some devices can<br />

cost many thous<strong>and</strong>s of dollars to replace, it is highly<br />

recommended that all resources available be used prior to<br />

replacing them.<br />

The <strong>Failure</strong> Analyst should also be consulted on the safest<br />

means for removing the part to preserve it to the greatest<br />

extent possible. Once the part is removed for failure analysis,<br />

three (3) basic processes should be followed:<br />

• Electrical Testing <strong>and</strong> part characterization<br />

• Non-Invasive tests<br />

• Invasive tests<br />

This general failure analysis process is illustrated in Table 1.<br />

Additional details pertaining to these tests <strong>and</strong> methods are<br />

discussed in this section.<br />

Table 1. General Failure Analysis Process<br />

Electrical Testing / Characterization<br />

• Test / Characterize over temperature<br />

• Curve Tracer I-V check of Inputs<br />

Non-Invasive Tests<br />

• External Microscopic Exam / Photo<br />

• Fine & Gross Leak<br />

• Vacuum Bake (Non-Hermetic Parts)<br />

• X-ray<br />

• PIND<br />

• XRF<br />

• SAM / C-SAM<br />

Invasive Tests<br />

• Lid Removal / Decapsulate<br />

• Die Examination<br />

• Die Probing<br />

• IR Microscopic Exam<br />

• Liquid Crystal<br />

• Cross-Sectioning<br />

• SEM<br />

• EDS/EDX<br />

• FIB<br />

• Auger<br />

• SIMS<br />

• FTIR<br />

• TEM/STEM<br />
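
The ordering in Table 1 (electrical characterization, then non-invasive tests, then invasive tests) can be checked mechanically before a failure analysis plan is executed. The following is a minimal sketch, not part of the original tutorial: the step names are taken from Table 1, while the `Phase` enum, the `STEP_PHASE` mapping and the `out_of_order_steps` helper are hypothetical names introduced only for illustration.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Phases of the general failure analysis flow, in required order."""
    ELECTRICAL = 1
    NON_INVASIVE = 2
    INVASIVE = 3


# Subset of Table 1 steps mapped to their phase (illustrative, not exhaustive).
STEP_PHASE = {
    "Test / Characterize over temperature": Phase.ELECTRICAL,
    "Curve Tracer I-V check of Inputs": Phase.ELECTRICAL,
    "External Microscopic Exam / Photo": Phase.NON_INVASIVE,
    "Fine & Gross Leak": Phase.NON_INVASIVE,
    "X-ray": Phase.NON_INVASIVE,
    "PIND": Phase.NON_INVASIVE,
    "Lid Removal / Decapsulate": Phase.INVASIVE,
    "Die Examination": Phase.INVASIVE,
    "Cross-Sectioning": Phase.INVASIVE,
}


def out_of_order_steps(plan):
    """Return the steps scheduled after a later-phase step has already run.

    A non-empty result means evidence could be destroyed before an
    earlier-phase (less destructive) test has been performed.
    """
    violations = []
    highest_seen = Phase.ELECTRICAL
    for step in plan:
        phase = STEP_PHASE[step]
        if phase < highest_seen:
            violations.append(step)
        else:
            highest_seen = phase
    return violations
```

For example, a plan that de-lids the part before taking an X-ray would return `["X-ray"]`, flagging that the non-invasive test was scheduled too late.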

3.1 Electrical <strong>Part</strong> Testing <strong>and</strong> Characterization<br />

Electrical part testing <strong>and</strong> characterization is important, as<br />

it is necessary to confirm the part has indeed failed (if not, the<br />

fault may still exist at the board level) <strong>and</strong> to determine if<br />

there are any temperature, voltage or clock speed sensitivities<br />

associated with the part’s performance. All parts should be<br />

fully electrically tested at ambient, cold <strong>and</strong> hot temperatures<br />

to determine if the failure is sensitive to temperature. Another<br />

step in part characterization is to perform a curve tracer<br />

current vs. voltage (IV) characterization of each input signal<br />

(typically to ground) to determine if any input overstress has<br />

occurred. The IV characteristics of the failed part can be<br />

compared to a known good part with any deviations noted <strong>and</strong><br />

recorded for later die examination.<br />

Electrical Testing / Characterization Outline:<br />

• Test / Characterize over temperature, voltage and clock speed<br />

• I/O curve tracer assessments: compare to known good devices<br />
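
The comparison of a failed part's I-V characteristic against a known good part can be sketched in a few lines. This is only an illustration under stated assumptions: both curves are assumed to have been sampled at the same voltage points, and the function name `iv_deviations` and the tolerance values `rel_tol` and `abs_tol` are hypothetical choices, not limits taken from the tutorial or from any standard.

```python
def iv_deviations(voltages, failed_current, good_current,
                  rel_tol=0.2, abs_tol=1e-9):
    """Return (voltage, failed_i, good_i) points where the failed part
    deviates from the known good reference.

    A point is flagged when the current difference exceeds
    abs_tol + rel_tol * |good_i|. Both tolerances are illustrative.
    """
    if not (len(voltages) == len(failed_current) == len(good_current)):
        raise ValueError("all three sequences must have the same length")

    flagged = []
    for v, i_failed, i_good in zip(voltages, failed_current, good_current):
        allowed = abs_tol + rel_tol * abs(i_good)
        if abs(i_failed - i_good) > allowed:
            flagged.append((v, i_failed, i_good))
    return flagged
```

Any flagged points can be recorded alongside the pin name so that the corresponding input structure can be located during the later die examination.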

3.2 Non-Invasive Examinations<br />

Once the failed components have been fully characterized<br />

via electrical testing, non-invasive examinations can be<br />

performed. It is important to perform all necessary non-invasive<br />

tests and examinations first, so as to not destroy any<br />

“evidence” until a good set of non-invasive characteristics<br />

has been defined for the failed part.<br />

3.2.1 External Microscopic exam / Photo<br />

Using a stereo microscope, a thorough external visual<br />

examination of the suspect part should be performed early in<br />

the failure analysis process. Typical inspection scopes range<br />

from 10X to 30X magnification, which is usually sufficient to<br />

identify such items as external contamination <strong>and</strong>/or solder<br />

balls (possibly shorting out pins on the device), damaged leads<br />

or package seals, gross cracks in the package, etc.<br />

Magnification levels up to 100X can be employed to further<br />

examine any anomalies identified. The following conditions<br />

should be specifically looked for:<br />

• Contamination<br />

• Mechanical damage<br />

• Thermal or electrical damage<br />

• Seal integrity<br />

• Lead integrity<br />

Photographs should be taken to document the condition of the<br />

part <strong>and</strong> to record any anomalies.<br />

3.2.2 Fine & Gross seal tests for hermetic devices<br />

Hermeticity testing (refer to Mil-Std-883 Method 1014)<br />

should be performed on hermetic parts to confirm that no leaks exist that<br />

could have allowed moisture to enter the package. Any<br />

internal moisture might result in possible corrosion or provide<br />

a conductive path on the semiconductor die surface, thereby<br />

causing a failure. A fine leak test often involves placing the<br />

part in a pressurized helium (He) chamber in an attempt to force<br />

He into the device cavity through any leak sites, then moving<br />

the part to a Helium detection chamber to see if any He leaks<br />

out. Gross leak testing involves placing the part in a heated<br />

fluorocarbon bath <strong>and</strong> literally “Looking for Bubbles”. The<br />

heated bath causes the atmosphere within the package to<br />

exp<strong>and</strong>, forcing it through any large leak sites. It is important<br />

to perform both Fine Leak <strong>and</strong> Gross leak testing, as a Gross<br />

Leak site may be large enough to allow a full venting of the<br />

pressurized He, subsequently resulting in a false pass for the<br />

fine leak test. It is also important that the failed part be clean<br />

of any external epoxy or contamination that could absorb the<br />

He <strong>and</strong> provide a false positive reading. Newer optical leak<br />

test equipment using laser imaging of package lid deflection to<br />

confirm hermeticity is also available.<br />

3.2.3 Vacuum Baking<br />

If a non-hermetic part or cable is suspected to have a<br />

moisture related issue, a vacuum bake can be performed to<br />

drive out any residual moisture. If the problem disappears<br />

after the vacuum bake process, humidity could have been the<br />

cause. The authors were recently involved with a case where<br />

trapped moisture affected the performance of an RF cable.<br />

3.2.4 X-ray (Film, Real-time, 3D)<br />

Radiography (refer to Mil-Std-883 Method 2012), often<br />

referred to as X-ray, is a very powerful tool for non-invasive<br />

failure analysis as X-ray can detect actual or potential defects<br />

within enclosed packages. There are multiple types of X-ray<br />

equipment available, from the basic film X-ray systems to<br />

real-time <strong>and</strong> 3-D X-ray systems. While film X-rays can be<br />

useful, the modern real-time X-ray provides a more extensive<br />

capability. Basic X-rays allow internal part examination<br />

looking for:<br />

• Internal particles<br />

• Internal wire bond dress (i.e., confirming that the wire bonds are not touching<br />

each other or package lids)<br />

• Die attach quality (voiding, die attach perimeter)<br />

• Solder joint quality for connectors<br />

• Insufficient or excessive solder<br />

• Substrate or printed wiring board trace integrity<br />

• Obvious voids in the lid seal<br />

• Foreign metallic particles within the package<br />

• Internal part orientation, etc.<br />

The resolution of a basic film X-ray is typically to a 1 mil<br />

particle size, or bond wires to 1 mil diameter. The principal<br />

limitation of film X-ray is that it only allows one exposure<br />

level at a time. Not all characteristics can be observed at a<br />

single exposure level. Conversely, real-time X-ray typically<br />

has a resolution range from 1 µm to 0.4 µm and allows for a<br />

continuous adjustment of exposure levels and conditions, as<br />

well as real-time part rotation to obtain the most revealing X-ray<br />

view. Special digital filtering and image processing can<br />

also be used to detect possible delineations in the image not<br />

otherwise observable on the image screen.<br />

3.2.5 Particle Impact Noise Detection (PIND) Test<br />

Cavity device failures can be caused by internal<br />

conductive particles shorting adjacent conductors. While X-ray<br />

techniques can be used to detect internal particles, another<br />

method is Particle Impact Noise Detection (PIND); refer to<br />

Mil-Std-883 Method 2020. PIND testing can be subjective<br />

<strong>and</strong> may not be easily performed on complex hybrids.<br />

However, it can provide evidence of internal particles. A<br />

common technique employed is to perform X-ray <strong>and</strong> PIND<br />

together; first an X-ray is taken, then the part is PIND tested,<br />

<strong>and</strong> then a second X-ray is taken. This allows one to identify<br />

particles that are free-floating within the package.<br />

Section 5.5.3 discusses loose particles detected during<br />

PIND test.<br />
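
The X-ray, PIND, X-ray sequence described above amounts to comparing particle positions before and after the part has been vibrated. A minimal sketch of that comparison follows. It assumes the analyst has already labeled each particle in both images and measured its position in micrometres; the function name `moved_particles` and the `min_shift_um` threshold are hypothetical and would need to be set from the actual image resolution.

```python
import math


def moved_particles(before, after, min_shift_um=50.0):
    """Return labels of particles whose position changed between two X-rays.

    before, after: dicts mapping a particle label to its (x_um, y_um)
    position in the first and second X-ray image.
    min_shift_um: smallest displacement treated as real movement
    (illustrative value, intended to sit above measurement error).
    """
    moved = []
    for label, (x0, y0) in before.items():
        if label not in after:
            continue  # particle not matched in the second image
        x1, y1 = after[label]
        if math.hypot(x1 - x0, y1 - y0) > min_shift_um:
            moved.append(label)
    return sorted(moved)
```

Particles returned by this check are candidates for free-floating debris, while particles that stay in place are more likely attached features or artifacts of the package.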

3.2.6 X-ray Fluorescence (XRF)<br />

X-ray Fluorescence (XRF) is a non-destructive technique<br />

used to determine the elemental composition of solid <strong>and</strong><br />

liquid samples. The X-rays excite atoms in the sample,<br />

causing them to emit X-rays with energies characteristic of<br />

each element present. The XRF equipment measures the<br />

energy <strong>and</strong> intensity of these X-rays <strong>and</strong> is capable of<br />

detecting elements from Al to U in the periodic table. XRF<br />

can determine concentrations ranging from parts per million to<br />

100% at depths as great as 10 µm. Using reference standards,<br />

XRF can accurately quantify the elemental composition of the<br />

samples. XRF is commonly used to examine platings for pure<br />

tin content, as well as for cadmium <strong>and</strong> zinc [1].<br />

3.2.7 Acoustic tests (SAM / C-SAM)<br />

Acoustic testing is a popular test method to look for voids<br />

<strong>and</strong> delaminations or cracks in Plastic Encapsulated<br />

Microcircuits (PEMs) and ceramic capacitors. Acoustic tests<br />

rely on acoustic energy transfer through the part. If there is a<br />

void, the acoustic energy is blocked and the void can be detected.<br />

The acoustic tests can also be tuned to attempt to determine<br />

the depth of any void. Acoustic tests involve either reflected<br />

acoustic energy or energy transmitted through the part. Since<br />

the energy transmission medium is typically deionized water,<br />

parts to be examined must withst<strong>and</strong> exposure to water.<br />

3.2.8 Residual Gas <strong>Analysis</strong>, internal water vapor content<br />

For a hermetic part suspected of having an internal moisture<br />

issue, Residual Gas Analysis (RGA) should be considered once all<br />

non-invasive tests are performed and before transitioning to invasive<br />

examinations. If a part only fails at cold<br />

temperature, an RGA test should be considered as cold<br />

temperature failures may be a result of excessive internal<br />

moisture condensing on the die surface. RGA (refer to Mil-<br />

Std-883 Method 1018) involves “Poking a Hole” through the<br />

device lid, using a vacuum to remove the interior gas <strong>and</strong><br />

performing a spectral analysis of the internal gases to<br />

determine their content. RGA can detect most of the gasses<br />

found within devices <strong>and</strong> report their individual<br />

concentrations. For water vapor, the maximum allowed<br />

concentration is typically 5000 ppm. This corresponds to a<br />

dew point (sublimation point) of -2 °C, where the partial<br />

pressure of the H2O prevents any liquid condensation.<br />
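
The relationship between the 5000 ppm limit and a sublimation point near -2 °C can be checked with a short calculation. The sketch below is an approximation under stated assumptions, not the method used by the authors: it treats the concentration as parts per million by volume, assumes an internal package pressure of about one atmosphere, and uses a commonly quoted set of Magnus-type coefficients for saturation vapor pressure over ice. The function name `frost_point_c` is hypothetical.

```python
import math


def frost_point_c(ppmv, total_pressure_hpa=1013.25):
    """Approximate frost (sublimation) point in °C for a given water
    vapor concentration.

    ppmv: water vapor concentration in parts per million by volume.
    total_pressure_hpa: assumed internal package pressure (about 1 atm).
    """
    # Partial pressure of water vapor in hPa.
    e_hpa = ppmv * 1e-6 * total_pressure_hpa

    # Magnus-type approximation over ice (assumed coefficients):
    #   e_sat(T) = a * exp(b * T / (c + T)), T in °C, e_sat in hPa
    a, b, c = 6.112, 22.46, 272.62

    # Invert the approximation to solve for T at e_sat = e_hpa.
    ln_ratio = math.log(e_hpa / a)
    return c * ln_ratio / (b - ln_ratio)
```

With these assumptions, `frost_point_c(5000)` evaluates to a little below -2 °C, consistent with the figure quoted above. A different internal pressure or coefficient set would shift the result slightly.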

3.3 Invasive Examinations; <strong>Part</strong> De-Lid / De-Process<br />

After all Non-Invasive examinations have been<br />

performed, it’s time to “Bite the Bullet” <strong>and</strong> dig deeper into<br />

the part. For cavity parts, this often involves a process called<br />

“delidding” where the device lid is removed, often by grinding<br />

down the lid around the seal ring or weld seal. For plastic<br />

parts, a chemical vapor deprocessing (dissolving) of the<br />

encapsulant material must be performed. In either case, the<br />

goal is to expose the top chip surface to allow for visual<br />

examination. As “Flip Chip” devices become more popular,<br />

chip to substrate “de-stacking” will be required. For this<br />

process, sending the parts back to the original manufacturer is<br />

recommended. If a cavity device has been determined to<br />

contain an internal particle via X-ray or PIND testing, one<br />

technique that can be used to capture the particle is to first<br />

grind down the lid in one corner to the point where the cover<br />

thickness in the corner is very thin, then try to “shake” the<br />

particle down to that corner. Finally the corner can be<br />

carefully peeled back, exposing the particle of interest. A<br />

second option is to punch a small hole in the thinned lid and<br />

cover it with adhesive tape. The part can then be run on the<br />

PIND tester until the noise stops. This procedure results in the<br />

particle being stuck on the tape.<br />

Figure 2 presents a part with the lid removed, for a failure<br />

associated with a melted wire bond.<br />

Figure 2. Device with lid removed, revealing open wire<br />

bond.<br />

3.3.1 Die Exams<br />

Once the top surface of the die is exposed, a microscopic<br />

die exam should be performed to look for obvious issues, such<br />

as damaged metal traces, die cracks, broken or damaged<br />

wirebonds, etc.<br />

These examinations are typically performed using a<br />

microscope at magnifications of 100X to 1000X. Deep UV<br />

optical microscopes can reach 16,000X magnification <strong>and</strong> are<br />

capable of resolving 10 microns. Microscopes equipped with<br />

both dark <strong>and</strong> light field illumination are helpful, as changing<br />

the lighting conditions can help reveal anomalies.<br />

Photographs should be taken to document the condition of the<br />

die <strong>and</strong> to record any anomalies.<br />

3.3.2 Die Probing<br />

If the failure analyst is familiar with the part die, probing<br />

using micro-manipulators <strong>and</strong> special probes can be<br />

performed to determine if any die metallization traces are<br />

shorted or open or to confirm an internal bias level. Detailed<br />

knowledge of the die design is necessary when performing<br />

this type of probing.<br />

3.3.3 Thermal imaging of die<br />

Quite often, defects on semiconductor die are associated<br />

with “hot spots”. These hot spots can be associated with<br />

shorts or circuits that are otherwise operating hotter than<br />

expected. There are two commonly used techniques to look<br />

for hot spots: an IR microscope or liquid crystal die thermal<br />

mapping. Both techniques require the die to be biased, so it<br />

needs to be in a state where the leads can be connected or the<br />

die pads can be probed <strong>and</strong> voltages applied. The resolution<br />

of IR microscopes is on the order of 1 to 5 microns. The more<br />

accurate technique, especially when looking for point-site<br />

defects, is liquid crystal die thermal mapping. While a<br />

calibrated IR microscope can provide an actual die temperature<br />

measurement, the liquid crystal technique shows a relative<br />

hot spot as the liquid crystals change color with temperature.<br />

Its higher spatial resolution helps determine exactly where the<br />

hot spot lies on the die. Once the hot spot is located, it can<br />

be further examined using a high-power microscope,<br />

SEM or FIB, as discussed below.<br />

3.3.4 Wire Bond Pull Test (NDPT <strong>and</strong> DPT)<br />

As part of the invasive <strong>failure</strong> analysis examination, wire<br />

bonds should be checked, especially if a bad interconnect is<br />

suspected. A non-destructive pull test (NDPT) can be<br />

performed first (refer to Mil-Std-883 Method 2023) followed<br />

by an electrical retest of the part (if necessary). If a high<br />

resistance bond is still suspected, a destructive bond pull test<br />

(DPT) should be performed (refer to Mil-Std-883 Method<br />

2011). Wire bond pull strength depends on the type (Au, Al,<br />

etc.) <strong>and</strong> diameter of the wire. To gauge the proper bond pull<br />

strength, the “post-seal” bond strength requirements of<br />

Method 2011 should be considered (~ 80% of initial pull<br />

strength), to allow for some loss of bond strength with time<br />

<strong>and</strong> thermal exposure. For thermo-compression or thermosonic<br />

ball bonds, any bond pull failure where the entire ball<br />

bond lifts off of the pad should be examined in more detail.<br />

These kinds of “ball lifts” are quite often a result of<br />

“Kirkendall voiding” <strong>and</strong> could represent a fundamental wire<br />

bond issue with the part. Section 5.1.1 discusses additional<br />

wire bond issues.<br />
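
The ~80% post-seal allowance described above can be turned into a quick screening calculation. The following is a minimal sketch only: the function names and the 3.0 gram-force pre-seal limit used in the usage note are hypothetical, and the actual minimum bond strength for a given wire type and diameter must be taken from Mil-Std-883 Method 2011.<br />

```python
def post_seal_pull_limit(pre_seal_limit_gf, retention=0.80):
    """Approximate post-seal minimum pull strength in gram-force.

    pre_seal_limit_gf: minimum pre-seal pull strength for the wire type and
        diameter (looked up in the governing test method, not computed here).
    retention: fraction of initial strength assumed to remain (~0.80 per the text).
    """
    if pre_seal_limit_gf <= 0:
        raise ValueError("pre-seal limit must be positive")
    if not 0 < retention <= 1:
        raise ValueError("retention must be in (0, 1]")
    return pre_seal_limit_gf * retention


def classify_pull(measured_gf, pre_seal_limit_gf, ball_lift=False):
    """Classify one destructive pull result against the post-seal limit."""
    if ball_lift:
        # The text calls for a closer look at any full ball lift,
        # regardless of the measured force.
        return "examine: ball lift (possible Kirkendall voiding)"
    limit = post_seal_pull_limit(pre_seal_limit_gf)
    return "pass" if measured_gf >= limit else "fail: below post-seal limit"
```

For example, with an assumed pre-seal limit of 3.0 gf, `classify_pull(2.5, 3.0)` passes, while a bond that lifts the entire ball is flagged for examination even at a high pull force.<br />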

3.3.5 Cross Sectioning<br />

Cross-Sectioning is a very important means of failure<br />

analysis. It is often used for connector, printed wiring board,<br />

substrate, solder joint, capacitor, resistor, transformer,<br />

transistor <strong>and</strong> diode failure analysis. Cross-sectioning of<br />

semiconductor die can also be performed using a Focused Ion<br />

Beam (FIB). More information on FIB techniques is<br />

provided in Section 3.3.8. Prior to cross-sectioning, the<br />

sample is usually potted in a hard-setting acrylic or polyester<br />

resin. Cross-Sectioning is exactly as the name implies; the<br />

failed item is literally cut in a cross-sectioned fashion then<br />

highly polished to allow detailed microscopic examinations to<br />

be made. The potted sample can be cut in half initially to<br />

target the failure site, or the cross-section can commence at<br />

one end of the sample <strong>and</strong> then progressively continue up to<br />

<strong>and</strong> through the failure site. This progressive cross-sectioning<br />

can provide a “3D” view of the failure site. Of course,<br />

photographs should be taken at all cross-section points for<br />

documentation. Figure 3 is a cross-section of a solder joint.<br />

Figure 3. Solder Joint Cross-Section<br />

3.3.6 Scanning Electron Microscope (SEM)<br />

A Scanning Electron Microscope is an important tool for<br />

semiconductor die failure analysis, as well as metallurgical<br />

failure analysis. The SEM can provide detailed images of up<br />

to 120,000X magnification, with typical magnifications of<br />

50,000 to 100,000X <strong>and</strong> feature resolution down to 25<br />

Angstroms. NANO SEMs can resolve features down to 10<br />

Angstroms.<br />

With a SEM image, the depth of field is fairly large,<br />

thereby providing a better overall three-dimensional view of<br />

the sample. While high-power microscopes can reach 1000X,<br />

the depth of field is usually very small <strong>and</strong> only features in a<br />

single plane can be examined. SEM examinations are often<br />

used to verify semiconductor die metallization integrity <strong>and</strong><br />

quality (refer to Mil-Std-883 Method 2018). Figure 4<br />

presents a SEM photo of a FET gate metallization structure.<br />

3.3.7 EDS/EDX<br />

Energy dispersive X-ray analysis, alternately known as<br />

EDS, EDAX or EDX, is a technique used along with a SEM<br />

to identify the elemental composition of a sample. During<br />

EDS, a sample is exposed to an electron beam inside the SEM.<br />

These electrons collide with the electrons within the sample,<br />

causing some of them to be knocked out of their orbits. The<br />

vacated positions are filled by higher energy electrons that<br />

emit X-rays in the process.<br />

By spectrographic analysis of the emitted X-rays, the<br />

elemental composition of the sample can be determined. EDS<br />

is a powerful tool for microanalysis of elemental constituents<br />

[2].<br />
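
The spectrographic step described above amounts to matching measured X-ray peak energies against known characteristic lines. The sketch below illustrates the idea with a very small lookup table; the line energies are rounded reference values, and the function name, tolerance and element selection are assumptions made for illustration. Commercial EDS software uses far larger line libraries and also handles peak overlaps.<br />

```python
# Approximate characteristic X-ray line energies in keV (rounded values).
# The selection of elements is illustrative, not exhaustive.
LINES_KEV = {
    "Al K-alpha": 1.487,
    "Si K-alpha": 1.740,
    "Au M-alpha": 2.123,
    "Ag L-alpha": 2.984,
    "Sn L-alpha": 3.444,
    "Cu K-alpha": 8.048,
}


def match_peaks(peaks_kev, tolerance_kev=0.03):
    """Map each measured peak to the closest known line, or None if no
    line lies within the tolerance."""
    matches = {}
    for peak in peaks_kev:
        name, energy = min(LINES_KEV.items(), key=lambda kv: abs(kv[1] - peak))
        matches[peak] = name if abs(energy - peak) <= tolerance_kev else None
    return matches
```

A peak that matches nothing in the table comes back as `None`, which is a reminder that an unmatched peak needs a fuller line library rather than a forced assignment.<br />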

Figure 4. SEM photo of a FET gate metallization structure<br />

3.3.8 Focused Ion Beam (FIB)<br />

The Focused Ion Beam is a tool where an ion beam<br />

(typically a Gallium Liquid Metal Ion Source (LMIS)) is used<br />

to microscopically mill or ablate (e.g. ion milling) material<br />

away to allow for cross-sectioning of semiconductor die.<br />

Tungsten ion beams may also be used. The FIB cross-sections<br />

can be examined by Scanning Electron Microscope (SEM) to<br />

see features such as die metallization construction, pinholes in<br />

dielectrics (oxides/nitrides), <strong>and</strong> any EOS or ESD damage sites.<br />

The FIB cross sections are very “polished” revealing features<br />

at 100 Angstrom resolution. The FIB can also be used to cut<br />

semiconductor metallization lines to isolate circuitry on the<br />

die <strong>and</strong>, if necessary, a Platinum ion beam can be used to<br />

actually deposit metallization <strong>and</strong> create new circuit traces. In<br />

this case, die level design changes (known as “Device<br />

Editing”) can be implemented to allow for a design “try-out”.<br />

Figure 5 presents a FIB cross-section of a FET gate structure<br />

(see cut-out site in Figure 4).<br />

3.3.9 Auger Electron Spectroscopy (AES)<br />

Auger (“O-J”) analysis is a technique where samples are<br />

exposed to an electron beam designed to dislodge secondary<br />

electrons (otherwise known as Auger electrons) from the<br />

materials being examined. The materials can be identified by<br />

the different energy level spectra unique to each material’s<br />

valence bands. Auger detection systems are useful for<br />

detecting organic materials on the surface of the die since<br />

Auger is more sensitive to lighter elements than EDS.<br />

While some depth profiling is possible, it is usually useful<br />

only to a depth of 1 µm. Auger, like EDS, is an elemental technique that<br />

provides little compound information, but is most useful<br />

because it analyzes only the near surface region (~50<br />

Angstroms analysis depth). Figure 6 presents an Auger profile<br />

of the contamination on the surface of a wire bond pad.<br />

Figure 5. FIB cross-section of a FET gate structure<br />

(see cut-out site in Figure 4)<br />

Figure 6. Auger profile of contamination on the surface of a<br />

wire bond pad.<br />

Secondary Ion Mass Spectrometry (SIMS) is a technique that can detect very low<br />

concentrations of dopants <strong>and</strong> impurities. By ion milling<br />

deeper into the sample, SIMS can provide elemental depth<br />

profiles over a depth range from a few angstroms to tens of<br />

microns. SIMS works by sputtering the sample surface with a<br />

beam of primary ions. Secondary ions formed during<br />

sputtering are analyzed with a mass spectrometer. Detection<br />

limits can range down to sub-parts-per-million trace<br />

levels [3].<br />

Advanced SIMS analyses, such as Time-of-Flight SIMS<br />

(TOF-SIMS) <strong>and</strong> Dynamic SIMS (D-SIMS), provide<br />

additional means of elemental detection <strong>and</strong> resolution.<br />

3.3.10 Fourier Transform Infrared Spectroscopy (FTIR)<br />

Fourier Transform Infrared Spectroscopy is an analytical<br />

technique used primarily to identify organic materials, such as<br />

solder flux contamination associated with a part failure. The<br />

FTIR reveals infrared absorption spectra that provide<br />

information about the chemical bonds <strong>and</strong> molecular structure<br />

of a material. The FTIR spectrum is like a "fingerprint" of the<br />

material; however, the fingerprint itself is not like a typical<br />

spectrum with known peaks for each element. When running<br />

an FTIR analysis, it helps to compare FTIR spectra to<br />

those of known samples, as it can be difficult to determine the exact<br />

components of the material just from the spectra themselves.<br />

Cataloged FTIR spectra exist to help identify the materials.<br />

FTIR spectra of the materials most suspected of being the culprit<br />

are often taken <strong>and</strong> then compared to the contamination<br />

sample’s FTIR “fingerprint”. Unfortunately, most FTIR<br />

equipment requires a fairly large sample of the material in<br />

question, which is often not available with typical failures [4].<br />
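
The "fingerprint" comparison described above can be sketched as a similarity ranking between the contamination spectrum and a set of reference spectra. This is a minimal illustration that assumes all spectra have already been resampled onto the same wavenumber grid; the function names and the use of a Pearson correlation score are assumptions, not a description of any particular FTIR library search algorithm.<br />

```python
import math


def pearson(a, b):
    """Pearson correlation between two equally sampled spectra."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("spectra must share the same grid (length >= 2)")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a flat spectrum carries no fingerprint information")
    return cov / math.sqrt(var_a * var_b)


def rank_references(sample, references):
    """Return (name, score) pairs sorted from best to worst match."""
    scores = {name: pearson(sample, spec) for name, spec in references.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

A high score only suggests a candidate material. As the text notes, confirmation still relies on comparing against spectra taken from the suspected materials themselves.<br />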

3.3.11 TEM (transmission electron microscopy)<br />

STEM (scanning transmission electron microscopy)<br />

Transmission Electron Microscopy (TEM) <strong>and</strong> Scanning<br />

Transmission Electron Microscopy (STEM) use a high energy<br />

electron beam to image through an ultra-thin sample, thereby<br />

allowing for image resolutions on the order of 1 - 2<br />

Angstroms. S/TEM has better spatial resolution than a<br />

standard SEM <strong>and</strong> is capable of additional analytical<br />

measurements. However, S/TEM requires significantly more<br />

sample preparation as samples need to be very thin, created by<br />

using FIB techniques.<br />

S/TEM provides outstanding image resolution, making it<br />

possible to characterize crystallographic phase,<br />

crystallographic orientation (both by diffraction mode<br />

experiments), produce elemental maps (using EDS), <strong>and</strong><br />

generate images that highlight elemental contrast (dark field<br />

mode)—all from nm sized areas that can be precisely located<br />

[5].<br />
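
The resolution figures quoted in Sections 3.3.3 through 3.3.11 are given in a mix of microns and Angstroms, which makes them awkward to compare. The sketch below collects them in nanometers and lists which tools can resolve a feature of a given size. The dictionary keys and function name are illustrative, and the S/TEM entry uses the coarse end of the quoted 1 to 2 Angstrom range.<br />

```python
ANGSTROM_NM = 0.1    # 1 Angstrom = 0.1 nm
MICRON_NM = 1000.0   # 1 micron = 1000 nm

# Resolution values as quoted in the text, converted to nanometers.
RESOLUTION_NM = {
    "IR microscope (coarse end)": 5 * MICRON_NM,
    "IR microscope (fine end)": 1 * MICRON_NM,
    "FIB cross-section": 100 * ANGSTROM_NM,
    "SEM": 25 * ANGSTROM_NM,
    "Nano SEM": 10 * ANGSTROM_NM,
    "S/TEM": 2 * ANGSTROM_NM,
}


def tools_resolving(feature_nm):
    """List the tools whose quoted resolution is at or below feature_nm."""
    return [name for name, res in RESOLUTION_NM.items() if res <= feature_nm]
```

Resolution is only one selection criterion. Sample preparation effort, cost and whether the technique is destructive also drive the choice, as the surrounding sections explain.<br />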

3.3.12 ESD Testing<br />

If a part is suspected to be damaged by Electrostatic<br />

Discharge (ESD), it is advisable to subject a known good part<br />

to ESD testing <strong>and</strong> compare the results to the failed device in<br />

question (Reference Mil-Std-883 Method 3015, JEDEC <strong>and</strong><br />

ESD Association Std ANSI/ESDA/JEDEC JS-001-2010).<br />

4. SUGGESTIONS FOR YOUR OWN FAILURE ANALYSIS<br />

CAPABILITIES<br />

This section provides some suggestions for establishing<br />

<strong>Failure</strong> <strong>Analysis</strong> capabilities for a typical electronics firm.<br />

Three levels of <strong>Failure</strong> <strong>Analysis</strong> capabilities are suggested:<br />

Basic, Moderate <strong>and</strong> Advanced. Beyond these three levels,<br />

one might consider using commercial failure analysis<br />

laboratories for the more esoteric capabilities such as TEM,<br />

STEM or SIMS. Usually it is more cost effective to<br />

subcontract those types of analyses than to establish the<br />

capabilities in-house.<br />

Basic <strong>Failure</strong> <strong>Analysis</strong> Lab<br />

• Basic Meters (DVMMs)<br />

• Stereo Microscope (10X to 30X)<br />

(Preferably with digital camera)<br />

• Cross Sectioning Equipment<br />

• Power Supplies / Signal generator<br />

• Oscilloscope<br />

Moderately Equipped <strong>Failure</strong> <strong>Analysis</strong> Lab<br />

• SEM<br />

• Curve Tracer<br />

• Metallurgical Microscope (1000X)<br />

(Preferably with digital camera)<br />

• Chemical hood with decapsulating chemicals<br />

• Die Probe Station<br />

• Liquid Crystal<br />

• Film X-ray<br />

Advanced <strong>Failure</strong> <strong>Analysis</strong> Lab<br />

• Real Time X-ray<br />

• SEM/EDS<br />

• FIB<br />

• Auger <strong>Analysis</strong> System<br />

• RF Test Equipment (If necessary)<br />

5. UNDERSTANDING ELECTRONIC PART FAILURE<br />

MECHANISMS [6]<br />

Excerpts from the 1997 Alan O. Plait Award for Tutorial<br />

Excellence<br />

This section describes failure mechanisms commonly<br />

encountered with electronic parts. Figure 7 illustrates three<br />

common part styles: a Transistor, a Hybrid, <strong>and</strong> an Integrated<br />

Circuit (IC). The Hybrid contains multiple devices, including<br />

resistors <strong>and</strong> capacitors, along with semiconductors <strong>and</strong> ICs.<br />

Figure 7. Typical Transistor, Hybrid <strong>and</strong> IC<br />

In this section, examples of failures specific to each part<br />

type are reviewed, with guidelines to help choose the most<br />

effective corrective action. There are five subjects covered:<br />

• Interconnects<br />

• Semiconductor elements<br />

• Passive elements<br />

• Substrates<br />

• Packages.<br />

5.1 Interconnects<br />

Interconnects within components connect circuit elements<br />

<strong>and</strong> substrates to each other <strong>and</strong> to the device package. Wire<br />

bonding is used to electrically connect circuit elements to<br />

substrates, to package pins, <strong>and</strong> to other circuit elements<br />

within a package. Soldering is used both to physically attach<br />

circuit elements to substrates or package headers <strong>and</strong> to<br />

physically attach substrates to package headers. It also<br />

provides a thermal path for heat dissipation. In many cases,<br />

soldering also serves to establish an electrical connection.<br />

Epoxy serves the same basic function as solder, to attach<br />

circuit elements to substrates or headers. Conductive epoxy is<br />

used in place of nonconductive epoxy when an electrical<br />

connection is also needed.<br />

5.1.1 Wire Bonding<br />

Wire bonding in microelectronics is generally performed<br />

in one of two ways: thermo-sonic ball <strong>and</strong> stitch bonding or<br />

ultrasonic wedge bonding. In thermo-sonic wire bonding, fine<br />

gold wire (typically 1 mil diameter) is used on a heated stage<br />

(~150°C). A ball is formed at the end of the wire via an<br />

electronic arc (older machines used a hydrogen gas flame) <strong>and</strong><br />

the ball is bonded to the contact bond pad by the heat of the<br />

stage, the force <strong>and</strong> ultrasonic energy applied by the wire<br />

bonding machine capillary. This is called a ball-bond. The<br />

capillary is then raised <strong>and</strong> moved to the next bonding site<br />

where temperature <strong>and</strong> pressure form another bond (called a<br />

stitch bond). In ultrasonic bonding, aluminum wire is<br />

generally used. There is no heated stage used in this process<br />

<strong>and</strong> the pressure of the wire bonding machine on the wire is<br />

incidental. Most of the energy is supplied by high-frequency<br />

acoustical movement of the wire against the bonding area.<br />

This energy is sufficient to break through the oxides<br />

surrounding the wire or bonding surface. The wire is cut<br />

instead of being flamed off.<br />

The reliability of a wire bond using any of these methods<br />

is affected by bond placement, wire dress, bonding energy,<br />

bonding temperature, bondability of the surface, <strong>and</strong> any<br />

dissimilar metals used.<br />

Incorrect bond placement on a bonding pad can result in<br />

shorts to nearby metallization tracks. This can also result from<br />

using a wire diameter that is too large for the bonding target. Wire<br />

dress refers to how wire bonds are routed <strong>and</strong> to the amount of<br />

stress relief used in the wire. Improper routing can cause wire<br />

bonds to short to other wire bonds or to conductors in a<br />

package. Insufficient stress relief can cause wires to break or<br />

lift off of the bond pad during thermal excursions. Excessive<br />

stress relief can allow a wire bond to short to the lid of the<br />

package.<br />

Bonding energy is the amount of energy used to form the<br />

bond. In ultrasonic wire bonding, excessive bonding energy<br />

(ultrasonic) can result in an unacceptable thinning of the wire<br />

at the heel or in microcracking in the underlying silicon. This<br />

could lead to a break in the wire at the heel or a chipout at the<br />

bond pad. In thermo-sonic wire bonding, too much pressure<br />

can deform the ball <strong>and</strong> cause damage to the bond pad.<br />

Insufficient bonding energy can cause weak bonds with all<br />

technologies. The bonding temperature is important in<br />

thermo-sonic bonding. If the bonding temperature is too low, a<br />

weak bond may result. The use of dissimilar metals, usually<br />

gold <strong>and</strong> aluminum, can also be a source of failures. While the<br />

formation of gold/aluminum intermetallics is necessary to<br />

form a metallurgical bond between the two metals, voiding at<br />

the intermetallic sites (Kirkendall voiding) can cause high<br />

electrical resistance <strong>and</strong> low mechanical strength. Bondability<br />

refers to the ability of the two bonding surfaces to form a good<br />

bond. Contamination by foreign substances, incomplete<br />

photoresist removal, incomplete oxide removal, or incomplete<br />

nitride removal all affect bondability. This may result in the<br />

inability to form a bond or in a weak, highly resistive bond<br />

that will eventually fail. Contamination can greatly increase<br />

the formation of Kirkendall voids in a bimetallic system.<br />

5.1.2 Soldering<br />

Soldering is used in microelectronic parts to attach circuit<br />

elements to a substrate or a package header <strong>and</strong> substrates to<br />

package headers. Eutectic bonding, the attachment of circuit<br />

elements to a package header or substrate using a eutectic<br />

material system, will also be discussed in this section. The<br />

eutectic composition of a material system (if there is one) is<br />

the composition of elements that gives the lowest melting<br />

temperature. The most common eutectic attachment system<br />

used in microelectronics is the gold/silicon system, which<br />

melts at about 370°C. Die attach serves three basic functions<br />

in a part: it physically attaches the circuit elements to a<br />

substrate or header, it provides a thermal path for heat<br />

dissipation, <strong>and</strong>, in many cases, it provides an electrical<br />

connection for the circuit. The optimum die attach would have<br />

100% of the die's underside in contact with the header or<br />

substrate. In reality, due to either surface irregularities (die,<br />

substrate), a die attachment process problem, or<br />

contamination, the die attach usually contains some voiding.<br />

The voids interrupt the thermal path used to remove the heat<br />

from the die. Depending on the severity of the voiding <strong>and</strong> the<br />

power dissipation in the die, the die may fail from<br />

overheating. In extreme cases, poor die attach can result in an<br />

electrically open condition <strong>and</strong> the die breaking free of the<br />

header or substrate (refer to Figure 8).<br />
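
The effect of die attach voiding on the thermal path can be estimated with a first-order model. The sketch below assumes one-dimensional conduction through the attach layer and treats voided area as carrying no heat; it ignores heat spreading and the location of the voids relative to the heat sources, so it is a rough illustration rather than a thermal analysis. The function names and all numerical inputs are hypothetical.<br />

```python
def die_attach_thermal_resistance(thickness_m, conductivity_w_mk,
                                  die_area_m2, void_fraction=0.0):
    """1-D conduction estimate: R = t / (k * A_effective), in K/W.

    void_fraction is the fraction of the die area with no attach material.
    """
    if not 0.0 <= void_fraction < 1.0:
        raise ValueError("void_fraction must be in [0, 1)")
    effective_area = die_area_m2 * (1.0 - void_fraction)
    return thickness_m / (conductivity_w_mk * effective_area)


def attach_temperature_rise(power_w, thickness_m, conductivity_w_mk,
                            die_area_m2, void_fraction=0.0):
    """Temperature drop across the attach layer for a given dissipation."""
    r_th = die_attach_thermal_resistance(
        thickness_m, conductivity_w_mk, die_area_m2, void_fraction)
    return power_w * r_th
```

Under these assumptions, 50% voiding doubles the thermal resistance of the attach layer, which illustrates why the severity of voiding has to be weighed against the power dissipated in the die.<br />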

Substrate attach using solder is similar to die attach with<br />

solder. Various active <strong>and</strong> passive elements are bonded to a<br />

substrate that is then soldered to a package. Substrate attach<br />

affords the substrate the same benefits that die attach affords<br />

the die in that it provides the substrate with physical<br />

attachment, a thermal path, <strong>and</strong> in some cases, an electrical<br />

path. Voiding in the substrate attach solder is a major<br />

concern.<br />

Indium solder joints, which are used for their ductility,<br />

can corrode when subjected to high humidity<br />

environments. Therefore, it is important to assemble the<br />

device in a dry environment <strong>and</strong> ensure it is contained in a<br />

hermetically sealed package.<br />

Indium <strong>and</strong> gold solder joints also form extremely brittle<br />

intermetallics when exposed to temperatures above 70 to 80°C,<br />

under humid or dry conditions.<br />

Figure 8. Poor Die Attachment<br />

5.1.3 Epoxy<br />

Epoxy can be used instead of solder in many<br />

microelectronic part assembly processes. Epoxies, both<br />

conductive (usually silver filled) <strong>and</strong> nonconductive, can be<br />

applied to accomplish die attach <strong>and</strong>/or substrate attach <strong>and</strong><br />

have become more popular as the quality of micro-electronic<br />

grade epoxies has improved.<br />

Conductive epoxy is selected when an electrical<br />

connection is also required. The advantages of using epoxy<br />

include ease of application, low temperature curing, <strong>and</strong><br />

reworkability. Epoxies do, however, display several failure<br />

mechanisms. Improperly cured epoxy can outgas inside a<br />

hermetic package after it has been sealed, releasing moisture<br />

<strong>and</strong> ionic contaminants into the internal cavity of the package.<br />

Because of their inherent charge, these ionic contaminants<br />

may shift the electrical parameters of electronic devices in the<br />

package. This is of particular concern when Metal Oxide<br />

Semiconductor (MOS) devices are present. Adhesive ionic<br />

contaminant issues can be mitigated by selecting epoxies that<br />

meet Mil-Std-883 Method 5011 requirements. Poor adhesion<br />

of an epoxy to either the die or the substrate is another failure<br />

mechanism for epoxy. This type of failure is usually caused by<br />

improper cleaning or abrading of either joining surface.<br />

If stable electrical resistance of the attachment is critical<br />

to circuit performance, conductive epoxy may not be the best<br />

choice as earlier formulations exhibited changes in the<br />

electrical resistance over time. It can also be affected by<br />

factors such as temperature <strong>and</strong> humidity. Electrolytic<br />

corrosion can occur in silver filled conductive epoxy when<br />

sufficient moisture is present in a package. The silver from the<br />

epoxy is corroded by the moisture <strong>and</strong> by other substances in<br />

the epoxy. It can then be transported under the influence of an<br />

electric field in the package <strong>and</strong> cause shorting to adjacent<br />

metallization tracks or components.<br />

5.2 Semiconductor Elements<br />

Semiconductor elements include discrete diodes, discrete<br />

transistors, <strong>and</strong> integrated circuits. The semiconductor<br />

elements can be packaged individually or grouped together in<br />

a hybrid configuration. Semiconductor element failures can be<br />

broken down into three categories: metallization failures,<br />

oxide failures, <strong>and</strong> failures induced by overstress.<br />

5.2.1 Metallization<br />

Metallization on a semiconductor element is a thin film<br />

pattern of metal deposited on a chip to connect electronic<br />

components contained on the chip or to establish contacts that<br />

may be connected externally. Metallization failures generally<br />

result in electrical opens, although shorts may also be<br />

experienced. Metallization failures can be divided into the<br />

following specific categories: step coverage, electromigration,<br />

misalignment, corrosion, mechanical damage, <strong>and</strong> stress<br />

voiding.<br />

Step coverage on a semiconductor element refers to the<br />

thickness of a material deposited on an area with an uneven<br />

topography. A change in the vertical direction is called a step.<br />

The metallization (usually aluminum) over a step<br />

is allowed to thin to 50% of the metal thickness over a flat<br />

area. If step coverage is poor (less than 50%), open circuits<br />

can result. Modern ICs have multilayer planarized<br />

metallization, which eliminates many of the step issues.<br />

Electromigration of metal results in an open circuit<br />

condition. Electromigration occurs when thermally activated<br />

aluminum ions are physically moved by momentum<br />

exchange with flowing electrons. Electromigration failures<br />

are a function of the current density in an aluminum conductor<br />

<strong>and</strong> its temperature. Usually, design rules preclude the allowable<br />

current density from being exceeded. Mil-Prf-38535, for<br />

example, specifies that the current density for glassivated<br />

aluminum metallization shall not exceed $5 \times 10^{5}$ A/cm$^{2}$ for case<br />

operating temperatures up to 125°C. Defects in the<br />

metallization, such as poor step coverage or voiding, can<br />

allow localized areas of current constriction to occur.<br />

Misapplication of a device in a circuit can also lead to<br />

excessive current densities. Misaligned metallization on an<br />

integrated circuit can result in poor contact to active circuit<br />

elements or to other metallization levels. This type of defect<br />

is caused by poorly aligned masks during fabrication. <strong>Failure</strong>s<br />

in the form of opens can result from this defect.<br />
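
The current density limit quoted above for electromigration, together with the step coverage discussion, lends itself to a simple check. The sketch below computes the current density in a rectangular conductor on a flat area and at a thinned step, then compares the worse case with the glassivated aluminum limit cited in the text. The function names and the example trace dimensions are hypothetical, and real design rule checks account for more than cross-sectional area.<br />

```python
J_LIMIT_A_PER_CM2 = 5e5  # glassivated Al limit quoted in the text


def current_density_a_per_cm2(current_a, width_um, thickness_um):
    """Current density in a rectangular conductor (1 um = 1e-4 cm)."""
    area_cm2 = (width_um * 1e-4) * (thickness_um * 1e-4)
    return current_a / area_cm2


def check_trace(current_a, width_um, flat_thickness_um, step_coverage=1.0):
    """Compare flat-area and over-step current density with the limit.

    step_coverage: metal thickness over a step divided by the thickness
    over a flat area (0.5 corresponds to the 50% allowance in the text).
    """
    if not 0.0 < step_coverage <= 1.0:
        raise ValueError("step_coverage must be in (0, 1]")
    j_flat = current_density_a_per_cm2(current_a, width_um, flat_thickness_um)
    j_step = current_density_a_per_cm2(
        current_a, width_um, flat_thickness_um * step_coverage)
    return {
        "j_flat": j_flat,
        "j_step": j_step,
        "within_limit": j_step <= J_LIMIT_A_PER_CM2,
    }
```

For an assumed 5 mA in a 2 um wide, 1 um thick trace, the flat-area density is about 2.5e5 A/cm^2, which is within the limit, but a step coverage of 0.4 raises the local density above it. This is the current constriction effect the paragraph describes.<br />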

Corrosion of aluminum metallization is another failure<br />

mechanism. Corrosion can occur due to the introduction of<br />

contaminants during processing or due to moisture penetrating<br />

into the cavity of a non-hermetic package. Aluminum bond<br />

pads are especially susceptible because they are not<br />

passivated. Corrosion can also occur if moisture is<br />

inadvertently sealed in a hermetic package.<br />

Mechanical damage to metallization can be introduced<br />

during probing or handling. This is especially true in hybrid<br />

microcircuits, which are exposed to a large number of<br />

assembly steps. Mechanical damage to metallization can<br />

result in shorts or opens. Stress voiding is a relatively new<br />

failure mechanism that has been identified. Voids form in the<br />

aluminum metallization on an integrated circuit due to a<br />

tensile stress that is exerted on it by the passivation. The voids<br />

tend to occur at aluminum grain boundaries. Void formation is<br />

highly dependent on device geometry, processing, <strong>and</strong> the<br />

particular metallization system used.<br />

5.2.2 Overstress<br />

Overstress refers to the application of voltage or current,<br />

or a combination of the two (power), to a device that exceeds<br />

its capabilities. Irreversible damage can result in the<br />

metallization, oxide, semiconductor material, etc. Overstresses<br />

can be divided into two basic groups: electrical overstress<br />

(EOS) <strong>and</strong> electrostatic discharge (ESD).<br />

Electrical overstress is one of the most common causes of<br />

failure for an electronic device. It can be a continuous event or<br />

it can be transient in nature. An EOS failure can be caused by<br />

the failure of another device in a circuit, the misapplication of<br />

a device in a circuit, or the external application of excessive<br />

power to a device. One of the most challenging aspects of<br />

failure analysis can be to determine whether a device failed<br />

from an internal defect or an external overstress. The damage<br />

that results from EOS can range from the leakage of a single<br />

gate in a Very Large Scale Integration (VLSI) device to the<br />

fusing of a discrete power transistor.<br />

Electrostatic discharge is the transfer of charge between<br />

two bodies that are at different potentials. Semiconductor<br />

elements are sensitive to ESD. Sources of static for ESD<br />

include work surfaces, plastic bags, <strong>and</strong> the human body. The<br />

ESD event itself is a transient phenomenon. It can be modeled<br />

as a capacitor discharging through a resistor. Generally,<br />

semiconductor elements exposed to sufficiently high levels of<br />

ESD will experience varying degrees of damage. Many times<br />

ESD damage is very subtle. This is because the ESD event is<br />

very short in duration, usually about 200 nanoseconds when<br />

the source is a human body. The control of ESD is now itself<br />

an industry that supports electronics manufacturers.<br />
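
The capacitor-through-resistor model mentioned above can be written out directly. The sketch below uses 100 pF and 1.5 kOhm as defaults, which are the component values commonly used for the human body model; they are assumptions here rather than figures taken from this tutorial, and an ideal RC model ignores the parasitic inductance that shapes the rise time of a real discharge. The function names are illustrative.<br />

```python
import math


def time_constant_s(c_farads=100e-12, r_ohms=1500.0):
    """RC time constant of the discharge network, in seconds."""
    return r_ohms * c_farads


def esd_discharge_current(v0_volts, t_seconds, c_farads=100e-12, r_ohms=1500.0):
    """Ideal RC discharge current: I(t) = (V0 / R) * exp(-t / (R * C))."""
    tau = time_constant_s(c_farads, r_ohms)
    return (v0_volts / r_ohms) * math.exp(-t_seconds / tau)


def stored_energy_j(v0_volts, c_farads=100e-12):
    """Energy initially stored on the capacitor: E = 0.5 * C * V0**2."""
    return 0.5 * c_farads * v0_volts ** 2
```

With these defaults the time constant is 150 ns, the same order as the roughly 200 ns event duration quoted in the text, and a 2 kV discharge starts at a peak current of about 1.3 A.<br />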

5.2.3 Oxides / Nitrides<br />

Oxides (silicon dioxide) <strong>and</strong> Nitrides (silicon nitride)<br />

serve to provide an insulating barrier between conductors or<br />

between semiconductors <strong>and</strong> conductors. They are also used<br />

as a passivation layer to protect the underlying structures.<br />

Oxides can be deposited on a silicon chip or can be thermally<br />

grown. There are three oxide/nitride failure mechanisms that<br />

will be discussed here: ionic impurities, oxide defects, <strong>and</strong> hot<br />

carrier effects. Ionic impurities can contaminate the<br />

oxide/nitride <strong>and</strong> affect device operation, particularly in MOS<br />

devices. Sodium ions, which are highly mobile in oxide,<br />

were a common impurity found in early semiconductor<br />

processes. These ions, when affected by an electrical bias, can<br />

migrate in the oxide <strong>and</strong> cause degraded device operation or<br />

failure. Generally these “Mobile Ion” failure mechanisms<br />

have been eliminated from semiconductor processing.<br />

However, if they should occur, stressing the oxide with the<br />

appropriate voltage can screen out such devices. Another<br />

failure mechanism caused by ion migration in oxide is time<br />

dependent dielectric breakdown, <strong>and</strong> it is not as easily<br />

screened out. In this case, the ions are emitted into the oxide<br />

from a gate metal during the operation of the device. Again,<br />

degraded performance or failure can result. There are design<br />

criteria that are used to limit this phenomenon.<br />

Physical defects in an oxide can cause failure, particularly<br />

in the thin gate oxides of MOS devices. Pin holes in the oxide<br />

can reduce its dielectric strength <strong>and</strong> result in breakdown.<br />

Severely thinned oxides can also reduce dielectric strength <strong>and</strong><br />

cause breakdown.<br />

Hot carrier electrons can cause failures in integrated<br />

circuits. Hot carrier electrons are very energetic electrons<br />

which can affect the oxide by forming trapped charge regions,<br />

resulting in device failure. They are more troublesome in<br />

small geometry devices (found in VLSI devices), where<br />

geometries are shrunk but operating voltages (usually +3.3<br />

volts, down to +1.0 volts) are held constant. Unique VLSI<br />

processing techniques can leave subtly damaged oxide which<br />

may result in more trapped charges.<br />

5.3 Passive Elements<br />

Passive elements used in microelectronics include<br />

resistors <strong>and</strong> capacitors. There are many different types of<br />

capacitors to choose from. Ceramic capacitors are probably<br />

the most common style of capacitor used. When a large<br />

amount of capacitance is required in a small volume, tantalum<br />

is usually the choice. Resistors can be produced by both thick<br />

film <strong>and</strong> thin film technologies.<br />

5.3.1 Capacitors<br />

Ceramic capacitors are named for their ceramic dielectric<br />

material. They generally have two sets of interleaved plates<br />

to increase the area of the plates <strong>and</strong> thereby increase their<br />

capacitance. End caps are added to join the two sets of plates.<br />

One of the most common failure modes for ceramic capacitors<br />

occurs during their soldering to substrates or boards. The<br />

thermal shock associated with the soldering of the capacitors<br />

can cause the ceramic to crack. If the crack extends between<br />

plates of opposite polarity, the device's dielectric breakdown<br />

voltage will drop off <strong>and</strong> the device will short when voltage is<br />

applied to it. Barrier metals must be used on end caps so that<br />

solder leaching will not occur.<br />
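The effect of interleaving on capacitance can be sketched with the standard parallel-plate relation, summed over the dielectric layers between adjacent plates. This is a minimal sketch that ignores fringing and margin effects; the function name and all numeric values are illustrative assumptions, not data from this tutorial.

```python
EPSILON_0 = 8.854e-12  # vacuum permittivity, F/m


def multilayer_capacitance_f(rel_permittivity: float,
                             overlap_area_m2: float,
                             dielectric_thickness_m: float,
                             num_plates: int) -> float:
    """Ideal capacitance of a stack of interleaved plates.

    N plates form N - 1 dielectric layers that act in parallel.
    """
    if num_plates < 2:
        raise ValueError("at least two plates are required")
    if dielectric_thickness_m <= 0:
        raise ValueError("dielectric thickness must be positive")
    active_layers = num_plates - 1
    return (EPSILON_0 * rel_permittivity * overlap_area_m2
            * active_layers / dielectric_thickness_m)


if __name__ == "__main__":
    # Hypothetical geometry: 1 mm x 1 mm overlap, 10 micron layers, k = 3000.
    for plates in (2, 11, 101):
        c = multilayer_capacitance_f(3000.0, 1e-6, 10e-6, plates)
        print(f"{plates:3d} plates -> {c * 1e9:.2f} nF")
```

For a fixed footprint, capacitance grows in proportion to the number of dielectric layers. A crack that bridges plates of opposite polarity defeats exactly this layered structure.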

Tantalum capacitors use tantalum pentoxide as a<br />

dielectric. Their typical failure mode involves cracked<br />

connections.<br />

5.3.2 Resistors<br />

The term thick film resistor refers to the way the resistor<br />

was fabricated. Thick film technology is a field of<br />

microelectronics in which special pastes are silk-screened onto<br />

a ceramic substrate <strong>and</strong> then fired at high temperature to bond<br />

the films to the substrate. Thick film resistors are widely used<br />

in hybrids. <strong>Failure</strong> mechanisms include poor adhesion <strong>and</strong><br />

EOS. Thin film resistors are fabricated utilizing thin film<br />

technology. Thin film technology refers to the deposition of a<br />

material (usually less than 5 microns in thickness) onto a<br />

substrate by vacuum deposition or sputtering. <strong>Failure</strong> modes<br />

include poor adhesion, cracking, EOS, <strong>and</strong> ESD due to their<br />

thin film nature.<br />

5.4 Substrates<br />

Substrates are used in microelectronics, particularly when<br />

manufacturing hybrids, to mount circuit elements onto <strong>and</strong> to<br />

make electrical interconnections. Substrates, typically formed<br />

out of a ceramic material, save space inside a package <strong>and</strong><br />

reduce its weight. Substrates can fail from several different<br />

mechanisms as discussed below.<br />

5.4.1 Cracking<br />

Cracking in a substrate can cause a failure if the substrate<br />

crack propagates through a metallization stripe. A crack in a<br />

substrate can also propagate through an attached component (a<br />

die, for example) causing the component to fail. Cracks in a<br />

substrate can be caused by a thermal coefficient of expansion<br />

mismatch between a substrate <strong>and</strong> a package header. They can<br />

also be introduced by mechanical damage.<br />
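The expansion mismatch mentioned above can be estimated to first order as the difference in expansion coefficients multiplied by the temperature excursion. The sketch below is a simple unconstrained estimate under that assumption; the function name, the coefficient values, and the temperature range are hypothetical placeholders, not measured data for any particular substrate or header material.

```python
def mismatch_strain(cte_a_ppm_per_c: float,
                    cte_b_ppm_per_c: float,
                    delta_t_c: float) -> float:
    """First-order thermal mismatch strain (dimensionless)."""
    delta_cte_per_c = abs(cte_a_ppm_per_c - cte_b_ppm_per_c) * 1e-6  # ppm -> 1/C
    return delta_cte_per_c * abs(delta_t_c)


if __name__ == "__main__":
    # Hypothetical coefficients (ppm per degree C) and temperature swing.
    strain = mismatch_strain(7.0, 5.0, 180.0)
    print(f"mismatch strain = {strain:.2e}")
```

A larger coefficient difference or a wider temperature swing raises the strain the joint between the substrate and the header must absorb.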

5.4.2 Metallization<br />

Metallization failures, which were shown to occur on<br />

semiconductor die, also occur on substrates. Lifting of the<br />

metallization from the substrate can occur, usually resulting<br />

from an improperly cleaned substrate prior to metallization<br />

application. Poor metallization coverage is also a failure<br />

mechanism. Leaching of gold metallization into solders can<br />

also occur if the proper barrier metals are not used.<br />

5.4.3 Multilayer Substrates<br />

Multilayer substrates (substrates with two or more levels<br />

of metallization) suffer from the same failure mechanisms as<br />

single layer substrates, with two additions. Incomplete via fills<br />

(a via is an internal connection between two metallization<br />

layers) occur during substrate fabrication <strong>and</strong> result in open<br />

circuits. Shorts between metallization layers also happen<br />

during substrate fabrication.<br />

5.5 Packages<br />

Packages physically protect circuit elements from the<br />

external environment. They also allow for electrical<br />

connection to other packages in an electrical system. The<br />

failure of a package to protect its internal components from<br />

the external environment can result in device failure. Package<br />

failures can be classified as hermeticity failures, insulation<br />

resistance failures, or failures caused by loose particles within<br />

the package.<br />

5.5.1 Hermeticity<br />

Microelectronics packages are either hermetic or<br />

nonhermetic. Hermetic packages effectively seal the internal<br />

components from the external atmosphere. Nonhermetic<br />

packages (plastic packages) allow outside air to penetrate the<br />

package.<br />

Moisture can lead to many forms of corrosion inside a<br />

package <strong>and</strong> is one of the most important contaminants to seal<br />

out. Hermetic seals require the use of some combination of<br />

metal, glass, <strong>and</strong>/or ceramic in the package seal. Devices that<br />

fail hermeticity are referred to as fine leakers or gross leakers.<br />

A fine leak is defined as a leak rate that is greater than 1 x 10^-7<br />

atm cc/sec (however, this rate does depend on the package<br />

volume). A gross leak is any leak rate greater than 1 x 10^-5<br />

atm cc/sec, usually detectable by looking for bubbles from a<br />

package while immersed in a hot fluorocarbon.<br />
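The two thresholds quoted above can be expressed as a simple classification. This is a minimal sketch that uses only the limits stated in the text; as noted, the fine leak limit in practice depends on package volume, so the constants and the function name here are illustrative assumptions rather than a test specification.

```python
FINE_LEAK_LIMIT = 1e-7   # atm cc/sec, from the text; actual limit varies with package volume
GROSS_LEAK_LIMIT = 1e-5  # atm cc/sec, from the text


def classify_leak(rate_atm_cc_per_sec: float) -> str:
    """Classify a measured leak rate as 'pass', 'fine leak', or 'gross leak'."""
    if rate_atm_cc_per_sec < 0:
        raise ValueError("leak rate cannot be negative")
    if rate_atm_cc_per_sec > GROSS_LEAK_LIMIT:
        return "gross leak"
    if rate_atm_cc_per_sec > FINE_LEAK_LIMIT:
        return "fine leak"
    return "pass"


if __name__ == "__main__":
    for rate in (5e-8, 5e-7, 1e-4):
        print(f"{rate:.0e} atm cc/sec -> {classify_leak(rate)}")
```

The gross leak check comes first so that a rate above both limits is reported as the more severe category.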

5.5.2 Insulation Resistance<br />

Insulation resistance between package pins <strong>and</strong> leads must<br />

be maintained for a device to function properly.<br />

Contamination on the exterior of a package can cause the<br />

insulation resistance to fail. Leaching of lead from the glass<br />

sealing material has historically caused insulation resistance<br />

failures.<br />

5.5.3 Loose <strong>Part</strong>icles<br />

Loose particles inside a package can cause a failure. This<br />

is especially true if the particles are conductive. A loose<br />

conductive particle in a package can cause a failure by<br />

creating a short between other conductors inside the package.<br />

<strong>Part</strong>icle Impact Noise Detection (PIND) testing is used to<br />

detect loose particles inside a package. Radiographic<br />

examination can then be used to verify the size <strong>and</strong> density of<br />

the particle before the device is delidded.<br />

6. FAILURE ANALYSIS CASE STUDY<br />

This section discusses the failure analysis performed on a<br />

hermetically packaged integrated circuit (multiplexer). The<br />

failure was caused by corrosion within the package. This<br />

example presents the types of problems that are encountered<br />

<strong>and</strong> how proper failure analysis can help implement effective<br />

corrective action.<br />

6.1 Mux <strong>Failure</strong> <strong>Analysis</strong><br />

A multiplexer Integrated Circuit (Mux IC) failure was<br />

first discovered during a system level electrical test. The<br />

microcircuit used a st<strong>and</strong>ard high reliability package design<br />

consisting of a ceramic housing with a hermetic seal. Prior to<br />

the failure, the multiplexer was exposed to multiple<br />

temperature performance <strong>and</strong> environmental screening tests at<br />

the component <strong>and</strong> board level assembly. It was not until<br />

integration at a higher level assembly that an anomaly arose.<br />

The initial troubleshooting quickly isolated the problem to the<br />

multiplexer. At the time, the anomalous behavior was seen<br />

only during electrical testing below 0°C. After careful<br />

assessment of the part as installed on the board, it was<br />

removed for further investigation. The part was photographed<br />

<strong>and</strong> leak checked as a normal course of action. It passed the<br />

fine <strong>and</strong> gross leak check. The part was then retested<br />

electrically at low temperature to demonstrate the issue was<br />

reproducible at the component level. The next step was<br />

real-time X-ray inspection, which showed a possible open<br />

circuit (refer to Figure 9).<br />

After exhibiting similar anomalous behavior, the part was<br />

delidded for internal inspection. The inspection revealed<br />

significant corrosion (refer to Figure 10).<br />

The corrosion primarily attacked the wire bond for Vcc<br />

which brings in external DC power. The interconnect was<br />

degraded to the point of being intermittent over temperature.<br />

Corrosion of this kind is typically caused by moisture trapped in<br />

packages prior to seal. Moisture can react with residual<br />

plating salts <strong>and</strong> cause significant corrosion at<br />

interconnecting joints, especially in the presence of an electric<br />

field.<br />

Figure 9. Real-time X-ray of Multiplexer IC Revealed Possible<br />

Corrosion<br />

Figure 10. Multiplexer IC After Lid Removal Revealing<br />

Corrosion on Pin 13<br />

Military-St<strong>and</strong>ard packages require Residual Gas<br />

<strong>Analysis</strong> Test (RGA) as a qualification for low moisture<br />

content (i.e.
