
A Programmable BIST Core for Embedded DRAM

CHIH-TSUN HUANG
JING-RENG HUANG
CHI-FENG WU
CHENG-WEN WU
TSIN-YUAN CHANG
National Tsing Hua University

The programmable BIST design presented here supports various test modes using a simple controller. With the March C− algorithm, the BIST circuit’s overhead is under 1.3% for a 1-Mbit DRAM and under 0.3% for a 16-Mbit DRAM.

With the advent of deep-submicron VLSI technology, ASIC vendors are turning toward single-chip systems that integrate cores from various sources. Memory is one of the most universal cores: almost all system chips contain some type of embedded memory. System designers have used embedded static RAMs (SRAMs) widely, since by merging memory with logic, they can increase data bandwidth and reduce hardware cost. Now, with pad-limited, multimillion-gate designs, embedded dynamic RAM (DRAM) is also becoming an attractive core. It provides high-capacity storage at a higher data rate than commodity DRAM, whose data rate is limited by the number of pins available. Embedded DRAM also reduces overall power consumption and hardware cost. Merging DRAM and logic promises to benefit the system-IC industry. In fact, the combination has begun to appear in various ASIC and microprocessor designs and advanced computer architectures.[1]

Of course, merging DRAM and logic poses challenges, such as ensuring process optimization and creating design and test methodologies that guarantee performance, quality, and reliability. Testing embedded DRAMs is more difficult than testing commodity DRAMs.[2] One test issue is accessibility. When the DRAM core is embedded in a chip and surrounded by logic blocks, accessing it from an external memory tester is costly. A system requires design for testability for core isolation and tester access, and this exacts a price in hardware overhead, performance penalties, and noise and parasitic effects. Even if this price is manageable, testers for full qualification and testing of embedded DRAMs are much more expensive due to the increased speed and I/O data width of embedded memories (compared with packaged commodity memories). If we also consider engineering change, the overall investment is even higher.

A promising solution to this dilemma is built-in self-test. Researchers have proposed many BIST schemes for embedded memories.[3-8] BIST minimizes the embedded DRAM’s tester requirement and greatly reduces memory tester time throughout the test flow. It also reduces total test time since parallel testing at the memory bank and chip levels is easier. BIST is also a good way to protect the intellectual property contained in the core. The embedded DRAM core provider need only deliver the BIST activation and response sequences for testing and diagnosis without disclosing the detailed design.

Table 1 shows a simplified embedded DRAM test flow, which has fewer test runs than the typical DRAM test flow. The table compares the required test support with and without BIST. The memory tester can perform functional test, dc/ac parametric test, and redundancy analysis (for laser repair). A typical BIST design supports only functional test, but partial support of parametric test, redundancy analysis, and even self-repair is possible with increased logic overhead. As the table indicates, BIST can include the entire pre-burn-in test. BIST also simplifies the implementation of a burn-in board (which generates burn-in test patterns). Final test requires a memory tester only for speed sorting, and BIST can handle functional test. Since memory testers are very expensive, the reduction of tester time justifies the use of BIST.

JANUARY–MARCH 1999 0740-7475/99/$10.00 © 1999 IEEE 59

Table 1. Embedded DRAM test runs and required test support with and without BIST.

| Test run | Isolation only | Isolation and BIST |
|---|---|---|
| Wafer probe | Tester | Tester/BIST |
| Pre-burn-in test | Tester | BIST |
| Burn-in | Burn-in board | BIST |
| Final test | Tester | Tester/BIST |

Here, we present a BIST design and implementation for embedded DRAM. It supports built-in self-diagnosis by feeding error information to the external tester. Moreover, using a specific test sequence, it can test for critical timing faults, reducing tester time for ac parametric test. The design supports wafer test, pre-burn-in test, burn-in, and final test. It is field-programmable; the user can program test algorithms using predetermined test elements (such as march elements, surround test elements, and refresh modes). The user can optimize the hardware for a specific embedded DRAM with a set of predetermined test elements. Our design is different from the microprogram-controlled BIST described in Dreibelbis et al.,[8] which has greater flexibility but higher overhead. Because our design begins at the register-transfer-language level, test element insertion (for higher test coverage) and deletion (for lower hardware overhead) are relatively easy.

Fault models

We consider faults that may occur in the DRAM core’s address decoder, read/write circuitry, and memory cell array. We categorize address decoder faults (AFs) according to their functional behavior:[9,10]

■ A certain address cannot access any cell.
■ A certain address accesses multiple cells simultaneously.
■ No address can access a certain cell.
■ Multiple addresses can access a certain cell.

Typical faults of the read/write circuitry (including buses, sense amplifiers, and write buffers) are equivalent to faults in the memory cell array. For memory cell array faults, we follow the notation used in van de Goor:[10]

■ ↑: a rising cell transition (due to a write operation)
■ ↓: a falling cell transition
■ ↕: either a rising or a falling cell transition
■ ∀: any operation at a cell
■ ⟨S/F⟩: a fault in a cell, where S is the value or operation activating the fault, F is the cell’s faulty value, S ∈ {0, 1, ↑, ↓, ↕}, and F ∈ {0, 1}
■ ⟨S_1, …, S_{m−1}; F⟩: a fault involving m cells, where S_1, …, S_{m−1} are the conditions of the first m−1 cells required to activate the fault in cell m, F is the faulty value of cell m, and for all 1 ≤ i ≤ m, S_i ∈ {0, 1, ↑, ↓, ↕}

The following are typical faults in the memory cell array:[9,10]

■ Stuck-at fault (SAF): a cell or line sticks at 1 or 0; ⟨∀/1⟩ denotes a stuck-at-1 fault and ⟨∀/0⟩ a stuck-at-0 fault.
■ Stuck-open fault (SOF): a cell is not accessible due to, for example, a broken line or a permanent open switch.
■ Transition fault (TF): a cell fails to make a transition; it can be ⟨↑/0⟩ or ⟨↓/1⟩.
■ Data retention fault (DRF): a cell fails to retain its logic value after a prespecified period of time.
■ Coupling faults:
  - Inversion (CFin): a transition in one cell inverts the content of another; that is, ⟨↑; ↕⟩ or ⟨↓; ↕⟩.
  - Idempotent (CFid): a transition in one cell forces a constant value (1 or 0) into another; that is, ⟨↑; 0⟩, ⟨↑; 1⟩, ⟨↓; 0⟩, or ⟨↓; 1⟩.
  - State (CFst): a coupled cell or line is forced to a certain value only if the coupling cell or line is in a given state; that is, ⟨0; 0⟩, ⟨0; 1⟩, ⟨1; 0⟩, or ⟨1; 1⟩.

The preceding single-cell fault models also apply to word-oriented memories. Coupling faults between cells in different words behave the same as in a bit-oriented memory. But coupling between cells inside the same word will virtually disappear if the write operation can override the coupling effect; that is, the write operation can correct the fault. In that case, the coupling fault can be detected only when its effect is stronger than the write operation. For example, say that a 4-bit word b3b2b1b0 has a CFst ⟨0; 0⟩, where b3 couples b2. Then writing 0101 to b3b2b1b0 results in a faulty value of 0001 when the CFst is stronger than the write operation; otherwise, the fault effect is masked.
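The intraword example above can be traced with a few lines of Python. This is only an illustrative sketch: the function name `write_word` and the flag `cfst_wins` are hypothetical, and the fault is modeled as "b3 holding 0 forces b2 to 0", which is the behavior the example describes.

```python
def write_word(bits, cfst_wins):
    """Write a 4-bit word given as the string 'b3b2b1b0' into a word that has a
    state coupling fault: while b3 holds 0, b2 is forced to 0.

    cfst_wins models whether the coupling effect is stronger than the write.
    """
    b3, b2, b1, b0 = (int(c) for c in bits)
    if cfst_wins and b3 == 0:
        b2 = 0  # the coupling effect overrides the value just written
    return f"{b3}{b2}{b1}{b0}"


stored_strong = write_word("0101", cfst_wins=True)   # fault visible
stored_weak = write_word("0101", cfst_wins=False)    # fault masked by the write
print(stored_strong, stored_weak)
```

With the coupling effect stronger than the write, the stored word differs from the written word in bit b2 only; with the write stronger, the word is stored correctly and a read cannot reveal the fault.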

Test algorithms<br />

Our BIST scheme’s default test algorithms are the march algorithms. Table 2 shows a bit-oriented March C− algorithm as an example.[10] M0, …, M5 denote the six march elements. In each march element, we first specify the address sequence: ⇑ means the address sequence is in ascending order, ⇓ means the address changes in descending order, and ⇕ means either ⇑ or ⇓ is acceptable. Consider M1, for example; the address sequence begins at the lowest address and changes in ascending order toward the highest. For each address (memory cell), the algorithm performs a read operation (with an expected 0 in the fault-free case), writes back the complemented bit immediately, and then continues to the next address. The algorithm is also called the March 10N algorithm because it requires 10N read/write operations, where N is the number of memory cells (address locations).

60 IEEE DESIGN & TEST OF COMPUTERS
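The march procedure described above can be sketched in Python as follows. This is a minimal software model under stated assumptions, not the BIST hardware: the `Memory` class, its `stuck_at` argument, and `run_march` are hypothetical names, the two "either direction" elements are run in ascending order, and only stuck-at faults are modeled.

```python
UP, DOWN = 1, -1

# (direction, operations): "r0" reads expecting 0, "w1" writes 1, and so on.
# The first and last elements accept either direction; ascending is used here.
MARCH_C_MINUS = [
    (UP, ["w0"]),
    (UP, ["r0", "w1"]),
    (UP, ["r1", "w0"]),
    (DOWN, ["r0", "w1"]),
    (DOWN, ["r1", "w0"]),
    (UP, ["r0"]),
]


class Memory:
    """Bit-oriented memory model with optional stuck-at cells (hypothetical)."""

    def __init__(self, size, stuck_at=None):
        self.cells = [0] * size
        self.stuck_at = dict(stuck_at or {})
        for addr, value in self.stuck_at.items():
            self.cells[addr] = value

    def write(self, addr, value):
        # A stuck-at cell ignores the written value.
        self.cells[addr] = self.stuck_at.get(addr, value)

    def read(self, addr):
        return self.cells[addr]


def run_march(memory, size, elements=MARCH_C_MINUS):
    """Apply the march elements; return (operation count, list of failures)."""
    ops = 0
    failures = []
    for direction, operations in elements:
        addresses = range(size) if direction == UP else range(size - 1, -1, -1)
        for addr in addresses:
            for op in operations:
                value = int(op[1])
                ops += 1
                if op[0] == "w":
                    memory.write(addr, value)
                elif memory.read(addr) != value:
                    failures.append((addr, op))
    return ops, failures


N = 16
ops_good, fails_good = run_march(Memory(N), N)
ops_bad, fails_bad = run_march(Memory(N, stuck_at={5: 1}), N)
print(ops_good, fails_good, fails_bad)
```

The operation count equals 10N for any N, which is where the March 10N name comes from. In the second run, every read that expects 0 at the stuck-at-1 cell fails.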

The March C− algorithm completely detects SAFs, unlinked AFs, unlinked TFs, and CFs (including CFins, CFids, and CFsts).[10] It also detects SOFs if we extend M1 to R0W1R1, or M2 to R1W0R0. The resulting algorithm is called the extended March C− algorithm. Since our embedded DRAM is word-oriented, we modify the 10N algorithm as ⇑(Wa); ⇑(RaWa′); ⇑(Ra′Wa); ⇓(RaWa′); ⇓(Ra′Wa); ⇑(Ra), where a represents a data word (the background word) and a′ is its complement. This word-oriented algorithm reduces to the bit-oriented algorithm when a is a single bit.

We select background words on the basis of the defined fault models and required fault coverage. Exhaustive data backgrounds are usually unaffordable and unnecessary. Although the word-oriented March C− algorithm detects all the SAFs, unlinked AFs, TFs, and SOFs, coupling faults in the same word may not be detectable. The choice of data backgrounds determines the coverage of this kind of fault.

Table 2. March C− algorithm (R: read; W: write).

| March element | M0 | M1 | M2 | M3 | M4 | M5 |
|---|---|---|---|---|---|---|
| Address sequence | ⇕(W0) | ⇑(R0W1) | ⇑(R1W0) | ⇓(R0W1) | ⇓(R1W0) | ⇕(R0) |

Table 3. Three algorithms’ fault coverage (%): MATS++ (a); March X (b); March C− (c).

(a) MATS++

| Fault | P1 | P2 | P3 | P2,3 | P1,2,3 | Pall |
|---|---|---|---|---|---|---|
| SAF | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| SOF | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| TF | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| AF | 99.7 | 99.9 | 99.9 | 100.0 | 100.0 | 100.0 |
| CFin | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| CFid | 37.5 | 37.5 | 37.5 | 62.6 | 75.9 | 89.1 |
| CFst | 50.0 | 50.0 | 50.0 | 75.0 | 87.5 | 100.0 |

(b) March X

| Fault | P1 | P2 | P3 | P2,3 | P1,2,3 | Pall |
|---|---|---|---|---|---|---|
| SAF | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| SOF | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 |
| TF | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| AF | 99.7 | 99.9 | 99.9 | 100.0 | 100.0 | 100.0 |
| CFin | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| CFid | 50.0 | 50.0 | 50.0 | 78.1 | 90.7 | 100.0 |
| CFst | 62.5 | 62.5 | 62.5 | 84.4 | 93.0 | 100.0 |

(c) March C−

| Fault | P1 | P2 | P3 | P2,3 | P1,2,3 | Pall |
|---|---|---|---|---|---|---|
| SAF | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| SOF | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 |
| TF | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| AF | 99.7 | 99.9 | 99.9 | 100.0 | 100.0 | 100.0 |
| CFin | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| CFid | 99.9 | 99.9 | 99.9 | 99.95 | 100.0 | 100.0 |
| CFst | 99.9 | 99.9 | 99.9 | 99.95 | 100.0 | 100.0 |
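To illustrate why the choice of data background matters for intraword coupling faults, the following Python sketch runs the word-oriented March C− sequence on a small memory with one injected fault. It is an assumption-laden toy, not the simulator used for Table 3: the function names, the 8-word memory, and the single fault (bit 3 holding 0 forces bit 2 to 0, with the coupling effect stronger than the write) are all chosen for illustration, and the sketch does not reproduce the coverage percentages in the table.

```python
WIDTH = 4
MASK = (1 << WIDTH) - 1


def apply_cfst(word):
    """Intraword state coupling fault: while bit 3 holds 0, bit 2 is forced to 0."""
    if not (word >> 3) & 1:
        word &= ~(1 << 2) & MASK
    return word


def word_march_detects(background, size=8, faulty_addr=3):
    """Run the word-oriented March C- with one background; True if a read mismatches."""
    a, a_bar = background, ~background & MASK
    mem = [0] * size

    def write(addr, value):
        # Only the faulty word distorts what is written into it.
        mem[addr] = apply_cfst(value) if addr == faulty_addr else value

    up, down = range(size), range(size - 1, -1, -1)
    # Each element: (address order, word expected on read or None, word written or None)
    elements = [
        (up, None, a), (up, a, a_bar), (up, a_bar, a),
        (down, a, a_bar), (down, a_bar, a), (up, a, None),
    ]
    detected = False
    for order, expect, value in elements:
        for addr in order:
            if expect is not None and mem[addr] != expect:
                detected = True
            if value is not None:
                write(addr, value)
    return detected


P1, P2, P3 = 0b0000, 0b0101, 0b0011
results = {name: word_march_detects(p) for name, p in [("P1", P1), ("P2", P2), ("P3", P3)]}
print(results)
```

For this particular fault, backgrounds whose bits 3 and 2 are equal (0000 and 0011, together with their complements) never ask the coupled bit to differ from the forced value, so only 0101 exposes it. Other intraword faults favor other backgrounds, which is why coverage depends on the set of backgrounds used.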

Fault coverage evaluation<br />

We know a march algorithm’s coverage of its target faults by definition. However, to know its coverage of other faults requires further analysis. For example, the March X algorithm was designed to test all AFs, SAFs, TFs, and CFins, so its coverage of these faults is 100%. But to determine its coverage of CFids and CFsts, we must perform analysis. Moreover, for a word-oriented memory such as we are discussing here, fault coverage also depends on the selected data backgrounds. Since there are so many possible faults and test algorithms (including address sequences, read/write operations, and data patterns/backgrounds), determining the algorithm that best balances cost and test coverage is difficult.

We group faults into two classes: single-cell faults and faults involving two cells (such as coupling faults). We can test for single-cell faults such as SAFs with an algorithm using any single data background because it tests all cells in the same way as for a bit-oriented memory. Two-cell faults, however, depend on the strength of the write operation and the coupling effect. Suppose the write operation erases the coupling effect between two cells in the same word. Such faults are redundant, and we must consider only coupling between two different words, so one background is sufficient. However, if the coupling effect is stronger than the write operation, we must consider coupling faults inside a word. This is the assumption in the following analysis.

We can derive fault coverage by manual analysis, but that is tedious and sometimes impractical for complex test algorithms and fault models. Instead, we implemented a memory fault simulator called RAMSES (RAM Simulator for Error Screening) for this purpose. For a word-oriented memory with 4-bit words, the data backgrounds (patterns) commonly used are 0000 (P1), 0101 (P2), and 0011 (P3). To make the list complete, we also consider 0110 (P4), 0001 (P5), 0010 (P6), 0100 (P7), and 1000 (P8). We simulated several test algorithms with RAMSES, assuming a 1-Kbyte word-oriented embedded DRAM with 4-bit words. Table 3 shows the fault coverage simulation results for the algorithms, in which P2,3 stands for {P2, P3}, P1,2,3 for {P1, P2, P3}, and Pall for {P1, P2, …, P8}. We show only the results for some data backgrounds, although we performed extensive simulations. We found that, in general, P2 provides the highest fault coverage among single backgrounds, and P2,3 is the best among double backgrounds. For triple backgrounds, P1,2,3 provides the highest fault coverage. Intuitively, uniformity is not desirable so far as testing is concerned.

[Figure 1. Embedded EDO DRAM connected to BIST circuitry. The memory BIST block connects to the 4-Mbit embedded EDO DRAM through an 18-bit address, xRAS, xCAS, xWE, and 16-bit D and Q channels. DRAM blocks shown: row address buffers (10 bits), column address buffers (8 bits), refresh controller, timing controller, row decoder (1024), column decoder (256), sense amplifiers, data-in and data-out registers, and 1-Mbit memory arrays.]

For DRAMs, we may have to consider additional faults, such as neighborhood-pattern-sensitive and linked faults. If such faults are to be targeted after failure analysis, we need simulation to select the best test algorithms for them.

The simulation results show that using multiple data backgrounds significantly increases the coverage of coupling faults for MATS++[10] and March X compared with using a single background. However, for March C−, the improvement is minor: with only background P1, the algorithm covers most of the faults. (The SOF fault coverage in Table 3c will reach 100% if M1 is extended to RaWa′Ra′.) Using an additional background doubles the test time but detects only a small percentage of additional faults (intraword coupling faults). Also, for larger DRAMs (with the same word length), the fault coverage of March C− does not decrease; rather, it increases since the percentage of undetected faults decreases.

DRAM specification and BIST design strategy

Since extended data-out (EDO) DRAMs are common, we use a 1-Mbit × 4 EDO DRAM as our example for explaining the proposed BIST design. Of course, one can easily apply the scheme to other embedded DRAM architectures. Our design assumes the embedded DRAM has four memory banks, each organized as a 1-Mbit array; thus, it has 256K addressable locations, each containing four bits. Figure 1 shows a block diagram of the embedded DRAM with the proposed BIST scheme (detailed later). The timing controller controls the address buffers, data I/O buffers, and refresh mechanism via signals xRAS, xCAS, and xWE, which represent row address strobe, column address strobe, and write enable. Embedded DRAMs normally use separate I/O channels instead of multiplexed pins as in commodity DRAMs, so row and column addresses and data input (D) and output (Q) channels are all separate.
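As a small worked example of the addressing described above, the sketch below splits an 18-bit address into separate row and column fields. The 10-bit row and 8-bit column widths follow the decoder sizes labeled in Figure 1 (1024 rows, 256 columns); placing the row in the upper bits is an assumption made here for illustration, and `split_address` is a hypothetical helper.

```python
ROW_BITS, COL_BITS = 10, 8  # 1024 rows x 256 columns, per the Figure 1 labels


def split_address(addr):
    """Split an 18-bit address into (row, column).

    Assumes the row occupies the upper bits; the source does not state the order.
    """
    if not 0 <= addr < 1 << (ROW_BITS + COL_BITS):
        raise ValueError("address out of range")
    return addr >> COL_BITS, addr & ((1 << COL_BITS) - 1)


locations = 1 << (ROW_BITS + COL_BITS)  # number of addressable locations
row, col = split_address(0x3FFFF)       # highest address
print(locations, hex(row), hex(col))
```

The location count is 2^18 = 262,144, which matches the 256K addressable locations quoted in the text.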

One of the challenges of memory BIST is that the asynchronous memory core (traditional RAMs are asynchronous) must be tested by the synchronous BIST logic. This is especially difficult in embedded DRAMs. To illustrate how we cope with this problem, we use typical EDO DRAM timing specifications. The proposed strategy is not limited to the given EDO DRAM architecture. Figure 2 shows the typical EDO page-mode read-write cycle. Although in this case D and Q share the same I/O channel, our strategy still works (timing control of separate I/O channels is in fact easier). Table 4 lists the values of the timing parameters shown in Figure 2. The EDO page mode’s timing depends mainly on the edges of the four signals xRAS, xCAS, xWE, and xOE. They determine the time to latch the row address, column address, and input data for the memory core, as well as the output data for use by other cores.

For embedded DRAM, which has no pin count limitation, D and Q can be separate to simplify control; therefore, output enable signal xOE can be removed without affecting functionality. The BIST logic, however, still needs xOE to indicate the arrival of the output data.

We must determine an appropriate BIST clock period based on the xCAS cycle time (period) of the EDO page mode, tCAS in Table 4. In our example, the minimum tCAS is 10 ns, which we use as the basis of the test clock period. We can select a test clock of up to 100 MHz.

Table 4. Timing parameter values of EDO page-mode read-write cycle.

| Parameter | Min (ns) | Max (ns) | Description |
|---|---|---|---|
| tAA | | 25 | Access time from column address |
| tASC | 0 | | Column address setup time |
| tASR | 0 | | Row address setup time |
| tAWD | 42 | | Column-address-to-xWE delay |
| tCAC | | 13 | Access time from xCAS |
| tCAH | 10 | | Column address hold time |
| tCAS | 10 | 10,000 | xCAS active pulsewidth |
| tCP | 10 | | xCAS precharge pulsewidth |
| tCWD | 28 | | xCAS-to-xWE delay |
| tDH | 10 | | D hold time |
| tDS | 0 | | D setup time |
| tOD | 0 | 12 | Output disable |
| tOEA | | 12 | Access time from xOE |
| tRAC | | 50 | Access time from xRAS |
| tRASP | 55 | 125,000 | xRAS (EDO page-mode) pulsewidth |
| tRCD | 12 | | xRAS-to-xCAS delay |
| tRAH | 10 | | Row address hold time |
| tRP | 30 | | xRAS precharge pulsewidth |
| tWP | 5 | | Write pulsewidth |

[Figure 2. Timing diagram of EDO page-mode read-write cycle. Signals shown: xRAS, xCAS, Addr (a row address followed by successive column addresses), xWE, DQ (alternating Q and D), and xOE, annotated with the timing parameters of Table 4.]

Instead of a memory tester, we need only a simple logic tester, which is slower and less expensive, to activate the BIST logic and receive the test result. The BIST sequencer (a timing sequence generator) generates timing signals based on the clock period; that is, it converts a long or short timing signal duration to a certain number of clock periods. Once the signal is converted, it is fixed in the BIST design; therefore, we must determine the clock period with care to avoid violation of the embedded DRAM’s timing specifications. In our example, we assume the clock period (and the xCAS period) to be 20 ns (though it can be reduced almost to 10 ns). Once the clock period is fixed, we determine the other two related timing parameters, xRAS and xWE, accordingly. Also, we shift and stretch the address, D, and Q signals in the original timing diagram (Figure 2) according to the following rules. Note that Table 4 specifies tRAC, for example, as no more than 50 ns. This means the embedded DRAM design guarantees that Q is available 50 ns after the falling transition of xRAS (see Figure 2); thus, the sampling of Q should take place at least 50 ns after xRAS.

■ The row address must be ready before xRAS goes low, and the column address must be ready before xCAS goes low. The time the address is stable before the address strobe is usually more than one clock cycle, meeting our design’s 0-ns setup time requirement. Also, the address remains stable for more than one clock cycle, meeting the 10-ns hold time requirement.
■ The timing requirement for input data D is the same as for the column address.
■ The major parameters related to output data Q are tAA, tRAC, and tCAC, which are 25 ns (max), 50 ns (max), and 13 ns (max). The xCAS low period (tCAS) should span at least two clock cycles, since Q will settle at the beginning of the second cycle, and the clock cycle (20 ns) is longer than 13 ns. Since the column address is ready one clock cycle before the transition of xCAS (see the first rule), we let the time from xCAS to Q be two clock cycles to satisfy the tRAC constraint. Finally, the first falling transition of xCAS in page mode is delayed for one more clock cycle, so there are at least three clock cycles from xRAS to Q, satisfying the tRAC specification.
■ For the write operation (as in the page-mode read-write cycle), the key parameters are tAWD and tCWD. Because Q is sampled at the second clock after xCAS goes low, one more clock cycle must be inserted into the low period of xCAS, making it at least three cycles.

[Figure 3. A timing diagram generated by the sequencer. Signals shown: BCK, xRAS, xCAS, ADDR (3ff, then 0ff, 0fe, 0fd), xWE, Data in (5555 for each access), Q out (aaaa for each access), and xOE.]
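The conversion from a timing parameter to a whole number of clock periods, which the rules above rely on, can be sketched as a ceiling division. This is a simplified illustration under stated assumptions: `cycles_for` is a hypothetical helper, only the three access-time maxima quoted in the rules are checked, and real sequencer design also has to honor the setup, hold, and delay minima in Table 4.

```python
import math

CLOCK_NS = 20  # BIST clock period assumed in the text's example


def cycles_for(delay_ns, clock_ns=CLOCK_NS):
    """Smallest whole number of clock periods that covers delay_ns (at least one)."""
    return max(1, math.ceil(delay_ns / clock_ns))


# Maximum access times quoted in the rules (values from Table 4, in ns).
max_access_ns = {"tCAC": 13, "tAA": 25, "tRAC": 50}
cycles = {name: cycles_for(ns) for name, ns in max_access_ns.items()}

# A minimum tCAS of 10 ns bounds the test clock frequency (1000 / period in ns = MHz).
max_clock_mhz = 1000 / 10
print(cycles, max_clock_mhz)
```

With a 20-ns clock, tRAC needs three clock periods, which agrees with the rule that at least three clock cycles separate the falling edge of xRAS from the sampling of Q. The 10-ns minimum tCAS corresponds to the 100-MHz upper bound on the test clock.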

Following these rules, the sequencer can generate waveforms of the critical timing parameters to meet the specification. Slight adjustments may be necessary for other timing parameters. Figure 3 shows the waveform diagram of RaWa′ generated by the sequencer according to the rules and plotted by a timing simulator. The sequencer can also generate waveforms for other march elements and retention and refresh test elements according to similar rules.

BIST architecture and function

Figure 4 diagrams our BIST design and the interface between the BIST logic and the embedded DRAM. The BIST activation control (BAC) input activates the BIST logic; the embedded DRAM is in normal mode when BAC is 0 and in BIST mode when BAC is 1. The BIST controller is a finite-state machine; its state transition is controlled by the BIST control selection (BCS) input. The BIST controller also controls the scan chains, shifting in test patterns and commands from the BIST scan-in (BSI) input and shifting out results from the BIST scan-out (BSO) output. As Figure 4 shows, the controller contains multiple chains. The decode logic and test mode selection modules determine the proper data register to scan in the test commands and subsequently activate the sequencer. The sequencer generates the DRAM’s timing sequence, with the help of some built-in counters and the timing generator. The comparator compares and reports any discrepancy between the output data from the DRAM and the original input data generated by the sequence controller.
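The comparator’s role can be illustrated with a short software model. This is a hedged sketch, not the hardware described in the paper: the class `Comparator`, its `go` flag, and its `fail_log` are hypothetical names, and logging the failing address and bit positions is one plausible way to hold the diagnosis information that is later shifted out.

```python
def compare_words(expected, observed, width=16):
    """Return the bit positions where observed differs from expected."""
    diff = (expected ^ observed) & ((1 << width) - 1)
    return [bit for bit in range(width) if (diff >> bit) & 1]


class Comparator:
    """Hypothetical pass/fail comparator that also logs failing addresses."""

    def __init__(self):
        self.go = True       # stays True while every comparison matches
        self.fail_log = []   # (address, failing bit positions) for diagnosis

    def check(self, addr, expected, observed):
        bad_bits = compare_words(expected, observed)
        if bad_bits:
            self.go = False
            self.fail_log.append((addr, bad_bits))


cmp_unit = Comparator()
cmp_unit.check(0x0FF, 0xAAAA, 0xAAAA)  # matching read
cmp_unit.check(0x0FE, 0xAAAA, 0xAABA)  # bit 4 reads 1 instead of 0
print(cmp_unit.go, cmp_unit.fail_log)
```

An XOR of the expected and observed words isolates the failing bits, so a single pass/fail flag and a short log are enough to report both the verdict and the location of the fault.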

The BIST logic has three additional I/O signals. The BIST ready flag (BRD*) indicates when the BIST sequence is finished, so that the go/no-go indicator signal (BGO) can be sampled to check that the embedded DRAM is functioning correctly. The BRS*/SCAN signal acts as both the reset and scan test control. All registers in the BIST controller finite-state machine are scanned, and before we use the BIST logic to test the DRAM, the logic itself is scan tested. Finally, we need a BIST clock (BCK) input.

BCK and BAC must be dedicated; other signals cannot share these two input pins (for example, by using multiplexers). But BRD* is optional and may be removed if pin count is a concern. In that case, we can encode BGO to signal the completion of the BIST sequence and show the test result. The reset (BRS*) also is optional, since a short synchronizing sequence for the BIST controller can be the reset sequence. However, the SCAN pin is still required in that case. Apart from BCK and BAC, all other BIST I/O signals can share pins with signals outside the DRAM core; thus, we can use multiplexed pins to reduce pin overhead.

[Figure 4. Block diagram of the proposed BIST design connected to the embedded EDO DRAM. Blocks shown in the memory BIST: BIST controller, controller, decode logic, test mode selection, BIST scan path (burn-in commands, march commands/data, diagnosis information), sequencer, sequence controller, control counter, timing generator, row address counter, column address counter, data background composer, and comparator. External signals: BCK, BCS, BAC, BRS*/SCAN, BSI, BSO, BGO, and BRD*. The DRAM interface buffers carry the 18-bit address, xRAS, xCAS, xWE, and the 16-bit D and Q channels to the 4-Mbit embedded EDO DRAM.]

The proposed BIST supports the following test modes:

■ Scan test: testing the BIST logic, except the BIST controller finite-state machine. We execute scan test at the beginning of the BIST sequence to ensure the circuit’s correct functionality. In addition, we test all registers in the DRAM core in this mode.
■ Memory BIST: functional testing of the DRAM using march algorithms. This mode exercises various operation modes, such as non-page-mode test, page-mode test, refresh test, and retention test. This mode also supports diagnosis. In that case, the BIST logic can shift out the address of any faulty cell, column, or row to the external tester via the scan mechanism. We can test for retention faults in this mode or in a separate test mode.
■ Burn-in: stress testing to screen out unreliable parts that may fail in infancy. This mode uses the BIST logic to exercise the entire memory cell array in a more efficient

JANUARY–MARCH 1999 65


.<br />

<strong>BIST</strong> CORE<br />

■<br />

0<br />

0<br />

0<br />

BCS = 0<br />

Initial<br />

1<br />

Test_mode_in<br />

0<br />

Decode<br />

1<br />

Data_in_out<br />

0<br />

Apply<br />

1<br />

Execute<br />

Exit<br />

0<br />

1<br />

Probe/pause<br />

Figure 5. <strong>BIST</strong> controller state diagram.<br />

0<br />

1<br />

1<br />

1<br />

1<br />

Initial/reset state: all <strong>BIST</strong> outputs<br />

retain safe values<br />

Test mode selection<br />

Command decoding<br />

Data scan: shift in test inputs and<br />

shift out results<br />

Scan test application and<br />

<strong>BIST</strong> activation<br />

Memory function test,<br />

BI, AC test, etc.<br />

Pause <strong>for</strong> observation, or exit the<br />

execution phase<br />

Shift out results,<br />

or pause <strong>for</strong> retention test<br />

method than the normal read/write access. The default<br />

burn-in test is to use a march algorithm supported in the<br />

memory <strong>BIST</strong> mode.<br />

Timing-fault test—testing <strong>for</strong> critical timing faults by running<br />

the <strong>BIST</strong> clock at an appropriate speed. Among<br />

these faults are incorrect setup time, hold time, and data<br />

arrival time of various control and data signals. We can<br />

simultaneously detect some timing faults, such as incorrect<br />

setup time and hold time, when we per<strong>for</strong>m<br />

functional test (in memory <strong>BIST</strong> mode). We can test <strong>for</strong><br />

others by using different <strong>BIST</strong> clock periods or an external<br />

memory tester.<br />

Our design can include other test modes if necessary,<br />

since the control scheme is flexible. Of course, <strong>for</strong> dc parameter<br />

test, we must still use an external tester’s parameter<br />

measurement unit.<br />

BIST implementation

As shown in Figure 4, the BIST logic consists of two parts: a controller and a sequencer. The controller takes charge of the overall BIST flow, and the sequencer generates the embedded DRAM’s address, data, and timing sequences. At the ASIC level, logic BIST and memory BIST can share the same controller, and the on-chip processor can function as the sequencer during memory BIST mode. However, for a DRAM core delivered as intellectual property to be embedded in various chips, we must integrate a complete BIST circuit with the DRAM core. We consider the latter case.

Controller. After the scan test mode has finished successfully, we enter the memory BIST mode. The BIST controller finite-state machine controls the scan test and BIST flow to test the rest of the BIST circuitry and the embedded DRAM. Figure 5 shows the state diagram of the proposed finite-state machine. Each arc in the figure represents a state transition controlled by BCS. We enter the initial state by asserting BRS*/SCAN low or by applying a synchronizing sequence. By applying four consecutive 0’s on BCS, we can return to the initial state from any other state.

From the initial state, we enter the test_mode_in state if BCS is 1. In this state, we select the intended test mode. The decode state generates all internal control signals, including those that select the proper scan chain for the data sequence to be shifted in. User-specified parameters and the test algorithm are shifted in during the data_in_out state. Note that the decode, data_in_out, and apply states form a loop for running the scan test. We perform other test modes in the execute state. For memory core testing and diagnosis, we enter the bottom loop, which contains the execute, exit, and probe/pause states, and collect the error information in the probe/pause state. We can also run the retention test in the probe/pause state, which allows pausing for a user-determined interval.
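The controller behavior described above can be sketched as a small transition table. This is a minimal Python sketch, not the authors’ implementation: the state names come from Figure 5, but every arc in `NEXT` other than Initial to Test_mode_in on BCS = 1 is an assumption, chosen only so that the properties stated in the text hold (the decode/data_in_out/apply loop, the execute/exit/probe-pause loop, the return from exit to decode, and the reset by four consecutive 0’s).

```python
from enum import Enum, auto


class State(Enum):
    INITIAL = auto()
    TEST_MODE_IN = auto()
    DECODE = auto()
    DATA_IN_OUT = auto()
    APPLY = auto()
    EXECUTE = auto()
    EXIT = auto()
    PROBE_PAUSE = auto()


S = State

# ASSUMED arcs: (next state if BCS == 0, next state if BCS == 1).
# Only INITIAL -> TEST_MODE_IN on BCS = 1 is stated explicitly in the text.
NEXT = {
    S.INITIAL:      (S.INITIAL, S.TEST_MODE_IN),
    S.TEST_MODE_IN: (S.DECODE,  S.TEST_MODE_IN),
    S.DECODE:       (S.INITIAL, S.DATA_IN_OUT),
    S.DATA_IN_OUT:  (S.APPLY,   S.DATA_IN_OUT),
    S.APPLY:        (S.DECODE,  S.EXECUTE),
    S.EXECUTE:      (S.EXIT,    S.EXECUTE),
    S.EXIT:         (S.DECODE,  S.PROBE_PAUSE),
    S.PROBE_PAUSE:  (S.EXECUTE, S.PROBE_PAUSE),
}


def step(state, bcs):
    """Advance one BCK cycle with the given BCS value (0 or 1)."""
    return NEXT[state][bcs]


def run(state, bcs_bits):
    """Apply a sequence of BCS values and return the final state."""
    for bit in bcs_bits:
        state = step(state, bit)
    return state


def resets_from_everywhere(zeros=4):
    """Check that `zeros` consecutive 0's on BCS reach INITIAL from every state."""
    return all(run(s, [0] * zeros) is S.INITIAL for s in State)
```

With these assumed arcs, `run(State.INITIAL, [1, 0, 1, 0, 0])` walks once around the scan-test loop and ends in the decode state, and `resets_from_everywhere()` confirms by exhaustive check that four 0’s on BCS reach the initial state from all eight states, whereas three are not enough from probe/pause.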

An alternative approach is to add an extra mode in the sequencer, using a counter to measure the time interval from, for example, xCAS to xWE. We can derive appropriate timing sequences using rules similar to those for march tests. When diagnosis is required, the sequencer tests the entire memory core; in other words, the process does not stop immediately when an error is detected. If we perform testing but not diagnosis, it is not necessary to continue the testing process once an error is found. The sequencer simply halts and indicates that an error has been found, and the controller can go back to the decode state through the exit state. From there, we can either reach the initial state or reenter the data_in_out state. The apply, execute, exit, and probe/pause states can be merged if diagnosis is not required.

Figure 6 shows the BIST circuit’s timing diagram (the entire control sequence). As discussed earlier, when BAC is 1, the DRAM enters the memory BIST mode, in which every signal is synchronized to the BIST clock BCK. As depicted in the figure, BRS*/SCAN is pulled high at the beginning of the memory BIST mode to perform the scan test that verifies the BIST controller’s correctness. A scan chain forms between BSI and BSO for applying patterns and collecting responses in this phase. After the scan, BRS*/SCAN is pulled low to reset the BIST controller (BCS remains low to generate the reset sequence if necessary). The BIST controller then performs a scan test for the rest of the BIST circuit. The test algorithm is subsequently applied according to the control flow discussed earlier and the finite-state machine shown in Figure 5. Finally, after BRD* is asserted high and BGO is sampled, we let BAC equal 0 to return the DRAM to normal mode.

Figure 6. BIST circuit control sequence. [Waveforms: BCK; BAC (normal mode, then BIST mode); BRS*/SCAN (controller test); BCS (scan test control); BRD*; BGO; BSI (test patterns); BSO (test outputs).]

In the controller, we implemented several default read/write commands, address orders, data backgrounds, and EDO DRAM access modes. The built-in read/write commands are Ra (read the expected word a), Wa (write word a), RaWa′ (read word a, complement it, and then write it back immediately), and RaWa′Ra′ (the same, followed by a read of the complemented word). The default address orders include ⇑ and ⇓, implemented by an up-down counter. The built-in access modes to be used in conjunction with the address orders are row scan, column scan, page-mode column scan, and refresh. The data background word (a) is supplied online. Each march element is a combination of the appropriate read/write command, address order, access mode, and data background.

In addition to the march commands, our BIST also supports diagnosis, burn-in, and retention test. Other test commands can be integrated easily. In our scheme, a test algorithm is a sequence of commands entered from the BSI pin into the scan chains, decoded, and executed (see Figure 4). The controller detects the end of a test algorithm when it encounters a special end-of-algorithm command. In our default implementation, most march algorithms can be programmed, including the extended March C−, March X, March Y, MATS++, and others.
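To make the idea of a programmable march algorithm concrete, the following Python sketch encodes the standard March C− test from the literature as a list of march elements built from the Wa, Ra, and RaWa′ commands, and runs it against a simple word-oriented memory model. It is an illustration under stated assumptions, not the article’s command format: the tuple encoding, the `Memory` class, the single stuck-at fault model, and the `stop_on_error` flag are all hypothetical, and the article’s extended March C− may contain additional operations.

```python
UP, DOWN = "up", "down"

# Each element: (address order, command, background inverted?).
# Standard March C-: w0; up(r0,w1); up(r1,w0); down(r0,w1); down(r1,w0); r0.
MARCH_C_MINUS = [
    (UP,   "Wa",    False),
    (UP,   "RaWa'", False),
    (UP,   "RaWa'", True),
    (DOWN, "RaWa'", False),
    (DOWN, "RaWa'", True),
    (UP,   "Ra",    False),
]


class Memory:
    """Word-oriented memory with an optional (address, bit, value) stuck-at fault."""

    def __init__(self, words, width=16, stuck=None):
        self.cells = [0] * words
        self.mask = (1 << width) - 1
        self.stuck = stuck

    def write(self, addr, data):
        self.cells[addr] = data & self.mask

    def read(self, addr):
        data = self.cells[addr]
        if self.stuck and self.stuck[0] == addr:
            _, bit, value = self.stuck
            data = data | (1 << bit) if value else data & ~(1 << bit)
        return data


def run_march(mem, words, algorithm, a=0x0000, stop_on_error=False):
    """Return (addresses of failing reads, number of memory operations)."""
    failures, ops = [], 0
    for order, command, inverted in algorithm:
        expected = (a ^ mem.mask) if inverted else a
        addresses = range(words) if order == UP else range(words - 1, -1, -1)
        for addr in addresses:
            if command == "Wa":
                mem.write(addr, expected)
                ops += 1
                continue
            ops += 1  # Ra and RaWa' both begin with a read of the expected word
            if mem.read(addr) != expected:
                failures.append(addr)
                if stop_on_error:  # go/no-go testing: halt at the first error
                    return failures, ops
            if command == "RaWa'":
                mem.write(addr, expected ^ mem.mask)
                ops += 1
    return failures, ops
```

On a fault-free 64-word memory the run reports no failures after 640 operations, that is, 10 operations per word. With a stuck-at-1 bit injected at address 5, the diagnosis-style run still covers the whole array and reports address 5, while `stop_on_error=True` halts at the first failing read, mirroring the distinction the text draws between diagnosis and plain testing.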

Sequencer. In designing the sequencer, our major goal was flexibility. Our sequencer design can be used for a wide range of DRAM cores with various operation modes, memory dimensions, and timing specifications. Figure 7 shows the state diagram of the sequence controller finite-state machine (see Figure 4). As the state diagram shows, we implemented timing sequence generation modules for the single read/write commands and the page-mode read/write commands for the march elements defined in the controller. We also implemented a refresh timing generation module for refresh tests. The figure shows the sequence controller’s default implementation, which is designed for march tests, but we can easily extend it to other algorithms.

[Figure 6 phase annotations: reset sequence; scan test; scan in; scan out; commands/data; BIST control sequence; Go/No-Go; observe.]

An important concern is that the sequencer’s outputs be glitch-free and that they be in high impedance when BIST is not in use (that is, in normal operation mode). Our implementation takes these requirements into consideration. In Figure 7, state transitions occur on BCK’s rising edge, while the control (timing) signals for the DRAM core are applied on BCK’s falling edge, half a clock period later, by which time the state-dependent logic has settled. Consequently, the sequencer’s outputs are guaranteed glitch-free.

When the embedded DRAM is in normal mode, the sequence controller is in the idle state, where it stays until the BIST controller enters the execute state. Then the sequence controller fetches the march commands, enters the reset state, and carries out the sequence for the specified memory access mode. For memory access modes such as Ra, Wa, RaWa′, and RaWa′Ra′, the timing waveform is periodic, and the period depends on the row access cycle. We inserted appropriate refresh cycles for refresh timing. The page-mode access cycle consists of the row access and column access cycles. The DRAM core latches the row address first; then it latches the column addresses of the whole page one by one. In the self-refresh/hidden-refresh/RAS-only-refresh state, the embedded DRAM is tested for its refresh mechanism.
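The difference between the single-access and page-mode cycles can be illustrated with a rough cycle count. This Python sketch uses a deliberately simple cost model, and all numbers in it are hypothetical: the 512 x 512 array geometry and the per-access cycle costs are assumptions for illustration, not figures from the article.

```python
def non_page_mode_cycles(rows, cols, row_cycles, col_cycles):
    """Every access pays for one row access and one column access."""
    return rows * cols * (row_cycles + col_cycles)


def page_mode_cycles(rows, cols, row_cycles, col_cycles):
    """One row access per page, then one column access per word in that page."""
    return rows * (row_cycles + cols * col_cycles)


# Hypothetical geometry and costs: 512 x 512 words,
# 5 BCK cycles per row access and 2 per column access.
ROWS, COLS, ROW_CYCLES, COL_CYCLES = 512, 512, 5, 2
```

Under these assumed values, one pass over the array costs 1,835,008 cycles with single accesses and 526,848 cycles in page mode, because the row access is paid once per page rather than once per word.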

As shown in Figure 4, the sequencer design is based on counters. If the memory size increases, only the lengths of the row address and column address counters and the size of the comparator increase. Only one additional bit is required for an address counter when memory size doubles, so hardware overhead is low. The control counter, designed to meet the refresh time specification, is used for retention/refresh test. Refresh time specifications for the various DRAMs currently in use do not differ much regardless of size, so the sequencer’s relative area overhead actually drops as the DRAM core’s size increases. In our example, a 21-bit counter suffices if the refresh cycle does not exceed 32 ms. The size of the entire BIST logic for the EDO DRAM core, without burn-in and redundancy analysis, is about 2,000 to 3,000 gates.

Figure 7. State diagram of the sequence controller for march tests. [States: Idle; Reset; Refresh; Ra; Wa; RaWa′; RaWa′Ra′; page-mode row followed by page-mode column for each of Ra, Wa, RaWa′, and RaWa′Ra′; self-refresh/hidden refresh/RAS-only refresh; Done.]

Table 5. Comparison of embedded-DRAM test application methodologies.

Test method        Test time   Hardware cost   Coverage
Ours               Short       Low             Functional, timing, burn-in
Processor-based    Short       Low             Functional
Scan               Long        Low             Functional
Tester             Short       Very high       Functional, timing, dc
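The counter widths quoted above can be checked with a few lines of arithmetic. The Python sketch below is only a worked example: the helper names are hypothetical, the 16-bit word width and 18-bit address are read from Figure 4, and clocking the control counter at 50 MHz is an assumption borrowed from the clock rate the article quotes for its test-time estimate.

```python
def counter_bits(count):
    """Smallest counter width, in bits, that can hold `count` distinct values."""
    return max(1, (count - 1).bit_length())


def address_bits(capacity_bits, word_width):
    """Address counter width for a memory of the given capacity and word width."""
    return counter_bits(capacity_bits // word_width)


def control_counter_bits(refresh_us, clock_hz):
    """Counter width needed to time a refresh interval given in microseconds."""
    cycles = refresh_us * clock_hz // 1_000_000
    return counter_bits(cycles)


MBIT = 2 ** 20
```

For a 4-Mbit core with 16-bit words, `address_bits(4 * MBIT, 16)` gives 18, and doubling the capacity to 8 Mbits gives 19, which is the one-extra-bit-per-doubling behavior described in the text. A 32-ms interval at the assumed 50 MHz is 1,600,000 cycles, so `control_counter_bits(32_000, 50_000_000)` gives 21, in line with the 21-bit counter of the example.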

Following a command decoded by the controller, the sequencer generates the signals that the DRAM core requires, using a small look-up table (LUT). The LUT-based design reduces design effort and hardware cost by allowing new test commands to be added easily. When the timing specifications change, a simple program generates the LUT content automatically. The LUT-based design is an important step toward a BIST compiler for embedded DRAMs. It is configurable at the register-transfer-language level without modifying the architecture. For non-march algorithms such as pseudorandom and surround tests, one must design specific address counters or counter configurations for the sequencer and add new commands to the state diagram.
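A program that regenerates LUT content from timing specifications might look like the following Python sketch. It is an assumption-laden illustration rather than the authors’ generator: the parameter names (`ras_to_cas_ns`, `cas_low_ns`, `precharge_ns`), their values, and the three-phase shape of the read cycle are all hypothetical, and only the signal names xRAS, xCAS, and xWE are taken from Figure 4. Signals are treated as active low.

```python
def to_cycles(ns, period_ns):
    """Round a duration up to a whole number of BCK cycles (at least one)."""
    return max(1, -(-ns // period_ns))


def build_read_lut(spec, period_ns):
    """Return one (xRAS, xCAS, xWE) row per BCK cycle for a single read access."""
    rcd = to_cycles(spec["ras_to_cas_ns"], period_ns)
    cas = to_cycles(spec["cas_low_ns"], period_ns)
    pre = to_cycles(spec["precharge_ns"], period_ns)
    rows = []
    rows += [(0, 1, 1)] * rcd  # row address phase: xRAS low, xCAS still high
    rows += [(0, 0, 1)] * cas  # column address phase: xCAS low, xWE high (read)
    rows += [(1, 1, 1)] * pre  # precharge: all control signals inactive
    return rows


# Hypothetical timing specification in nanoseconds (not from the article).
SPEC = {"ras_to_cas_ns": 30, "cas_low_ns": 20, "precharge_ns": 40}
```

With these made-up values, a 20-ns BCK period yields five LUT rows and a 10-ns period yields nine, so a change in the timing specification or the clock only changes the table content, not the logic that steps through it.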

Discussion

We used a commercial synthesis tool and a single-poly, triple-metal logic cell library to estimate our BIST circuit’s area. Figure 8 plots BIST area overhead with respect to various DRAM core sizes. The DRAM areas are based on areas of existing 0.25-µm and 0.35-µm EDO DRAM chips reported by major vendors. Comparisons are based on the areas estimated by the synthesis tool, so they are not exact. Also, it is impossible (and unnecessary) for us to project the precise size of our BIST circuit on all these DRAM chips. Since the overhead is very low, we expect the exact area overhead numbers to be close to those shown in the figure. Thus, the BIST area overhead for our default BIST design is about 1.3% for a 1-Mbit embedded DRAM and negligible for a 64-Mbit version. Even for the 16-Mbit DRAM, currently the most popular embedded DRAM candidate, the BIST area overhead is less than 0.3%. Clearly, the larger the DRAM core, the smaller the BIST area overhead.

Since the area overhead is low, one can include more test modes and algorithms to increase coverage, as long as test time does not become a problem. The test time for non-page-mode March C−, for example, is about 0.4 seconds for the 4-Mbit core (assuming a 50-MHz clock). It increases approximately in proportion to the address space. To reduce test time, one can explore parallel testing of multiple banks or even multiple words by separate BIST sequencers, but that requires very careful modification of the memory core. Note that after dicing, an external memory tester is not needed until after burn-in, which is an important BIST benefit.

Figure 8. Area overhead of the BIST core. [Plot of memory area (left axis, labeled µm², 0 to 140) and BIST area overhead (right axis, 0 to 1.4%) against memory sizes of 1, 4, 16, and 64 Mbits.]

Table 5 qualitatively summarizes some embedded DRAM test application methods, including our BIST implementation, processor-based BIST, scan-based serial testing (see reference 4), and direct memory tester access. Our method is very suitable for embedded DRAM, especially when the BIST circuitry is to be integrated with the DRAM core to form a single piece of intellectual property for use in various ASICs. Moreover, by properly configuring the controller and sequencer, we can support timing-fault testing and burn-in. Processor-based BIST is also popular in an ASIC environment, where an existing processor is available to the DRAM BIST designer. But it is not suitable for an embedded DRAM designed as a stand-alone, reusable intellectual property core. Serial BIST is not popular because it is slow. Also, the second and third methods normally support only functional test. The external memory tester is the most powerful but most expensive method. It supports all kinds of test (except burn-in) with very high resolution and programmability, at a very high cost.

WE HAVE PROPOSED a flexible and cost-effective BIST design for embedded DRAMs. It supports march-based tests, diagnosis, and timing specification tests. Our approach is flexible because additional test commands (other than march elements) can be included with little effort. It is cost-effective because test time is short, hardware overhead is low, and test coverage is high. It can also support burn-in if the DRAM core design is modified for that purpose. Together with the RAMSES memory fault simulator, the proposed BIST approach makes designing and implementing appropriate BIST circuits for various embedded DRAMs systematic and easy.

This BIST design has been implemented in an industry project in which we plan to evaluate its effectiveness in the future. Meanwhile, under demand from industry, we are working on an extended version that will incorporate built-in redundancy analysis and support more fault models.

Acknowledgments

Global UniChip Corporation (GUC), Hsinchu, Taiwan, under contract NTHU-0987-113J6, partly supported this work.

References

1. D. Patterson et al., “A Case for Intelligent RAM,” IEEE Micro, Vol. 17, No. 2, Mar.-Apr. 1997, pp. 34-44.

2. “A D&T Roundtable: Testing Mixed Logic and DRAM Chips,” IEEE Design & Test of Computers, Vol. 15, No. 2, Apr.-June 1998, pp. 86-92.

3. R. Dekker, F. Beenker, and L. Thijssen, “A Realistic Self-Test Machine for Static Random Access Memories,” Proc. Int’l Test Conf., IEEE Computer Society Press, Los Alamitos, Calif., 1988, pp. 353-361.

4. B. Nadeau-Dostie, A. Silburt, and V.K. Agarwal, “Serial Interface for Embedded-Memory Testing,” IEEE Design & Test of Computers, Vol. 7, No. 2, Apr. 1990, pp. 52-63.

5. R. Treuer and V.K. Agarwal, “Built-In Self-Diagnosis for Repairable Embedded RAMs,” IEEE Design & Test of Computers, Vol. 10, No. 2, June 1993, pp. 24-33.

6. P. Camurati et al., “Industrial BIST of Embedded RAMs,” IEEE Design & Test of Computers, Vol. 12, No. 3, Fall 1995, pp. 86-95.

7. S. Tanoi et al., “On-Wafer BIST of a 200-Gb/s Failed-Bit Search for 1-Gb DRAM,” IEEE J. Solid-State Circuits, Vol. 32, No. 11, Nov. 1997, pp. 1735-1742.

8. J. Dreibelbis et al., “Processor-Based Built-In Self-Test for Embedded DRAM,” IEEE J. Solid-State Circuits, Vol. 33, No. 11, Nov. 1998, pp. 1731-1740.

9. R. Dekker, F. Beenker, and L. Thijssen, “Fault Modeling and Test Algorithm Development for Static Random Access Memories,” Proc. Int’l Test Conf., IEEE CS Press, 1988, pp. 343-352.

10. A.J. van de Goor, “Using March Tests to Test SRAMs,” IEEE Design & Test of Computers, Vol. 10, No. 1, Mar. 1993, pp. 8-14.

Chih-Tsun Huang received the BSEE and MSEE degrees in electrical engineering from National Tsing Hua University, Hsinchu, Taiwan, where he is now working toward the PhD. His research areas include VLSI testing, embedded core and system design, design for testability and reliability, and embedded memory testing. Huang is a student member of the IEEE.

Jing-Reng Huang received the BSEE and MSEE degrees in electrical engineering from National Tsing Hua University. He is currently working toward the PhD degree. His research interests are VLSI design, logic and memory built-in self-test, and computer arithmetic. Huang is a student member of the IEEE.

Chi-Feng Wu received the BSEE and MSEE in electrical engineering from National Tsing Hua University and is currently working toward the PhD. His research interests are testing for programmable logic devices (including FPGAs and CPLDs), memory testing, and memory fault simulation. Wu is a student member of the IEEE.

Cheng-Wen Wu is a professor in the Department of Electrical Engineering, National Tsing Hua University. He also has served as director of the university’s Computer and Communications Center and Technology Service Center. He was a guest editor of the Journal of Information Science and Engineering’s special issue on VLSI testing, and the technical program chair of the IEEE Fifth Asian Test Symposium. He received the 1996 NTHU Teaching Award and the 1997 Outstanding Electrical Engineering Professor Award from the Chinese Institute of Electrical Engineers (CIEE). Wu received the BSEE from National Taiwan University, Taipei, and the MS and PhD, both in electrical and computer engineering, from the University of California, Santa Barbara. He is a member of CIEE and a senior member of IEEE.

Tsin-Yuan Chang is an associate professor in the Department of Electrical Engineering, National Tsing Hua University. His research areas include VLSI design and testing, fault-tolerant computing, and computer arithmetic. Chang received the BS from National Tsing Hua University and the MS and PhD from Michigan State University, all in electrical engineering. Chang is a member of the IEEE.

Send questions and comments about this article to Cheng-Wen Wu, Dept. of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan, ROC; cww@ee.nthu.edu.tw.
