
A Programmable BIST Core for Embedded DRAM

CHIH-TSUN HUANG, JING-RENG HUANG, CHI-FENG WU, CHENG-WEN WU, TSIN-YUAN CHANG
National Tsing Hua University

The programmable BIST design presented here supports various test modes using a simple controller. With the March C− algorithm, the BIST circuit’s overhead is under 1.3% for a 1-Mbit DRAM and under 0.3% for a 16-Mbit DRAM.

WITH THE ADVENT OF deep-submicron 

VLSI technology, ASIC vendors are turning 

toward single-chip systems that integrate 

cores from various sources. Memory is one 

of the most universal cores—almost all system 

chips contain some type of embedded 

memory. System designers have used embedded 

static RAMs (SRAMs) widely, since 

by merging memory with logic, they can increase 

data bandwidth and reduce hardware 

cost. Now, with pad-limited, multimillion-gate

designs, embedded dynamic RAM 

(DRAM) is also becoming an attractive core. 

It provides high-capacity storage at a higher 

data rate than commodity DRAM, whose 

data rate is limited by the number of pins 

available. Embedded DRAM also reduces 

overall power consumption and hardware 

cost. Merging DRAM and logic promises to 

benefit the system-IC industry. In fact, the 

combination has begun to appear in various 

ASIC and microprocessor designs and advanced 

computer architectures. 1 

Of course, merging DRAM and logic poses 

challenges, such as ensuring process optimization 

and creating design and test 

methodologies that guarantee performance, 

quality, and reliability. Testing embedded 

DRAMs is more difficult than testing commodity 

DRAMs. 2 One test issue is accessibility. 

When the DRAM core is embedded in a 

chip and surrounded by logic blocks, accessing 

it from an external memory tester is 

costly. A system requires design for testability 

for core isolation and tester access, and 

this exacts a price in hardware overhead, 

performance penalties, and noise and parasitic 

effects. Even if this price is manageable, 

testers for full qualification and testing of embedded 

DRAMs are much more expensive 

due to the increased speed and I/O data 

width of embedded memories (compared 

with packaged commodity memories). If we 

also consider engineering change, the overall 

investment is even higher. 

A promising solution to this dilemma is 

built-in self-test. Researchers have proposed 

many BIST schemes for embedded memories. 

3-8 BIST minimizes the embedded 

DRAM’s tester requirement and greatly reduces 

memory tester time throughout the test 

flow. It also reduces total test time since parallel 

testing at the memory bank and chip 

levels is easier. BIST is also a good way to protect 

the intellectual property contained in the 

core. The embedded DRAM core provider 

need only deliver the BIST activation and response 

sequences for testing and diagnosis 

without disclosing the detailed design. 

Table 1 shows a simplified embedded

DRAM test flow, which has fewer test 

runs than the typical DRAM test flow. The 

table compares the required test support with 

and without BIST. The memory tester can 

JANUARY–MARCH 1999 0740-7475/99/$10.00 © 1999 IEEE 59


Table 1. Embedded DRAM test runs and required test support 

with and without BIST. 

perform functional test, dc/ac parametric test, and redundancy 

analysis (for laser repair). A typical BIST design supports 

only functional test, but partial support of parametric 

test, redundancy analysis, and even self-repair is possible 

with increased logic overhead. As the table indicates, BIST 

can include the entire pre-burn-in test. BIST also simplifies 

the implementation of a burn-in board (which generates 

burn-in test patterns). Final test requires a memory tester only 

for speed sorting, and BIST can handle functional test. Since 

memory testers are very expensive, the reduction of tester 

time justifies the use of BIST. 

Here, we present a BIST design and implementation for 

embedded DRAM. It supports built-in self-diagnosis by feeding 

error information to the external tester. Moreover, using 

a specific test sequence, it can test for critical timing faults, 

reducing tester time for ac parametric test. The design supports 

wafer test, pre-burn-in test, burn-in, and final test. It is 

field-programmable; the user can program test algorithms 

using predetermined test elements (such as march elements, 

surround test elements, and refresh modes). The user can 

optimize the hardware for a specific embedded DRAM with 

a set of predetermined test elements. Our design is different 

from the microprogram-controlled BIST described in 

Dreibelbis et al., 8 which has greater flexibility but higher 

overhead. Because our design begins at the register-transfer-language

level, test element insertion (for higher test coverage) 

and deletion (for lower hardware overhead) are 

relatively easy. 

Fault models 

We consider faults that may occur in the DRAM core’s address 

decoder, read/write circuitry, and memory cell array. 

We categorize address decoder faults (AFs) according to 

their functional behavior: 9,10 

■ A certain address cannot access any cell.
■ A certain address accesses multiple cells simultaneously.
■ No address can access a certain cell.
■ Multiple addresses can access a certain cell.

Table 1:

Test run          Isolation only   Isolation and BIST
Wafer probe       Tester           Tester/BIST
Pre-burn-in test  Tester           BIST
Burn-in           Burn-in board    BIST
Final test        Tester           Tester/BIST

Typical faults of the read/write circuitry (including buses, 

sense amplifiers, and write buffers) are equivalent to 

faults in the memory cell array. For memory cell array faults, 

we follow the notation used in van de Goor: 10 

■ ↑: a rising cell transition (due to a write operation)
■ ↓: a falling cell transition
■ ↕: either a rising or a falling cell transition
■ ∀: any operation at a cell
■ ⟨S/F⟩: a fault in a cell, where S is the value or operation activating the fault, F is the cell’s faulty value, S ∈ {0, 1, ↑, ↓, ↕}, and F ∈ {0, 1}
■ ⟨S1, …, Sm−1; F⟩: a fault involving m cells, where S1, …, Sm−1 are the conditions of the first m−1 cells required to activate the fault in cell m, F is the faulty value of cell m, and for all 1 ≤ i ≤ m, Si ∈ {0, 1, ↑, ↓, ↕}

The following are typical faults in the memory cell array: 9,10

■ Stuck-at fault (SAF): a cell or line sticks at 1 or 0; ⟨∀/1⟩ denotes a stuck-at-1 fault and ⟨∀/0⟩ a stuck-at-0 fault.
■ Stuck-open fault (SOF): a cell is not accessible due to, for example, a broken line or a permanent open switch.
■ Transition fault (TF): a cell fails to make a transition; it can be ⟨↑/0⟩ or ⟨↓/1⟩.
■ Data retention fault (DRF): a cell fails to retain its logic value after a prespecified period of time.
■ Coupling faults:
   Inversion (CFin): a transition in one cell inverts the content of another; that is, ⟨↑; ↕⟩ or ⟨↓; ↕⟩.
   Idempotent (CFid): a transition in one cell forces a constant value (1 or 0) into another; that is, ⟨↑; 0⟩, ⟨↑; 1⟩, ⟨↓; 0⟩, or ⟨↓; 1⟩.
   State (CFst): a coupled cell or line is forced to a certain value only if the coupling cell or line is in a given state; that is, ⟨0; 0⟩, ⟨0; 1⟩, ⟨1; 0⟩, or ⟨1; 1⟩.

The preceding single-cell fault models also apply to word-oriented memories. Coupling faults between cells in different words behave the same as in a bit-oriented memory. But coupling between cells inside the same word will virtually disappear if the write operation can override the coupling effect; that is, the write operation can correct the fault. In that case, the coupling fault can be detected only when its effect is stronger than the write operation. For example, say that a 4-bit word b3b2b1b0 has a CFst ⟨0; 0⟩, where b3 couples b2. Then writing 0101 to b3b2b1b0 results in a faulty value of 0001 when CFst is stronger than the write operation; otherwise, the fault effect is masked.
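The intraword example above can be replayed in a few lines. This is only an illustrative sketch: the function names, the bit ordering [b3, b2, b1, b0], and the assumption that the coupling always overrides the write are ours, not part of the BIST design.

```python
def store_with_cfst(word, aggressor, victim, state, forced):
    """Return the bits actually stored after writing `word`.

    `word` is a list of bits ordered [b3, b2, b1, b0]. The state coupling
    fault (state; forced) is assumed to be stronger than the write.
    """
    stored = list(word)
    if stored[aggressor] == state:
        stored[victim] = forced
    return stored


def detects(background):
    """True if writing the background or its complement stores a wrong word."""
    complement = [1 - bit for bit in background]
    return any(
        store_with_cfst(w, aggressor=0, victim=1, state=0, forced=0) != w
        for w in (background, complement)
    )


# b3 couples b2: writing 0101 leaves 0001 in the word.
print(store_with_cfst([0, 1, 0, 1], 0, 1, 0, 0))    # [0, 0, 0, 1]
# The uniform background 0000 never exposes this fault; 0101 does.
print(detects([0, 0, 0, 0]), detects([0, 1, 0, 1]))  # False True
```

Under these assumptions, a uniform background and its complement never put the coupling and coupled bits into the activating combination, which is why the choice of background matters for intraword coupling faults.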

Test algorithms 

Our BIST scheme’s default test algorithms are the march algorithms.

60 IEEE DESIGN & TEST OF COMPUTERS

Table 2 shows a bit-oriented

March C − algorithm 

as an example. 10 M 0 , …, M 5 

denote the six march elements. 

In each march element, 

we first specify the 

address sequence: ⇑ means 

the address sequence is in ascending 

order, ⇓ means the 

address changes in descending 

order, and ⇕

means either ⇑ or ⇓ is acceptable. 

Consider M 1 , for example; 

the address sequence 

begins at the lowest address 

and changes in ascending order 

toward the highest. For 

each address (memory cell), 

the algorithm performs a 

read operation (with an expected 

0 in the fault-free 

case), writes back the complemented 

bit immediately, 

and then continues to the 

next address. The algorithm 

is also called the March 10N 

algorithm because it requires 

10N read/write operations, 

where N is the number of 

memory cells (address locations). 

The March C− completely detects SAFs, unlinked AFs, unlinked TFs, and CFs (including CFins, CFids, and CFsts). 10 It also detects SOFs if we extend M1 to R0W1R1, or M2 to R1W0R0. The resulting algorithm is called the extended March C− algorithm. Since our embedded DRAM is word-oriented, we modify the 10N algorithm as ⇑(Wa); ⇑(RaWa′); ⇑(Ra′Wa); ⇓(RaWa′); ⇓(Ra′Wa); ⇑(Ra), where a represents a data word (the background word) and a′ is its complement. This word-oriented algorithm reduces to the bit-oriented algorithm when a is a single bit.

We select background words on the basis of the defined fault models and required fault coverage. Exhaustive data backgrounds are usually unaffordable and unnecessary. Although the word-oriented March C− algorithm detects all the SAFs, unlinked AFs, TFs, and SOFs, coupling faults in the same word may not be detectable. The choice of data backgrounds determines the coverage of this kind of fault.

Table 2. March C− algorithm (R: read; W: write).

March element      M0       M1        M2        M3        M4        M5
Address sequence   ⇕(W0);   ⇑(R0W1);  ⇑(R1W0);  ⇓(R0W1);  ⇓(R1W0);  ⇕(R0)
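The word-oriented March C− sequence ⇑(Wa); ⇑(RaWa′); ⇑(Ra′Wa); ⇓(RaWa′); ⇓(Ra′Wa); ⇑(Ra) can be sketched in software against a simple memory model. The classes and function below are hypothetical illustrations written for this sketch (they are not the BIST hardware or the RAMSES simulator), and the injected fault is a single stuck-at-1 bit.

```python
class Memory:
    """Fault-free word-oriented memory model."""

    def __init__(self, n_words):
        self.cells = [0] * n_words

    def read(self, addr):
        return self.cells[addr]

    def write(self, addr, value):
        self.cells[addr] = value


class StuckAt1Memory(Memory):
    """Memory in which one bit of one word always reads as 1."""

    def __init__(self, n_words, addr, bit):
        super().__init__(n_words)
        self.addr, self.bit = addr, bit

    def read(self, addr):
        value = super().read(addr)
        if addr == self.addr:
            value |= 2 ** self.bit
        return value


def march_c_minus(mem, n_words, width, background):
    """Run word-oriented March C- and return (element, addr, expected, got)."""
    mask = 2 ** width - 1
    a, a_bar = background & mask, ~background & mask
    up, down = range(n_words), range(n_words - 1, -1, -1)
    elements = [          # (address order, expected read, value written)
        (up, None, a),        # M0: write a
        (up, a, a_bar),       # M1: read a, write a'
        (up, a_bar, a),       # M2: read a', write a
        (down, a, a_bar),     # M3: read a, write a'
        (down, a_bar, a),     # M4: read a', write a
        (up, a, None),        # M5: read a
    ]
    errors = []
    for index, (order, expected, to_write) in enumerate(elements):
        for addr in order:
            if expected is not None:
                got = mem.read(addr)
                if got != expected:
                    errors.append((index, addr, expected, got))
            if to_write is not None:
                mem.write(addr, to_write)
    return errors


print(march_c_minus(Memory(16), 16, 4, 0b0101))                     # []
print(march_c_minus(StuckAt1Memory(16, 5, 1), 16, 4, 0b0101)[0])    # (1, 5, 5, 7)
```

The error list is kept rather than stopping at the first mismatch, which mirrors the diagnosis-style use in which the whole array is tested.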

Table 3. Three algorithms’ fault coverage (%): MATS++ (a); March X (b); March C− (c).

(a) MATS++
Fault   P1      P2      P3      P2,3    P1,2,3  Pall
SAF     100.0   100.0   100.0   100.0   100.0   100.0
SOF     100.0   100.0   100.0   100.0   100.0   100.0
TF      100.0   100.0   100.0   100.0   100.0   100.0
AF      99.7    99.9    99.9    100.0   100.0   100.0
CFin    100.0   100.0   100.0   100.0   100.0   100.0
CFid    37.5    37.5    37.5    62.6    75.9    89.1
CFst    50.0    50.0    50.0    75.0    87.5    100.0

(b) March X
Fault   P1      P2      P3      P2,3    P1,2,3  Pall
SAF     100.0   100.0   100.0   100.0   100.0   100.0
SOF     0.8     0.8     0.8     0.8     0.8     0.8
TF      100.0   100.0   100.0   100.0   100.0   100.0
AF      99.7    99.9    99.9    100.0   100.0   100.0
CFin    100.0   100.0   100.0   100.0   100.0   100.0
CFid    50.0    50.0    50.0    78.1    90.7    100.0
CFst    62.5    62.5    62.5    84.4    93.0    100.0

(c) March C−
Fault   P1      P2      P3      P2,3    P1,2,3  Pall
SAF     100.0   100.0   100.0   100.0   100.0   100.0
SOF     0.8     0.8     0.8     0.8     0.8     0.8
TF      100.0   100.0   100.0   100.0   100.0   100.0
AF      99.7    99.9    99.9    100.0   100.0   100.0
CFin    100.0   100.0   100.0   100.0   100.0   100.0
CFid    99.9    99.9    99.9    99.95   100.0   100.0
CFst    99.9    99.9    99.9    99.95   100.0   100.0

Fault coverage evaluation 

We know a march algorithm’s coverage of its target faults 

by definition. However, to know its coverage of other faults 

requires further analysis. For example, the March X algorithm 

was designed to test all AFs, SAFs, TFs, and CFins, so 

its coverage of these faults is 100%. But to determine its coverage 

of CFids and CFsts, we must perform analysis. 

Moreover, for a word-oriented memory such as we are discussing 

here, fault coverage also depends on the selected 

data backgrounds. Since there are so many possible faults 

and test algorithms (including address sequences, 

read/write operations, and data patterns/backgrounds), determining 

the algorithm that best balances cost and test coverage is difficult.

We group faults into two classes: single-cell faults and faults 

involving two cells (such as coupling faults). We can test for 

single-cell faults such as SAFs with an algorithm using any 

single data background because it tests all cells in the same 

way as for a bit-oriented memory. Two-cell faults, however, 

depend on the strength of the write operation and the coupling 

effect. Suppose the write operation erases the coupling 

effect between two cells in the same word. Such faults are 

redundant, and we must consider only coupling between 

two different words, so one background is sufficient. 

However, if the coupling effect is stronger than the write operation, 

we must consider coupling faults inside a word. This 

is the assumption in the following analysis. 

We can derive fault coverage by manual analysis, but that is tedious and sometimes impractical for complex test algorithms and fault models. Instead, we implemented a memory fault simulator called RAMSES (RAM Simulator for Error Screening) for this purpose. For a word-oriented memory with 4-bit words, the data backgrounds (patterns) commonly used are 0000 (P1), 0101 (P2), and 0011 (P3). To make the list complete, we also consider 0110 (P4), 0001 (P5), 0010 (P6), 0100 (P7), and 1000 (P8). We simulated several test algorithms with RAMSES, assuming a 1-Kbyte word-oriented embedded DRAM with 4-bit words. Table 3 shows the fault coverage simulation results for the algorithms, in which P2,3 stands for {P2, P3}, P1,2,3 for {P1, P2, P3}, and Pall for {P1, P2, …, P8}. We show only the results for some data backgrounds, although we performed extensive simulations. We found that, in general, P2 provides the highest fault coverage among single backgrounds, and P2,3 is the best among double backgrounds. For triple backgrounds, P1,2,3 provides the highest fault coverage. Intuitively, uniformity is not desirable so far as testing is concerned.

Figure 1. Embedded EDO DRAM connected to BIST circuitry. [Block diagram: memory BIST connected to the 4-Mbit embedded EDO DRAM through the 18-bit address, xRAS, xCAS, xWE, and 16-bit D and Q channels; inside the DRAM, the timing controller, refresh controller, row and column address buffers, row decoder (1,024), column decoder (256), sense amplifiers, data-in and data-out registers, and a 1-Mbit memory array.]

For DRAMs, we may have 

to consider additional faults, 

such as neighborhood-pattern-sensitive 

and linked 

faults. If such faults are to be 

targeted after failure analysis, 

we need simulation to select 

the best test algorithms 

for them. 

The simulation results 

show that using multiple data 

backgrounds significantly increases the coverage of coupling 

faults for MATS++ 10 and March X compared with using a single 

background. However, for March C − , the improvement is 

minor—with only a background P 1 , the algorithm covers most 

of the faults. (The SOF fault coverage in Table 3c will reach 

100% if M 1 is extended to RaWa′Ra′.) Using an additional 

background doubles the test time but detects only a small 

percentage of additional faults (intraword coupling faults). 

Also, for larger DRAMs (with the same word length), the fault 

coverage of March C − does not decrease; rather, it increases 

since the percentage of undetected faults decreases. 

DRAM specification and BIST design strategy 

Since extended data-out (EDO) DRAMs are common, we 

use a 1-Mbit × 4 EDO DRAM as our example for explaining 

the proposed BIST design. Of course, one can easily apply 

the scheme to other embedded DRAM architectures. Our 

design assumes the embedded DRAM has four memory 

banks, each organized as a 1-Mbit array; thus, it has 256K addressable 

locations, each containing four bits. Figure 1 

shows a block diagram of the embedded DRAM with the 

proposed BIST scheme (detailed later). The timing controller 

controls the address buffers, data I/O buffers, and refresh 

mechanism via signals xRAS, xCAS, and xWE, which 

represent row address strobe, column address strobe, and 

write enable. Embedded DRAMs normally use separate I/O 

channels instead of multiplexed pins as in commodity 

DRAMs, so row and column addresses and data input (D)
and output (Q) channels are all separate.
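The organization described above can be checked with a little arithmetic. The row and column counts (1,024 and 256) are the decoder sizes labeled in Figure 1; treating the four banks as accessed in parallel to form one 16-bit word is our reading of the 16-bit D and Q channels, not something the text states outright.

```python
ROWS, COLUMNS = 1024, 256          # decoder sizes labeled in Figure 1
BITS_PER_LOCATION, BANKS = 4, 4    # per-bank width and number of banks

locations = ROWS * COLUMNS                        # addressable locations
address_bits = (locations - 1).bit_length()       # width of the address bus
bank_bits = locations * BITS_PER_LOCATION         # capacity of one bank
total_bits = bank_bits * BANKS                    # capacity of the core
word_width = BITS_PER_LOCATION * BANKS            # assumed parallel access

print(locations // 1024, "K locations")           # 256 K locations
print(address_bits, "address bits")               # 18 address bits
print(bank_bits // 2**20, "Mbit per bank")        # 1 Mbit per bank
print(total_bits // 2**20, "Mbit total")          # 4 Mbit total
print(word_width, "bit data word")                # 16 bit data word
```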

One of the challenges of 

memory BIST is that the 

asynchronous memory core 

(traditional RAMs are asynchronous) 

must be tested by 

the synchronous BIST logic. 

This is especially difficult in 

embedded DRAMs. To illustrate 

how we cope with this 

problem, we use typical 

EDO DRAM timing specifications. 

The proposed strategy 

is not limited to the 

given EDO DRAM architecture. 

Figure 2 shows the 

typical EDO page-mode 

read-write cycle. Although in 

this case D and Q share the 

same I/O channel, our strategy 

still works (timing control 

of separate I/O channels 

is in fact easier). Table 4 lists 

the values of the timing parameters 

shown in Figure 2. 

The EDO page mode’s timing 

depends mainly on the 

edges of the four signals 

xRAS, xCAS, xWE, and xOE. 

They determine the time to 

latch the row address, column 

address, and input data 

for the memory core, as well 

as the output data for use by 

other cores. 

For embedded DRAM, 

which has no pin count limitation, 

D and Q can be separate 

to simplify control; 

therefore, output enable signal 

xOE can be removed 

without affecting functionality. 

The BIST logic, however, 

still needs xOE to indicate the arrival of the output data. 

We must determine an appropriate BIST clock period 

based on the xCAS cycle time (period) of the EDO page 

mode, t CAS in Table 4. In our example, the minimum t CAS is 

10 ns, which we use as the basis of the test clock period. We 

can select a test clock of up to 100 MHz. 

Table 4. Timing parameter values of EDO page-mode read-write cycle.

Parameter  Min (ns)  Max (ns)  Description
tAA        -         25        Access time from column address
tASC       0         -         Column address setup time
tASR       0         -         Row address setup time
tAWD       42        -         Column-address-to-xWE delay
tCAC       -         13        Access time from xCAS
tCAH       10        -         Column address hold time
tCAS       10        10,000    xCAS active pulsewidth
tCP        10        -         xCAS precharge pulsewidth
tCWD       28        -         xCAS-to-xWE delay
tDH        10        -         D hold time
tDS        0         -         D setup time
tOD        0         12        Output disable
tOEA       -         12        Access time from xOE
tRAC       -         50        Access time from xRAS
tRASP      55        125,000   xRAS (EDO page-mode) pulsewidth
tRCD       12        -         xRAS-to-xCAS delay
tRAH       10        -         Row address hold time
tRP        30        -         xRAS precharge pulsewidth
tWP        5         -         Write pulsewidth

Figure 2. Timing diagram of EDO page-mode read-write cycle. [Waveforms: xRAS, xCAS, Addr (row address followed by column addresses), xWE, DQ, and xOE, annotated with the timing parameters listed in Table 4.]

Instead of a memory tester, we need only a simple logic tester, which is slower and less expensive, to activate the
BIST logic and receive the test result. The BIST sequencer

(a timing sequence generator) generates timing signals 

based on the clock period; that is, it converts a long or short 

timing signal duration to a certain number of clock periods. 

Once the signal is converted, it is fixed in the BIST design; 

therefore, we must determine the clock period with care to 

avoid violation of the embedded DRAM’s timing specifications. 

In our example, we assume the clock period (and the
xCAS period) to be 20 ns (though it can be reduced almost

to 10 ns). Once the clock period is fixed, we determine the 

other two related timing parameters, xRAS and xWE, accordingly. 

Also, we shift and stretch the address, D, and Q signals 

in the original timing diagram (Figure 2) according to 

the following rules. Note that Table 4 specifies t RAC , for example, 

as no more than 50 ns. This means the embedded 

DRAM design guarantees that Q is available 50 ns after the 

falling transition of xRAS (see Figure 2); thus, the sampling 

of Q should take place at least 50 ns after xRAS. 

■ The row address must be ready before xRAS goes low, and the column address must be ready before xCAS goes low. The time the address is stable before the address strobe is usually more than one clock cycle, meeting our design’s 0-ns setup time requirement. Also, the address remains stable for more than one clock cycle, meeting the 10-ns hold time requirement.
■ The timing requirement for input data D is the same as for the column address.
■ The major parameters related to output data Q are tAA, tRAC, and tCAC, which are 25 ns (max), 50 ns (max), and 13 ns (max). The xCAS low period (tCAS) should span at least two clock cycles, since Q will settle at the beginning of the second cycle, and the clock cycle (20 ns) is longer than 13 ns. Since the column address is ready one clock cycle before the transition of xCAS (see the first rule), we let the time from xCAS to Q be two clock cycles to satisfy the tAA constraint. Finally, the first falling transition of xCAS in page mode is delayed for one more clock cycle, so there are at least three clock cycles from xRAS to Q, satisfying the tRAC specification.
■ For the write operation (as in the page-mode read-write cycle), the key parameters are tAWD and tCWD. Because Q is sampled at the second clock after xCAS goes low, one more clock cycle must be inserted into the low period of xCAS, making it at least three cycles.

Figure 3. A timing diagram generated by the sequencer. [Waveforms: BCK, xRAS, xCAS, ADDR (3ff, 0ff, 0fe, 0fd), xWE, Data in (5555), Q out (aaaa), and xOE.]

Following these rules, the sequencer can generate waveforms 

of the critical timing parameters to meet the specification. 

Slight adjustments may be necessary for other timing 

parameters. Figure 3 shows the waveform diagram of RaWa′ 

generated by the sequencer according to the rules and plotted 

by a timing simulator. The sequencer can also generate 

waveforms for other march elements and retention and refresh 

test elements according to similar rules. 
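The rules above amount to converting each analog constraint into a whole number of BIST clock cycles and checking that the chosen edge placements satisfy Table 4. The sketch below does this for the 20-ns clock; the constant names are ours, and the placement of xWE two cycles after xCAS is an assumption made for illustration (the text only says the write follows the sampling of Q).

```python
import math

# Limits from Table 4, in ns.
T_CAC, T_AA, T_RAC = 13, 25, 50    # maximum access times
T_CWD, T_AWD = 28, 42              # minimum delays to xWE

# Edge placements in clock cycles, following the rules above.
ADDR_TO_CAS = 1   # column address valid one cycle before xCAS falls
RAS_TO_CAS = 1    # first xCAS fall is one cycle after xRAS falls
CAS_TO_Q = 2      # Q is sampled two cycles after xCAS falls
CAS_TO_WE = 2     # assumption: xWE is asserted once Q has been sampled


def cycles_needed(t_ns, clock_ns):
    """Smallest whole number of clock cycles that covers t_ns."""
    return math.ceil(t_ns / clock_ns)


def check(clock_ns):
    """Return, per parameter, whether the placements meet the limit."""
    def ns(cycles):
        return cycles * clock_ns

    return {
        "tCAC": ns(CAS_TO_Q) >= T_CAC,
        "tAA": ns(ADDR_TO_CAS + CAS_TO_Q) >= T_AA,
        "tRAC": ns(RAS_TO_CAS + CAS_TO_Q) >= T_RAC,
        "tCWD": ns(CAS_TO_WE) >= T_CWD,
        "tAWD": ns(ADDR_TO_CAS + CAS_TO_WE) >= T_AWD,
    }


print(cycles_needed(T_RAC, 20))   # 3
print(check(20))
```

Because the cycle counts are fixed once the sequencer is built, a different clock period would call for rerunning such a check and, where needed, re-deriving the counts.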

BIST architecture and function 

Figure 4 diagrams our BIST design and the interface between 

the BIST logic and the embedded DRAM. The BIST activation 

control (BAC) input activates the BIST logic; the 

embedded DRAM is in normal mode when BAC is 0 and in 

BIST mode when BAC is 1. The BIST controller is a finite-state 

machine; its state transition is controlled by the BIST control 

selection (BCS) input. The BIST controller also controls the 

scan chains, shifting in test patterns and commands from the 

BIST scan-in (BSI) input and shifting out results from the BIST 

scan-out (BSO) output. As Figure 4 shows, the controller contains 

multiple chains. The decode logic and test mode selection 

modules determine the proper data register to scan in 

the test commands and subsequently activate the sequencer. 

The sequencer generates the DRAM’s timing sequence, with 

the help of some built-in counters and the timing generator. 

The comparator compares and reports any discrepancy between 

the output data from the DRAM and the original input 

data generated by the sequence controller. 

The BIST logic has three additional I/O signals. The BIST 

ready flag (BRD*) indicates when the BIST sequence is finished, 

so that the go/no-go indicator signal (BGO) can be 

sampled to check that the embedded DRAM is functioning 

correctly. The BRS*/SCAN signal acts as both the
reset and scan test control.

All registers in the BIST controller 

finite-state machine 

are scanned, and before we 

use the BIST logic to test the 

DRAM, the logic itself is scan 

tested. Finally, we need a 

BIST clock (BCK) input. 

BCK and BAC must be dedicated; others cannot share these two input pins (for example, by using multiplexers). But BRD* is optional and may be removed if pin count is a concern. In that case, we can encode BGO to signal the completion of the BIST sequence and show the test result. The reset (BRS*) also is optional, since a short synchronizing sequence for the BIST controller can be the reset sequence. However, the SCAN pin is still required in that case. Apart from BCK and BAC, all other BIST I/O signals can share pins with signals outside the DRAM core; thus, we can use multiplexed pins to reduce pin overhead.

Figure 4. Block diagram of the proposed BIST design connected to the embedded EDO DRAM. [Block diagram: the BIST controller (controller, decode logic, test mode selection, and a BIST scan path carrying burn-in commands, march commands/data, and diagnosis information) and the sequencer (sequence controller, control counter, timing generator, row and column address counters, data background composer, and comparator) drive the 4-Mbit embedded EDO DRAM through the DRAM interface buffers. Signals shown: BCK, BAC, BCS, BSI, BSO, BRS*/SCAN, BGO, BRD*, xRAS, xCAS, xWE, the 18-bit address, and 16-bit D and Q.]

The proposed BIST supports the following test modes: 

■ Scan test: testing the BIST logic, except the BIST controller finite-state machine. We execute scan test at the beginning of the BIST sequence to ensure the circuit’s correct functionality. In addition, we test all registers in the DRAM core in this mode.
■ Memory BIST: functional testing of the DRAM using march algorithms. This mode exercises various operation modes, such as non-page-mode test, page mode test, refresh test, and retention test. This mode also supports diagnosis. In that case, the BIST logic can shift out the address of any faulty cell, column, or row to the external tester via the scan mechanism. We can test for retention faults in this mode or in a separate test mode.
■ Burn-in: stress testing to screen out unreliable parts that may fail in infancy. This mode uses the BIST logic to exercise the entire memory cell array in a more efficient method than the normal read/write access. The default burn-in test is to use a march algorithm supported in the memory BIST mode.
■ Timing-fault test: testing for critical timing faults by running the BIST clock at an appropriate speed. Among these faults are incorrect setup time, hold time, and data arrival time of various control and data signals. We can simultaneously detect some timing faults, such as incorrect setup time and hold time, when we perform functional test (in memory BIST mode). We can test for others by using different BIST clock periods or an external memory tester.

Figure 5. BIST controller state diagram. [States, with transitions selected by BCS: Initial (initial/reset state: all BIST outputs retain safe values); Test_mode_in (test mode selection); Decode (command decoding); Data_in_out (data scan: shift in test inputs and shift out results); Apply (scan test application and BIST activation); Execute (memory function test, BI, AC test, etc.); Exit (pause for observation, or exit the execution phase); Probe/pause (shift out results, or pause for retention test).]

Our design can include other test modes if necessary, 

since the control scheme is flexible. Of course, for dc parameter 

test, we must still use an external tester’s parameter 

measurement unit. 

BIST implementation 

As shown in Figure 4, the BIST logic consists of two parts: 

controller and sequencer. The controller takes charge of the 

overall BIST flow, and the sequencer generates the embedded 

DRAM’s address, data, and timing sequences. At the 

ASIC level, logic BIST and memory BIST can share the same 

controller, and the on-chip processor can function as the sequencer 

during memory BIST mode. However, for a DRAM 

core delivered as intellectual property to be embedded in 

various chips, we must integrate a complete BIST circuit with 

the DRAM core. We consider the latter case. 

Controller. After the scan test mode has finished successfully, 

we enter the memory BIST mode. The BIST controller 

finite-state machine controls the scan test and BIST 

flow to test the rest of the BIST circuitry and the embedded 

DRAM. Figure 5 shows the state diagram of the proposed finite-state 

machine. Each arc in the figure represents a state 

transition controlled by BCS. We enter the initial state by asserting BRS*/SCAN low or applying a synchronizing sequence. By applying four consecutive 0s on BCS, we can return to the initial state from any other state.

From the initial state, we enter the test_mode_in state if 

BCS is 1. In this state, we select the intended test mode. The 

decode state generates all internal control signals, including 

those for the selection of the proper scan chain for the 

data sequence to be shifted in. User-specified parameters 

and the test algorithm are shifted in during the data_in_out 

state. Note that the decode, data_in_out, and apply states 

form a loop for running the scan test. We perform other test 

modes in the execute state. For memory core testing and diagnosis, 

we enter the bottom loop, which contains the execute, 

exit, and probe/pause states, and collect the error 

information in the probe/pause state. We can also run retention 

test in the probe/pause state, which allows pausing 

for a user-determined interval. 
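The controller behaves much like a small serial-controlled state machine, and the properties stated above can be checked on a transition table. The table below is a hypothetical assignment of BCS values to arcs: it is chosen to be consistent with the prose (the entry from the initial state on BCS = 1, the decode/data_in_out/apply loop, the execute/exit/probe-pause loop, the return from exit to decode, and the four-zero return to the initial state), but the authoritative arc labels are those of Figure 5, which are not reproduced here.

```python
# state: (next state when BCS = 0, next state when BCS = 1)
# Hypothetical arc labels; see the lead-in for what is assumed.
TRANSITIONS = {
    "initial":      ("initial",  "test_mode_in"),
    "test_mode_in": ("decode",   "test_mode_in"),
    "decode":       ("initial",  "data_in_out"),
    "data_in_out":  ("apply",    "data_in_out"),
    "apply":        ("decode",   "execute"),
    "execute":      ("exit",     "execute"),
    "exit":         ("decode",   "probe_pause"),
    "probe_pause":  ("execute",  "probe_pause"),
}


def run(state, bcs_bits):
    """Apply a sequence of BCS values, one per clock, and return the state."""
    for bit in bcs_bits:
        state = TRANSITIONS[state][bit]
    return state


# Four consecutive 0s on BCS return the controller to the initial state.
print(all(run(s, [0, 0, 0, 0]) == "initial" for s in TRANSITIONS))  # True
# One path from reset into the execute state.
print(run("initial", [1, 0, 1, 0, 1]))                               # execute
```

With this assignment, holding BCS at 1 keeps the controller in the shifting, executing, or pausing states, which matches the user-determined pause interval used for retention test.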

An alternative approach is to add an extra mode in the sequencer, 

using a counter for measuring the time interval from, 

for example, xCAS to xWE. We can derive appropriate timing 

sequences using similar rules as for march tests. When diagnosis 

is required, the sequencer tests the entire memory core; 

in other words, the process does not stop immediately when 

an error is detected. It is not necessary to continue the testing 

process when an error is found if we perform testing but not 

diagnosis. The sequencer will simply halt and indicate that 

an error is found, and the controller can go back to the decode 

state through the exit state. From there, either we can 

reach the initial state, or we can reenter the data_in_out state 

again. The apply, execute, exit, and probe/pause states can be 

merged if diagnosis is not required. 

Figure 6 shows the BIST circuit’s timing diagram (the entire control sequence). As discussed earlier, when BAC is 1, the DRAM enters the memory BIST mode, in which every signal is synchronized to BIST clock BCK. As depicted in the figure, BRS*/SCAN is pulled high at the beginning of the memory BIST mode to perform the scan test verifying the BIST controller’s correctness. A scan chain forms between BSI and BSO for applying patterns and collecting responses in this phase. After the scan, BRS*/SCAN is pulled low to reset the BIST controller (BCS remains low to generate the reset sequence if necessary). The BIST controller then performs a scan test for the rest of the BIST circuit. The test algorithm is subsequently applied according to the control flow discussed earlier and the finite-state machine shown in Figure 5. Finally, after BRD* is asserted high and BGO is sampled, we let BAC equal 0 to return the DRAM to normal mode.

Figure 6. BIST circuit control sequence. [Waveforms: BCK, BAC (normal mode, BIST mode), BRS*/SCAN, BCS, BRD*, BGO, BSI, and BSO; phase labels include controller test, reset sequence, scan test control, scan test, scan in, scan out, test patterns, test outputs, commands/data, BIST control sequence, go/no-go, and observe.]

In the controller, we implemented several default 

read/write commands, address orders, data backgrounds, 

and EDO DRAM access modes. The built-in read/write commands 

are Ra (read the expected word a), Wa (write word 

a), RaWa′ (read word a, complement, and then write back 

immediately), and RaWa′Ra′. The default address orders include 

⇑ and ⇓, implemented by an up-down counter. The 

built-in access modes to be used in conjunction with the address 

orders are row scan, column scan, page-mode column 

scan, and refresh. The data background word (a) is supplied 

online. Each march element is a combination of the appropriate 

read/write command, address order, access mode, 

and data background. 

In addition to the march commands, our BIST also supports 

diagnosis, burn-in, and retention test. Other test commands 

can be integrated easily. In our scheme, a test 

algorithm is a sequence of commands entered from the BSI 

pin to the scan chains, decoded, and executed (see Figure 

4). When the controller encounters a special end-of-algorithm

command, it detects the end of a test algorithm.

In our default implementation, most march algorithms can

be programmed, including the extended March C−, March

X, March Y, MATS++, and others.
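To make the idea of a march element concrete, the following Python sketch writes the standard March C− test as a list of (address order, read/write command, data background) triples and applies it to a small software memory model. It is an illustration only, not the authors' hardware or command encoding: the names MarchElement, march_c_minus, run, and make_memory are hypothetical, the command strings use an ASCII apostrophe for the prime, and the stuck-at-0 fault model is an assumption chosen to show a detection.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

UP, DOWN = "up", "down"  # ascending and descending address orders


@dataclass(frozen=True)
class MarchElement:
    order: str       # UP or DOWN
    command: str     # "Wa", "Ra", "RaWa'", or "RaWa'Ra'"
    background: int  # data background word a


def march_c_minus(a: int, mask: int) -> List[MarchElement]:
    """Standard March C- expressed with the built-in commands (a' = complement of a)."""
    na = ~a & mask
    return [
        MarchElement(UP, "Wa", a),        # (w a); either order is allowed
        MarchElement(UP, "RaWa'", a),     # up:   (r a,  w a')
        MarchElement(UP, "RaWa'", na),    # up:   (r a', w a)
        MarchElement(DOWN, "RaWa'", a),   # down: (r a,  w a')
        MarchElement(DOWN, "RaWa'", na),  # down: (r a', w a)
        MarchElement(DOWN, "Ra", a),      # (r a); either order is allowed
    ]


def run(elements: List[MarchElement],
        read: Callable[[int], int],
        write: Callable[[int, int], None],
        size: int, mask: int) -> Optional[Tuple[int, int]]:
    """Apply the elements; return (element index, address) of the first mismatch, else None."""
    for i, el in enumerate(elements):
        addrs = range(size) if el.order == UP else range(size - 1, -1, -1)
        a, na = el.background, ~el.background & mask
        for addr in addrs:
            if el.command == "Wa":
                write(addr, a)
                continue
            if read(addr) != a:          # every other command starts with Ra
                return (i, addr)
            if el.command in ("RaWa'", "RaWa'Ra'"):
                write(addr, na)
            if el.command == "RaWa'Ra'" and read(addr) != na:
                return (i, addr)
    return None


def make_memory(size: int, stuck_at_zero: Optional[Tuple[int, int]] = None):
    """Hypothetical memory model; stuck_at_zero=(address, bit) forces that bit to 0."""
    cells = [0] * size

    def read(addr: int) -> int:
        return cells[addr]

    def write(addr: int, value: int) -> None:
        if stuck_at_zero is not None and addr == stuck_at_zero[0]:
            value &= ~(1 << stuck_at_zero[1])
        cells[addr] = value

    return read, write
```

With an all-zero background on a fault-free 16-word model, run returns None. With bit 2 of word 5 stuck at 0, the third element reads back a word that differs from the expected complement and the run stops there, which mirrors the halt-on-error behavior described for the sequencer.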

Sequencer. In designing the sequencer, our major goal was flexibility. Our sequencer design can be used for a wide range of DRAM cores with various operation modes, memory dimensions, and timing specifications. Figure 7 shows the state diagram of the sequence controller finite-state machine (see Figure 4). As the state diagram shows, we implemented timing sequence generation modules for the single read/write commands and the page-mode read/write commands for the march elements defined in the controller. We also implemented a refresh timing generation module for refresh tests. The figure shows the sequence controller’s default implementation, which is designed for march tests, but we can easily extend it to other algorithms.

An important concern is that the sequencer’s outputs be 

glitch-free, and that they be in high impedance when BIST is not 

in use—that is, in normal operation mode. Our implementation 

takes these requirements into consideration. In Figure 7, the state 

transition is on BCK’s rising edge, while the control (timing) signals 

for the DRAM core are applied on BCK’s falling edge. 

Consequently, the sequencer’s outputs are guaranteed glitch-free. 

When the embedded DRAM is in normal mode, the sequence 

controller is in the idle state, where it stays until the 

BIST controller enters the execute state. Then the sequence 

controller fetches the march commands, enters the reset 

state, and carries out the sequence for the specified memory 

access mode. For memory access modes such as Ra, Wa, 

RaWa′, and RaWa′Ra′, the timing waveform is periodic, and 

the period depends on the row access cycle. We inserted 

appropriate refresh cycles for refresh timing. The page-mode 

access cycle consists of the row access and column access 

cycles. The DRAM core latches the row address first; then it 

latches the column addresses of the whole page one by one. 

In the self-refresh/hidden-refresh/RAS-only-refresh state, the 

embedded DRAM is tested for its refresh mechanism. 
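The state flow just described can be sketched as a short Python function that lists the states visited for one read/write command. This is a simplified reading of the text, not the authors' state machine: the function name sequencer_trace and its parameters are hypothetical, the state names follow the labels of Figure 7, refresh states are left out, and the assumption that every command ends in the Done state is ours.

```python
from typing import List

# Single read/write commands named in the text (ASCII apostrophe for the prime).
READ_WRITE_COMMANDS = ("Ra", "Wa", "RaWa'", "RaWa'Ra'")


def sequencer_trace(command: str, page_mode: bool = False, columns: int = 1) -> List[str]:
    """Return the assumed state sequence for one command (one row in page mode)."""
    if command not in READ_WRITE_COMMANDS:
        raise ValueError(f"unsupported command: {command}")
    trace = ["Idle", "Reset"]  # leave idle once the BIST controller enters execute
    if page_mode:
        # The row address is latched first, then each column of the page in turn.
        trace.append("Page-mode row")
        trace.extend([f"Page-mode column {command}"] * columns)
    else:
        trace.append(command)
    trace.append("Done")
    return trace
```

For example, sequencer_trace("Wa", page_mode=True, columns=3) visits the page-mode row state once and the page-mode column Wa state three times, whereas a non-page-mode command passes through its single command state.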

Figure 7. State diagram of the sequence controller for march tests. (States shown: Idle, Reset, Refresh, Ra, Wa, RaWa′, RaWa′Ra′, page-mode row followed by page-mode column Ra, Wa, RaWa′, or RaWa′Ra′, self-refresh/hidden refresh/RAS-only refresh, and Done.)

Table 5. Comparison of embedded-DRAM test application methodologies.

Test method      Test time  Hardware cost  Coverage
Ours             Short      Low            Functional, timing, burn-in
Processor-based  Short      Low            Functional
Scan             Long       Low            Functional
Tester           Short      Very high      Functional, timing, dc

As shown in Figure 4, the sequencer design is based on counters. If the memory size increases, only the lengths of the row address and column address counters and the size of the comparator increase. Only one additional bit is required for an address counter when memory size doubles, so hardware overhead is low. The control counter, designed to meet the refresh time specification, is used for retention/refresh test. Refresh time specifications for various DRAMs currently in use do not differ much regardless of size, so the sequencer’s area overhead actually drops when the DRAM core’s size increases. In our example, a 21-bit counter suffices if the refresh cycle does not exceed 32 ms. The size of the entire BIST logic for the EDO DRAM core, without burn-in and redundancy analysis, is about 2,000 to 3,000 gates.
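The counter widths quoted above can be checked with a few lines of Python. The sketch assumes that the control counter counts BIST clock cycles and that the clock runs at 50 MHz, the rate the Discussion section assumes for its test-time estimate; the text does not state the clock used for the 21-bit figure, so treat that pairing as our assumption. The function names are hypothetical.

```python
def counter_bits(count: int) -> int:
    """Smallest counter width that can represent `count` distinct values."""
    if count < 1:
        raise ValueError("count must be positive")
    return max(1, (count - 1).bit_length())


def refresh_counter_bits(refresh_ms: int, clock_hz: int) -> int:
    """Width needed to count every clock cycle in one refresh interval."""
    cycles = refresh_ms * clock_hz // 1000  # assumed: one count per clock cycle
    return counter_bits(cycles)


# 32 ms at an assumed 50 MHz is 1,600,000 cycles, which needs 21 bits.
REFRESH_BITS = refresh_counter_bits(32, 50_000_000)

# Doubling the number of addresses adds exactly one address-counter bit.
ADDRESS_BITS = [counter_bits(2 ** k) for k in (20, 21, 22)]  # 20, 21, 22
```

Because the refresh counter width depends only on the refresh interval and the clock, it stays fixed as the memory grows, while each doubling of the address space costs one address-counter bit. That is why the relative overhead falls for larger cores.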

Following a command decoded by the controller, the sequencer 

generates the DRAM core’s required output signals, 

using a small look-up table (LUT). The LUT-based design reduces 

design effort and hardware cost by allowing new test 

commands to be added easily. When the timing specifications 

change, a simple program generates the LUT content 

automatically. The LUT-based design is an important step 

toward a BIST compiler for embedded DRAMs. It is configurable 

at the register-transfer-language level but does not 

modify the architecture. For non-march algorithms such as 

pseudorandom and surround test, one must design specific 

address counters or counter configurations for the sequencer 

and add new commands to the state diagram. 
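As a rough illustration of how a program might generate LUT content from timing specifications, the sketch below converts three timing parameters into clock-cycle counts and emits one row of active-low control values per cycle for a single read access. It is a minimal sketch under stated assumptions, not the authors' generator: the parameter names tRCD, tRAS, and tRP are the usual DRAM datasheet symbols, the numeric values in the usage note are hypothetical, and the row format (RAS*, CAS*, WE*) is our own choice.

```python
import math
from typing import Dict, List, Tuple

Row = Tuple[int, int, int]  # (RAS*, CAS*, WE*), all active low


def to_cycles(t_ns: float, clock_ns: float) -> int:
    """Round a timing requirement up to a whole number of clock cycles."""
    return max(1, math.ceil(t_ns / clock_ns))


def read_cycle_lut(spec_ns: Dict[str, float], clock_ns: float) -> List[Row]:
    """Build one LUT row per clock cycle for a single (non-page-mode) read.

    Assumed waveform: RAS* falls at cycle 0, CAS* falls tRCD later, both rise
    after tRAS, and the row then precharges for tRP. WE* stays high for a read.
    The sketch expects tRAS to be longer than tRCD.
    """
    rcd = to_cycles(spec_ns["tRCD"], clock_ns)
    ras = to_cycles(spec_ns["tRAS"], clock_ns)
    rp = to_cycles(spec_ns["tRP"], clock_ns)
    lut: List[Row] = []
    for cycle in range(ras + rp):
        ras_n = 0 if cycle < ras else 1
        cas_n = 0 if rcd <= cycle < ras else 1
        lut.append((ras_n, cas_n, 1))
    return lut
```

Calling read_cycle_lut({"tRCD": 20, "tRAS": 60, "tRP": 40}, 20) with these hypothetical values yields five rows; rerunning it with a different clock period or specification regenerates the table without touching the sequencer structure, which is the property the LUT-based design relies on.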

Discussion 

We used a commercial synthesis tool and a single-poly, 

triple-metal logic cell library to estimate our BIST circuit’s area. 

Figure 8 plots BIST area overhead with respect to various 

DRAM core sizes. The DRAM areas are based on areas of existing 

0.25-µm and 0.35-µm EDO DRAM chips reported by major 

vendors. Comparisons are based on the areas estimated 

by the synthesis tool, so they are not exact. Also, it is impossible 

(and unnecessary) for us to project the precise size of 

our BIST circuit on all these DRAM chips. Since the overhead 

is very low, we expect the exact area overhead numbers to 

be close to those shown in the figure. Thus, the BIST area overhead 

for our default BIST design is about 1.3% for a 1-Mbit embedded 

DRAM and negligible for a 64-Mbit version. Even for 

the 16-Mbit DRAM, the most popular embedded DRAM candidate 

currently, the BIST area overhead is less than 0.3%. 

Clearly, the larger the DRAM core, the smaller the BIST area 

overhead. 

Since the area overhead is low, one can include more test modes and algorithms to increase coverage, as long as test time does not become a problem. The test time for non-page-mode March C−, for example, is about 0.4 seconds for the 4-Mbit core (assuming a 50-MHz clock). It increases approximately in proportion to the address space. To reduce test time, one can explore parallel testing of multiple banks or even multiple words by separate BIST sequencers, but that requires very careful modification of the memory core. Note that after dicing, an external memory tester is not needed until after burn-in, an important BIST benefit.

Figure 8. Area overhead of the BIST core. (Plot of memory area and BIST area overhead in percent against memory sizes of 1, 4, 16, and 64 Mbits.)

Table 5 qualitatively summarizes some embedded DRAM test application methods, including our BIST implementation, processor-based BIST, scan-based serial testing [4], and direct memory tester access. Our method is very suitable for embedded DRAM, especially when the BIST circuitry is to be integrated with the DRAM core to form a single piece of intellectual property for use in various ASICs. Moreover, by properly configuring the controller and sequencer, we can support timing-fault testing and burn-in. Processor-based BIST is also popular in an ASIC environment, where an existing processor is available for the DRAM BIST designer. But it is not suitable for an embedded DRAM designed as a standalone, reusable intellectual property core. Serial BIST is not popular because it is slow. Also, the second and third methods normally support only functional test. The external memory tester is the most powerful but most expensive method. It supports all kinds of test (except burn-in) with very high resolution and programmability, and at a very high cost.

WE HAVE PROPOSED a flexible and cost-effective BIST design for embedded DRAMs. It supports march-based tests and diagnosis and timing specification tests. Our approach is flexible because additional test commands (other than march elements) can be included with little effort. It is cost-effective because test time is short, hardware overhead is low, and test coverage is high. It can also support burn-in if the DRAM core design is modified for that purpose. Together with the RAMSES memory fault simulator, the proposed BIST approach makes designing and implementing appropriate BIST circuits for various embedded DRAMs systematic and easy.

This BIST design has been implemented in an industry project in which we plan to evaluate its effectiveness in the future. Meanwhile, under demand from industry, we are working on an extended version that will incorporate built-in redundancy analysis and support more fault models.

Acknowledgments

Global UniChip Corporation (GUC), Hsinchu, Taiwan, under contract NTHU-0987-113J6, partly supported this work.

References

1. D. Patterson et al., “A Case for Intelligent RAM,” IEEE Micro, Vol. 17, No. 2, Mar.-Apr. 1997, pp. 34-44.

2. “A D&T Roundtable: Testing Mixed Logic and DRAM Chips,” IEEE Design & Test of Computers, Vol. 15, No. 2, Apr.-June 1998, pp. 86-92.

3. R. Dekker, F. Beenker, and L. Thijssen, “A Realistic Self-Test Machine for Static Random Access Memories,” Proc. Int’l Test Conf., IEEE Computer Society Press, Los Alamitos, Calif., 1988, pp. 353-361.

4. B. Nadeau-Dostie, A. Silburt, and V.K. Agarwal, “Serial Interface for Embedded-Memory Testing,” IEEE Design & Test of Computers, Vol. 7, No. 2, Apr. 1990, pp. 52-63.

5. R. Treuer and V.K. Agarwal, “Built-In Self-Diagnosis for Repairable Embedded RAMs,” IEEE Design & Test of Computers, Vol. 10, No. 2, June 1993, pp. 24-33.

6. P. Camurati et al., “Industrial BIST of Embedded RAMs,” IEEE Design & Test of Computers, Vol. 12, No. 3, Fall 1995, pp. 86-95.

7. S. Tanoi et al., “On-Wafer BIST of a 200-Gb/s Failed-Bit Search for 1-Gb DRAM,” IEEE J. Solid-State Circuits, Vol. 32, No. 11, Nov. 1997, pp. 1735-1742.

8. J. Dreibelbis et al., “Processor-Based Built-In Self-Test for Embedded DRAM,” IEEE J. Solid-State Circuits, Vol. 33, No. 11, Nov. 1998, pp. 1731-1740.

9. R. Dekker, F. Beenker, and L. Thijssen, “Fault Modeling and Test Algorithm Development for Static Random Access Memories,” Proc. Int’l Test Conf., IEEE CS Press, 1988, pp. 343-352.

10. A.J. van de Goor, “Using March Tests to Test SRAMs,” IEEE Design & Test of Computers, Vol. 10, No. 1, Mar. 1993, pp. 8-14.

Chih-Tsun Huang received the BSEE and 

MSEE degrees in electrical engineering from 

National Tsing Hua University, Hsinchu, Taiwan, 

where he is now working toward the 

PhD. His research areas include VLSI testing, 

embedded core and system design, design for 

testability and reliability, and embedded 

memory testing. Huang is a student member of the IEEE. 



Jing-Reng Huang received the BSEE and 

MSEE degrees in electrical engineering from 

National Tsing Hua University. He is currently 

working toward the PhD degree. His research 

interests are VLSI design, logic and 

memory built-in self-test, and computer arithmetic. 

Huang is a student member of the IEEE. 

Chi-Feng Wu received the BSEE and MSEE 

in electrical engineering from National Tsing 

Hua University and is currently working toward 

the PhD. His research interests are testing 

for programmable logic devices 

(including FPGAs and CPLDs), memory testing, 

and memory fault simulation. Wu is a student 

member of the IEEE. 

Cheng-Wen Wu is a professor in the Department 

of Electrical Engineering, National Tsing 

Hua University. He also has served as 

director of the university’s Computer and 

Communications Center and Technology Service 

Center. He was a guest editor of the Journal 

of Information Science and Engineering’s 

special issue on VLSI testing, and the technical program chair of the 

IEEE Fifth Asian Test Symposium. He received the 1996 NTHU 

Teaching Award and the 1997 Outstanding Electrical Engineering 

Professor Award from the Chinese Institute of Electrical Engineers 

(CIEE). Wu received the BSEE from National Taiwan University, 

Taipei, and the MS and PhD, both in electrical and computer engineering, 

from the University of California, Santa Barbara. He is a 

member of CIEE and a senior member of IEEE. 

Tsin-Yuan Chang is an associate professor 

in the Department of Electrical Engineering, 

National Tsing Hua University. His research 

areas include VLSI design and testing, fault-tolerant

computing, and computer arithmetic.

Chang received the BS from National Tsing 

Hua University and the MS and PhD from 

Michigan State University, all in electrical engineering. Chang is a 

member of the IEEE. 

Send questions and comments about this article to Cheng-Wen 

Wu, Dept. of Electrical Engineering, National Tsing Hua University, 

Hsinchu, Taiwan, ROC; cww@ee.nthu.edu.tw. 
