Modeling and Temperature Control of Rapid Thermal Processing


Eyal Dassau, Benyamin Grosman and Daniel R. Lewin † 

PSE Research Group, Dept. of Chemical Engineering, Technion I. I. T., 

Haifa 32000, Israel 

Abstract. 

In the past few years, Rapid Thermal Processes (RTP) have gained acceptance as 

a mainstream technology for semiconductor manufacturing. These processes are

characterized by single-wafer processing with very fast ramp heating of the silicon

wafer (up to 200 °C/s). The single-wafer approach allows for faster wafer processing

and better control of process parameters on the wafer. As feature sizes become smaller, 

and wafer uniformity demands become more stringent, there is an increased demand 

from rapid thermal (RT) equipment manufacturers to improve control, uniformity and 

repeatability of processes on wafers. In RT processes, the main control problem is that of 

temperature, which is complicated due to the high non-linearity of the heating process 

(radiation), process parameters that change significantly during a single wafer process 

and between processes, and difficulties in measuring temperature and edge effects. In 

work carried out in cooperation with Steag CVD Systems, we developed algorithms for 

steady state and dynamic temperature uniformity. These algorithms are designed to 

ensure uniform temperature in RTP equipment. The steady-state algorithm involves the 

reverse engineering of the required power distribution, given a history of past 

distributions and the resulting temperature profile. The algorithm for dynamic 

temperature uniformity involves the development of a first-principles model of the RTP 

chamber and wafer, its calibration using experimental data, and the use of the model to 

develop a controller. 

Keywords: Rapid thermal processing (RTP); Non-linear Model Predictive Control 

(NMPC); Genetic Algorithm (GA); Genetic Programming (GP). 

Submitted to Computers and Chemical Engineering, July 2005. 

† Author to whom all correspondence should be addressed. Email: dlewin@tx.technion.ac.il. http://pse.technion.ac.il 


1. INTRODUCTION 

Integrated circuits are at the heart of all electrical appliances. These are based mainly on 

semiconductor devices, which are fabricated in a sequence of batch chemical processes such 

as chemical vapor deposition (CVD), oxidation, nitridation, ion implantation, and annealing.

Incremental improvements in integrated circuit technology, together with increased 

performance demands from semiconductor devices, have gradually led to requirements that 

the variation in the key quality variables be reduced and to the increased yields afforded by 

larger diameter silicon wafers. This in turn has increased the reliance of the microelectronics 

industry on advanced process control (APC) strategies, and has prompted it to seek new fabrication methods.

Thermal processes play an important role in the fabrication of semiconductor chips in 

the microelectronics industry. Shrinking device dimensions to the sub-micron range make 

stringent demands on the thermal processing of semiconductor wafers. The wafer should 

spend the minimal time close to the process temperature to reduce the solid-state diffusion of 

dopants introduced in the previous fabrication steps. The drive to reduce this “thermal 

budget” and the tight quality demands gave birth to a new technology: single wafer 

processing (SWP). SWP systems must heat up and cool down quickly in order to compete 

economically with multi-wafer technology, and this has led to the development of rapid 

thermal processing (RTP). 

RTP involves the processing of single silicon wafers, and is used for various 

processes for the manufacture of semiconductor devices, such as rapid thermal annealing 

(RTA), rapid thermal oxidation (RTO), rapid thermal chemical vapor deposition (RTCVD) 

and rapid thermal nitridation (RTN). A typical RTP operating cycle consists of three phases:

(1) rapid heating to the desired operating temperature, (2) the processing phase, in which 

temperature is held constant, and (3) rapid cooling to ambient conditions. The drive to reduce 

the “thermal budget” makes RTP an attractive alternative to conventional methods of thermal 

processing. This goal imposes tight constraints on the control of the process temperature and

thickness uniformity. As feature sizes become smaller, and wafer uniformity demands 

become more stringent, there is an increased demand from rapid thermal (RT) equipment 

manufacturers to improve control, uniformity and repeatability of processes on wafers. In RT 

processes, the main control problem is that of temperature regulation, which is complicated 

due to the high non-linearity of the heating process (radiation), process parameters that 

change significantly during a single wafer process and between processes, and difficulties in 


measuring temperature and edge effects. The controller should be able to track ramped set 

point trajectories of between 50 and 200 °C/s and, subsequently, to maintain a uniform

temperature across the wafer. The rapid heating is made possible using clusters of high 

powered lamps, with the lamp configuration defining the structure of the RTP system, and the 

number of pyrometers or other temperature measuring techniques defining the character of 

the control configuration that can be implemented on the RTP system. 

To meet the control objectives, a number of alternative approaches have been

suggested. The proposed strategies involve decoupled decentralized control (Balakrishnan et

al., 1999), learning control (Yangquan et al., 1997 and Jin Young and Hyun Min, 2001),

adaptive control (Morales and Dahhou, 1998), internal model control (IMC: Schaper et al.,

1994), model predictive control (MPC: De Keyser and Donald, 1999), nonlinear MPC

(NMPC: Breedijk et al., 1993), and quadratic dynamic matrix control (QDMC: Breedijk et

al., 1994). A review of the state of the art in RTP control is provided by Edgar et al. (2000).

In this paper, two alternative control strategies are developed for temperature 

uniformity in RTP. The first strategy is applicable in cases where the RTP system has only a 

single temperature measurement positioned at the center of the wafer, and involves the 

optimal selection of two sets of heating lamp zone ratios, one of which is applied in the rapid 

heating stage, and the second in the constant-temperature processing stage. A robustly tuned 

PI controller ensures that the measured center-point temperature is maintained on its setpoint 

during the entire trajectory. The second strategy applies nonlinear model predictive control

to regulate the entire RTP temperature trajectory for uniformity. This option can only be 

implemented in RTP systems in which a number of temperature measurements are available, 

but gives superior performance. 

This paper is structured as follows. Section 2 provides a brief description of the 

commercial RTP system setup used in this work. Section 3 details the mathematical model developed to

describe the process, and its calibration. Sections 4 and 5 describe the algorithm

developed for temperature uniformity, and its application in a single-loop control strategy. 

Finally, in Section 6, we describe the application of a novel nonlinear model predictive 

controller for temperature uniformity, relying on empirical discrete models generated using 

genetic programming (Grosman and Lewin, 2002). 


2. PROCESS DESCRIPTION 

This work was carried out in cooperation with Steag CVD Systems, a vendor of RTP

chambers. Steag’s Integra Pro RTP-CVD system, used in this study and shown

schematically in Figure 1, involves a heating system consisting of 64 halogen lamps of 1.5 kW each,

arranged in five concentric banks, each of which can be adjusted independently to assist the 

uniform processing of 8″ wafers. 

Table 1 provides details of the arrangement of the lamps in these five banks, 

henceforth referred to as zones, and their locations. A single pyrometer is positioned at the 

center of the wafer, which sends temperature measurements to a PID controller that 

manipulates the total energy supply to the array of lamps. The total energy supplied in each 

zone therefore depends on the number of lamps in the zone, the fraction of the total power set 

for the zone, and the total power defined by the PID controller. Thus, for example, if zone 

two, consisting of six lamps, is set to have a power fraction of 30%, and the controller sets a 

total power of 50% of the full range, the actual power generated would be 1,500×6×0.3×0.5 = 

1,350 W. 
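The zone power arithmetic above can be written as a short helper. This is a minimal sketch: the function and constant names are ours (hypothetical), while the lamp rating and the lamp counts come from the text and Table 1.

```python
LAMP_POWER_W = 1500.0  # each halogen lamp is rated at 1.5 kW

# Number of lamps per zone, from Table 1.
LAMPS_PER_ZONE = {1: 1, 2: 6, 3: 12, 4: 19, 5: 26}


def zone_power_w(zone, zone_fraction, total_power_fraction):
    """Power generated in one zone, in watts.

    zone_fraction: fraction of the total power set for the zone (0 to 1).
    total_power_fraction: controller output as a fraction of full range (0 to 1).
    """
    if not (0.0 <= zone_fraction <= 1.0 and 0.0 <= total_power_fraction <= 1.0):
        raise ValueError("fractions must lie between 0 and 1")
    return LAMP_POWER_W * LAMPS_PER_ZONE[zone] * zone_fraction * total_power_fraction


# Zone two at a 30% power fraction, with the controller at 50% of full range.
print(zone_power_w(2, 0.30, 0.50))
```

For zone two this evaluates to the 1,350 W quoted in the text, up to floating-point rounding.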

The reaction chamber is closed from above by a quartz window, which allows for 

radiative heating of the wafer by the heating lamp array, while at the same time permitting 

wafer processing under vacuum. The heating lamps and chamber are cooled to a temperature 

of 15 o C by channels conveying chilled water through the chamber matrix, and the quartz 

window is cooled by a stream of air. The wafer is placed by a robot arm on a support, 

positioned 99.5 mm under the heating array, which spins during processing to enhance 

uniformity. The system is computer-controlled, allowing for automation and data-acquisition. 

In the commercial system, an optical pyrometer provides the only temperature measurement, 

which is positioned underneath the center of the wafer. However, in the experiments carried

out to enable the development of a mathematical model, the radial wafer temperature profile

was measured by running the equipment using thermocouple (TC) wafers of various emissivities, to which five

thermocouples are attached radially: one at the center, and the others positioned 2.5, 5, 7.5 and 9.5 cm from

the center.

[Figure 1: schematic of the system, with labels for the concentric lamp array, the quartz window, the measuring system (TC wafer) and the optical pyrometer.]

Figure 1 - Integra Pro RTP-CVD system.

Table 1. Heating System Arrangement for Steag’s Integra Pro RTP-CVD System.

Zone    Number of Lamps    Rin (mm)    Rout (mm)
1       1                  0           13.5
2       6                  18.5        45.5
3       12                 50.5        77.5
4       19                 82.5        109.5
5       26                 114.5       141.5

3. MODELING AND CALIBRATION OF THE RTP SYSTEM 

The first step in achieving a control scheme involves the development of a first-principles 

model of the RTP chamber and wafer. This is calibrated to match experimental data using 

non-linear regression. The dynamic model, expressed as a partial differential equation, has 

been approximated by finite differences. It is solved numerically using the implicit Crank-

Nicolson scheme, with some modifications to handle the non-linear temperature terms, which

are included explicitly for simplicity. The subsequent control work relies on this model as a

surrogate for the real RTP process at Steag CVD Systems. 

Modeling the Wafer. 

An energy balance on the wafer in the RTP chamber gives: 

$$\rho C \frac{\partial T}{\partial t} = q_k + q_c + q_r \tag{1}$$

where $\rho$, $C$ and $T$ are the wafer density, specific heat and temperature, $t$ is the time, and $q_k$, $q_c$ and $q_r$ are the heat transfer rates by conduction, convection and radiation, respectively. A number of simplifications can be made to the model that describes the specific equipment under investigation. Since the system has rotational symmetry, the full three-dimensional model (in r, θ and z) can be reduced to a two-dimensional one (in r and z). Furthermore, since the wafer is positioned on a rotating plate during processing, the entire wafer is represented by a radial chord, leading to a two-dimensional model in Cartesian coordinates (x, z). Finally, it is assumed that the quartz window and cooling water temperatures are constant.

Using these assumptions, the energy balance is expressed as:

$$\rho C(T) \frac{\partial T}{\partial t} = \frac{\partial}{\partial x}\left(k(T)\frac{\partial T}{\partial x}\right) + \frac{\partial}{\partial z}\left(k(T)\frac{\partial T}{\partial z}\right) \tag{2}$$

The following initial and boundary conditions apply:

$$T = T_{init} \quad \text{at } t = 0 \tag{3}$$

$$k(T)\frac{\partial T}{\partial x} = 0 \quad \text{at } x = 0 \tag{4}$$

$$k(T)\frac{\partial T}{\partial x} = -h_e\left(T - T_{wall}\right) \quad \text{at } x = R \tag{5}$$

$$k(T)\frac{\partial T}{\partial z} = F_1\,\varepsilon_1(T)\,\sigma\left(T^4 - T_{cool}^4\right) + h_w\left(T - T_{cool}\right) \quad \text{at } z = 0 \tag{6}$$

$$h_w = h_i + \left(h_o - h_i\right)\left(\frac{x}{R}\right)^4 \tag{7}$$

$$k(T)\frac{\partial T}{\partial z} = \varepsilon_2(T)\,\frac{q(x,t)}{A(x)} - F_2\,\varepsilon_2(T)\,\sigma\left(T^4 - T_a^4\right) \quad \text{at } z = Z \tag{8}$$

where $T$ is the wafer temperature, $T_{init}$ is the initial wafer temperature, $h_e$, $h_i$, $h_o$ and $h_w$ are the convective heat transfer coefficients at the wall, at the center of the wafer, at the edge of the wafer, and the overall coefficient, respectively, $T_{wall}$ is the wall temperature, $T_{cool}$ is the cooling water temperature, $T_a$ is the temperature of the quartz window, $C(T)$ is the heat capacity, $k(T)$ is the thermal conductivity, $\sigma$ is the Stefan-Boltzmann constant, $\varepsilon_1(T)$ and $\varepsilon_2(T)$ are the emissivities of the lower and upper wafer surfaces, respectively, $F_1$ and $F_2$ are reflective coefficients, $x$ and $z$ are the radial and depth coordinates, $Z$ is the wafer thickness, $R$ is the radial chord length, $A(x)$ is the effective wafer area at the chord position $x$, and $q(x, t)$ is the heat transfer rate to a given point at $x$.

The initial condition in Eq. (3) defines the initial wafer temperature. The boundary condition in Eq. (4) expresses the symmetry at the center of the wafer (x = 0). At the edge of the wafer, at x = R, the boundary condition in Eq. (5) relates the conduction in the wafer to heat losses to the reactor walls by convection. For the z direction, the boundary condition in Eq. (6) at z = 0 (i.e., below the wafer) relates the conduction in the wafer to heat losses to the surroundings by radiation and convection. The overall convective heat transfer coefficient in Eq. (7) is that proposed by Lord (1988), which accounts for spatial variations. Finally, the boundary condition in Eq. (8) at z = Z (i.e., facing the heating lamps) relates the heat transfer in the wafer to the heat supplied by the heating lamps and the heat losses to the quartz window.

The wide range of operating temperatures (between 25 and 1200 °C) affects the thermal properties of the silicon wafer. Thus, the effects of temperature on the thermal conductivity and heat capacity are accounted for using the correlations of Borisenko et al. (1997):

$$k(T) = 802.99\,T^{-1.12} \quad \left[\frac{\mathrm{W}}{\mathrm{cm\,K}}\right], \qquad 300\text{-}1683\ \mathrm{K} \tag{9}$$

$$C(T) = 0.641 + 2.473\times10^{-4}\,T \quad \left[\frac{\mathrm{J}}{\mathrm{g\,K}}\right], \qquad T > 300\ \mathrm{K} \tag{10}$$

The correlation of Virzi (1991) is employed to describe emissivity:

$$\varepsilon(T) = 0.2662 + 1.8591\,T^{-0.1996}\exp\left(-\frac{1.0359\times10^{25}}{T^{8.8328}}\right) \tag{11}$$
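The property correlations can be evaluated directly. The sketch below transcribes Eqs. (9) to (11) as read here, with T in kelvin; the function names are ours (hypothetical), and the units follow the equations (W/(cm K) and J/(g K)), so they must be converted before being combined with the SI density quoted in the text.

```python
import math


def thermal_conductivity(T):
    """Silicon thermal conductivity in W/(cm K), Eq. (9); T in kelvin (300 to 1683 K)."""
    return 802.99 * T ** -1.12


def heat_capacity(T):
    """Silicon heat capacity in J/(g K), Eq. (10); T in kelvin (above 300 K)."""
    return 0.641 + 2.473e-4 * T


def emissivity(T):
    """Silicon emissivity, Eq. (11); T in kelvin."""
    return 0.2662 + 1.8591 * T ** -0.1996 * math.exp(-1.0359e25 / T ** 8.8328)


# Tabulate the three properties over the operating range.
for T in (300.0, 700.0, 1000.0, 1400.0):
    print(T, thermal_conductivity(T), heat_capacity(T), emissivity(T))
```

The exponential term in the emissivity correlation is negligible near room temperature and approaches one at high temperature, so the emissivity rises from about 0.27 towards roughly 0.7 over the operating range.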

Since the wafer density does not depend strongly on temperature, it was taken as constant (ρ = 2,330 kg/m³). Furthermore, the weak temperature dependence of the thermal conductivity and the homogeneous nature of the silicon wafer allow the energy balance in Eq. (2) to be further approximated:

$$\rho C(T) \frac{\partial T}{\partial t} = k(T)\left(\frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial z^2}\right) \tag{12}$$

Modeling Heat Transfer to the Wafer.

The main mechanism that raises the wafer temperature to the desired processing level is radiation from the lamp array. The lamp array, which is located directly above the wafer, is arranged in five concentric rings of heating zones. Heat transfer by radiation depends on the radiation-transfer medium, its wavelength, and the system geometry. An ideal model for heat transfer by radiation in RTP needs to account for both diffuse and reflective radiation heat transfer. However, for control purposes, the heat transfer mechanism can be significantly simplified. Firstly, the radiating body is assumed to be a diffuse gray system, meaning that the surface emissivity, ε, and absorptivity, α, do not depend on the ray direction. A gray body is a body whose emissivity and absorptivity are independent of the wavelength, but may be functions of temperature. This means that each surface radiates like a black body at all wavelengths, scaled by an emissivity that depends only on its temperature (Siegel, 1981). Secondly, the lamp power must be related to the heat flux transmitted to the wafer, expressed in terms of view factors. The view factor defines the fraction of the radiation that is transferred from one surface to the other, and is derived from the system geometry using the following equation:

$$F_{1-2} = \frac{1}{A_1}\int_{A_1}\int_{A_2}\frac{\cos\theta_1\cos\theta_2}{\pi S^2}\,dA_2\,dA_1 \tag{13}$$

where F1-2 is the radiation fraction transmitted from surface 1 to surface 2, A1 and A2 are the

respective surface areas, θ1 and θ2 are the angles between each surface normal and the line joining the surfaces, and S is the

distance between the surfaces. The literature abounds with equations of view factors for 

different bodies and systems. In this model, the view factors connect the lamp array to a 

differential piece of wafer, which is expressed in terms of a differential definition of the 

radiation view factor: 

$$dF_{1-2} = \frac{\cos\theta_1\cos\theta_2}{\pi S^2}\,dA_2 \tag{14}$$

The lamp array divides naturally into five heating rings, each of which radiates to the wafer slice, as shown in Figure 2:

[Figure 2: geometry sketch with two panels. (a) An annular ring between rin and rout. (b) A point (x, y, z) on the lamp array plane and a point (x0, y0, z0) on the wafer plane, separated by the distance S.]

Figure 2 – The system geometry for view factor calculation. (a) Differential annular ring from rin to rout. (b) The plane (x, y, z) is the lamp array plane, whereas (x0, y0, z0) is the wafer plane.

By integrating Eq. (13) over a differential annular heating ring, the following relation is found for each ring:

$$F_{1-d2} = \frac{1}{2}\left(\frac{x^2 + z^2 - r_{in}^2}{\left(x^2 + z^2 + r_{in}^2\right)\sqrt{1 - \dfrac{4x^2 r_{in}^2}{\left(x^2 + z^2 + r_{in}^2\right)^2}}} - \frac{x^2 + z^2 - r_{out}^2}{\left(x^2 + z^2 + r_{out}^2\right)\sqrt{1 - \dfrac{4x^2 r_{out}^2}{\left(x^2 + z^2 + r_{out}^2\right)^2}}}\right) \tag{15}$$

Thus, the heating ring power is related to the heat flux to a differential wafer slice:

$$q(x,t) = \alpha \cdot \sum_{j=1}^{5} F_{j-x}\left(x, r_{in}, r_{out}\right) \cdot q(j) \tag{16}$$

where j is the ring number, q(j) is the power of heating ring j, $F_{j-x}$ is the view factor between ring j and the wafer slice at x, so that q(x,t) is the sum of the heating ring powers multiplied by their view factors, and α is a tunable parameter, calibrated against the experimental system at Steag, that accounts for the radiation heat transfer that is not diffuse.
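Eqs. (15) and (16) translate into a few lines of code. The following is a minimal sketch under stated assumptions: the ring radii are those of Table 1, the 99.5 mm lamp-to-wafer distance quoted in Section 2 is used as z, the square roots are folded into a single term (an algebraically equivalent form of Eq. (15)), and the function names and the default α = 1 are ours (hypothetical).

```python
import math

# Ring geometry from Table 1: (inner radius, outer radius) in mm for zones 1 to 5.
RINGS_MM = [(0.0, 13.5), (18.5, 45.5), (50.5, 77.5), (82.5, 109.5), (114.5, 141.5)]
LAMP_TO_WAFER_MM = 99.5  # distance between the heating array and the wafer support


def _disk_term(x, z, r):
    """(x^2 + z^2 - r^2) / sqrt((x^2 + z^2 + r^2)^2 - 4 x^2 r^2)."""
    s = x * x + z * z + r * r
    return (x * x + z * z - r * r) / math.sqrt(s * s - 4.0 * x * x * r * r)


def ring_view_factor(x, z, r_in, r_out):
    """Eq. (15): view factor between a wafer element at radial offset x and one ring."""
    return 0.5 * (_disk_term(x, z, r_in) - _disk_term(x, z, r_out))


def heat_flux(x, ring_powers, alpha=1.0, z=LAMP_TO_WAFER_MM):
    """Eq. (16): alpha times the sum over rings of view factor times ring power."""
    return alpha * sum(
        ring_view_factor(x, z, r_in, r_out) * q
        for (r_in, r_out), q in zip(RINGS_MM, ring_powers)
    )


# Hypothetical operating point: every lamp at half of its 1.5 kW rating.
ring_powers = [n * 1500.0 * 0.5 for n in (1, 6, 12, 19, 26)]
for x_mm in (0.0, 25.0, 50.0, 75.0, 95.0):
    print(x_mm, heat_flux(x_mm, ring_powers))
```

On the axis (x = 0) and for a full disk (r_in = 0), the view factor reduces to r_out²/(z² + r_out²), which is a convenient check on the implementation.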

Model solution 

The model was discretized using finite difference approximations and solved numerically using the implicit Crank-Nicolson method. However, an implicit treatment of the nonlinear terms introduced by radiation in the boundary conditions and by the temperature dependence of the process parameters would significantly increase the computation time. To avoid this, the nonlinear terms appear explicitly in the approximations used (Haimovich, 2000).
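The idea of combining an implicit Crank-Nicolson step for conduction with an explicit treatment of the nonlinear radiation term can be illustrated on a reduced problem. The sketch below is not the authors' two-dimensional solver: it assumes a one-dimensional thin wafer with constant properties, zero-flux ends, and radiation lumped into a source term; all parameter values, the cell-centered grid and the function names are hypothetical.

```python
SIGMA = 5.670374e-8  # Stefan-Boltzmann constant, W/(m^2 K^4)


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def step(T, dt, dx, q_abs, rho=2330.0, cp=900.0, k=30.0, Z=7.25e-4,
         eps=0.7, T_amb=300.0):
    """One semi-implicit step of rho*cp*dT/dt = k*d2T/dx2 + (q_abs - losses)/Z.

    T and q_abs are lists over at least two cells; SI units throughout.
    """
    n = len(T)
    a = k * dt / (2.0 * rho * cp * dx * dx)
    # Nonlinear radiation term, evaluated explicitly at the old temperatures.
    src = [(q_abs[i] - 2.0 * eps * SIGMA * (T[i] ** 4 - T_amb ** 4)) / (rho * cp * Z)
           for i in range(n)]
    # Number of neighbours per cell: 1 at the insulated ends, 2 inside.
    nb = [1.0 if i in (0, n - 1) else 2.0 for i in range(n)]
    rhs = []
    for i in range(n):
        left = T[i - 1] if i > 0 else 0.0
        right = T[i + 1] if i < n - 1 else 0.0
        # Explicit half of the Crank-Nicolson conduction term plus the source.
        rhs.append(T[i] + a * (left + right - nb[i] * T[i]) + dt * src[i])
    # Implicit half of the conduction term: a linear tridiagonal system.
    lower = [0.0] + [-a] * (n - 1)
    upper = [-a] * (n - 1) + [0.0]
    diag = [1.0 + a * nb[i] for i in range(n)]
    return solve_tridiagonal(lower, diag, upper, rhs)


# Hypothetical run: 10 s of uniform heating at 100 kW/m^2 from 300 K.
T = [300.0] * 20
for _ in range(1000):
    T = step(T, dt=0.01, dx=0.005, q_abs=[1.0e5] * 20)
print(round(T[0], 1))
```

Because the radiation term is evaluated at the old temperatures, each step only requires the solution of a linear tridiagonal system, which is the computational saving the explicit treatment is meant to deliver. The price is a time-step restriction from the explicit term, which is mild for the values assumed here.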

Model Calibration 

To enable the use of the developed model for improved design and control of the 

Steag RTP setup, several of the parameters in the model need to be calibrated to match 

existing process conditions. The parameters are the heat transfer coefficients, the top and bottom emissivity, the F1 and F2 reflective coefficients and α, the tunable parameter that accounts for undiffused radiation heat transfer. Since our objective is to use the model to drive the controller for the lamp power during the heating and processing portions of a processing cycle, and not during the cooling stage, which involves gas flow to the process for cooling, the heat transfer by convection is neglected and the heat transfer coefficients are set to zero in the model. The remaining parameters were calibrated by non-linear regression using a genetic algorithm (GA) as described by Lewin (1996). The GA starts from a wide initial population of candidate solutions, rather than from the single initial guess for each tunable parameter used in conventional optimization. Figure 3 shows comparisons between the model and experimental data at the five wafer locations for a typical run, for a desired set point temperature of 750 °C. The GA was driven by a desire to reduce the summed squared prediction error, computed using:

$$SSE = \frac{1}{n}\sum_{i=1}^{n}\left(T_{dat,i} - T_{mod,i}\right)^2 \tag{17}$$

where SSE is the summed squared error, $T_{dat,i}$ is the i'th temperature measurement, $T_{mod,i}$ is the i'th temperature as predicted by the model, and n is the number of measurements. The model fit is acceptable, with the largest SSE being less than 1.6, equivalent to an average error magnitude of about one degree Celsius.

[Figure 3: five panels, TC #1 to TC #5, each plotting T [K] against t [sec] from 0 to 120 sec for the model and the data.]

Figure 3 – Model results against experimental data: (1) Temperature measurement at 9.5 cm from the wafer center; (2) Temperature measurement at 7.5 cm from the wafer center; (3) Temperature measurement at 5 cm from the wafer center; (4) Temperature measurement at 2.5 cm from the wafer center; (5) Temperature measurement at the wafer center.

Figure 4 presents the normalized temperature profile against time at the five measurement points. These results show the same non-uniformity as was seen in the real RTP system, both in the rapid temperature rise and at steady state.

[Figure 4: |∆T| = |Tc(i) - Tc(1)| [°C] against time [sec] from 0 to 100 sec, for Tc(1) to Tc(5).]

Figure 4 – Temperature transients at the measurement points.

Model Validation

The last step in validating a model of a physical system is to test whether it shows the dynamic behavior expected from a study of the literature. This was accomplished by running several step test simulations to estimate the linearized process gain, with results presented in Figure 5. Schaper et al. (1994) and Kailath et al. (1996) observe that the process gain decreases as the temperature rises, and indeed the RTP model shows this relation. Furthermore, linearizing the RTP system dynamics leads to a first order model, in which the process gain is inversely proportional to the third power of temperature. Indeed, as seen in Figure 5, the step test results indicate the empirical relationship:

$$\log\left(K_p\right) = -3.664 \cdot \log\left(T\right) + 32.86 \tag{18}$$

The small difference between the empirical exponent of −3.664 and the expected value of −3 is explained by system nonlinearities, other simplifications and measurement error.
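The expected inverse-cube dependence can be seen from a lumped sketch. This is our simplification, not the full model: it assumes a spatially uniform wafer of thickness Z, an absorbed flux proportional to the lamp power u, radiative losses from both faces to surroundings at $T_a$, and a constant emissivity. Linearizing about a steady state $(T_s, u_s)$ in terms of the deviations $\delta T$ and $\delta u$ gives:

```latex
\rho C Z \frac{dT}{dt} = \alpha u - 2\varepsilon\sigma\left(T^4 - T_a^4\right)
\quad\Longrightarrow\quad
\rho C Z \frac{d\,\delta T}{dt} = \alpha\,\delta u - 8\varepsilon\sigma T_s^3\,\delta T

\frac{\delta T(s)}{\delta u(s)} = \frac{K_p}{\tau s + 1},
\qquad
K_p = \frac{\alpha}{8\varepsilon\sigma T_s^3},
\qquad
\tau = \frac{\rho C Z}{8\varepsilon\sigma T_s^3}

\log K_p = -3\log T_s + \log\frac{\alpha}{8\varepsilon\sigma}
```

Under these assumptions both the gain and the time constant fall with the cube of the operating temperature, so a log-log plot of gain against temperature has a slope of −3. The temperature dependence of the emissivity and heat capacity, neglected here, is one plausible contributor to the deviation of the fitted slope from this value.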

[Figure 5: log(Kp) against log(T), showing the step test results and a linear fit with R² = 0.9747.]

Figure 5 – The effect of temperature on process static gain.

4. ALGORITHM FOR TEMPERATURE UNIFORMITY

In the following, a solution for the temperature uniformity problem is suggested, and 

demonstrated in concert with a feedback control scheme on a simulation of the RTP 

equipment at Steag CVD Systems. This solution is designed to ensure uniform CVD of 

substrates grown in RTP equipment. Our uniformity algorithm involves the reverse 

engineering of the required power distribution, given a history of past distributions and the 

resulting temperature profile. The algorithm has been realized in MATLAB®, and a user-friendly

GUI has been developed to make it easy to use.

The uniformity algorithm is based on a linear approximation formulated in terms of deviation variables of the temperature and the power, relative to base case profiles that are supplied by the user. Since we are using deviation variables and we must set five zone powers, the user must supply at least six (independent) sets of data. The target temperature profile is set according to the following:

$$T_{sp} = \bar{T}_0 - T_0 \tag{19}$$

where $T_{sp}$ is the target profile, $T_0$ is the base case temperature profile and $\bar{T}_0$ is the average of $T_0$. The target temperature profile was defined in this way to account for temperature non-uniformity in the base case. We assume a linear model of the following structure:

$$Y = C \cdot P \tag{20}$$

$$Y = \left[\,T_1 - T_0 \;\vdots\; T_2 - T_0 \;\vdots\; \dots \;\vdots\; T_n - T_0\,\right] \tag{21}$$

$$P = \left[\,P_1 - P_0 \;\vdots\; P_2 - P_0 \;\vdots\; \dots \;\vdots\; P_n - P_0\,\right] \tag{22}$$

where Y is the matrix of temperature profiles in terms of deviation variables, P is the power distribution matrix in terms of deviation variables and C is the linear model coefficient matrix that is computed by multiple linear regression:

$$C = \left(\left(P P^T\right)^{-1} P\,Y^T\right)^T \tag{23}$$

where the superscript T denotes the matrix transpose. Having estimated a linear approximation relating power distribution to temperature profiles, the next step is the minimization of an objective function, selected according to how the wafer was measured. Thus, for data measured radially, the objective function needs to be suitably weighted:

$$\min_{P_{opt}} \sum_i \left(r_i^2 - r_{i-1}^2\right)\left(C \cdot P_{opt} - T_{sp}\right)_i^2 \tag{24}$$

Alternatively, if ellipsometrically measured data is used, no weighting is required:

$$\min_{P_{opt}} \sum_i \left(C \cdot P_{opt} - T_{sp}\right)_i^2 \tag{25}$$

In Eqs. (24) and (25), $P_{opt}$ is the optimal power distribution. Since Steag required that the first measurement point, measured by the pyrometer, should be equal to the base profile, the optimization is carried out on four power values, with the fifth being fixed using the linear model.

Algorithm validity

The basic assumption behind the algorithm is that a linear model based on temperature and power deviation variables will fit the operating conditions. Since the process is in fact nonlinear, the optimized power distribution obtained in the first iteration is only an approximation, and will probably result in a temperature profile that is not sufficiently uniform. However, running the algorithm iteratively, and adding new information as it becomes available, will lead to convergence. Figure 6 shows the convergence of the algorithm around a set point temperature of 700 °C in six runs, with an STD of 2 as the stopping condition.
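The regression and optimization steps of Eqs. (19) to (25) can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' MATLAB implementation: the function names are ours (hypothetical), the optimization is solved through weighted normal equations, and the pyrometer constraint on the first measurement point and any bounds on the zone powers are ignored. For radially measured data, the `weights` argument would carry the factors of Eq. (24).

```python
def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[p][c]) < 1e-12:
            raise ValueError("singular data: supply more independent runs")
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [[M[i][j] / M[i][i] for j in range(n, len(M[i]))] for i in range(n)]


def fit_model(powers, temps, base=0):
    """Eq. (23): least-squares C such that (T - T0) is approximately C (P - P0)."""
    P0, T0 = powers[base], temps[base]
    runs = [i for i in range(len(powers)) if i != base]
    # Rows of Pd and Yd are runs, so they hold P^T and Y^T in the paper's notation.
    Pd = [[p - p0 for p, p0 in zip(powers[i], P0)] for i in runs]
    Yd = [[t - t0 for t, t0 in zip(temps[i], T0)] for i in runs]
    PPt = matmul(transpose(Pd), Pd)      # P P^T
    PYt = matmul(transpose(Pd), Yd)      # P Y^T
    return transpose(solve(PPt, PYt))    # C = ((P P^T)^-1 P Y^T)^T


def optimal_power(C, P0, T0, weights=None):
    """Eqs. (19), (24), (25): powers that flatten the predicted temperature profile."""
    mean_T0 = sum(T0) / len(T0)
    T_sp = [[mean_T0 - t] for t in T0]   # Eq. (19), stored as a column
    w = weights or [1.0] * len(T0)
    WC = [[wi * c for c in row] for wi, row in zip(w, C)]
    WT = [[wi * t[0]] for wi, t in zip(w, T_sp)]
    Ct = transpose(C)
    dP = solve(matmul(Ct, WC), matmul(Ct, WT))  # weighted normal equations
    return [p0 + d[0] for p0, d in zip(P0, dP)]
```

In use, `powers` and `temps` are lists of past power distributions and the resulting temperature profiles, with the base case first. After each new run the data set is extended and both functions are called again, which mirrors the iterative use of the algorithm described above. On a truly linear plant, a single pass already yields a flat predicted profile.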

[Figure 6: ∆T = T set point - T mean [°C] and STD against the number of iterations (1 to 6).]

Figure 6 – Algorithm convergence.

The flow diagram for the proposed algorithm is shown in Figure 7. Note that the same algorithm can accept as input any other measurement that represents wafer uniformity (e.g., ellipsometer readings).

[Figure 7: flow diagram. Box labels recovered from the diagram: process data and power; number of data sets; radial or spiral measurement; insert or load data files and power distribution; is the data singular? (yes/no); replace the i-th data row; calculate mean and STD; auto set base; is the base o.k.? (yes/no); set base; generate T sp vector; generate Y and P perturbation matrices and calculate C matrix; start optimization; display optimal power.]

Figure 7 – Flow diagram of algorithm.

5. SINGLE-LOOP CONTROL

A simulator of the Steag RTP system was developed using MATLAB ® and SIMULINK ® to 

assist in the controller design and testing. The Steag control system relies on a PID controller, 

which controls the total power to the lamp array (0-100%), with the power distribution of the

heating zones being prespecified. To improve on the performance of the linear controller, the 

system is actually run in open loop until the center point temperature attains a temperature 

referred to as "cut-back low," at which point the controller is activated to bring the wafer 

center point temperature to the set point. It is noted that with this strategy, the predefined 

zone ratio is the only means to attain temperature uniformity. 

Controller Design 

The purpose of the controller is to assure temperature uniformity in the RTP system,

mainly during the fast heating ramp and at steady-state temperature, with the main

requirements being to minimize the overshoot and rise time of the trajectory. The controller

was designed based on Internal Model Control principles (Rivera et al., 1986). A PI 

controller was selected since this system has a relatively small delay time relative to the 

characteristic time (a delay of 0.2 sec and a characteristic time of 16 sec are typical). In such 

cases, there are no advantages in employing PID control. The solution to the uniformity 

problem was addressed in two steps: (a) The PI controller was designed to bring the center 

temperature to its set point; (b) two distinct optimal heater power zone distributions were 

generated: one for the fast ramp and one for the operating temperature. These power 

distributions were optimized using the proposed uniformity algorithm. 
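For a first-order-plus-dead-time model, one commonly quoted IMC-based PI rule sets the controller gain to τ/(K(λ + θ)) and the integral time to τ. The sketch below uses that rule with the typical delay (0.2 sec) and characteristic time (16 sec) quoted above. Whether the authors used this exact variant is not stated, and the process gain K and the closed-loop time constant λ are hypothetical tuning inputs.

```python
def imc_pi_settings(gain, tau, delay, closed_loop_tau):
    """IMC-based PI settings for the model K * exp(-delay * s) / (tau * s + 1).

    Returns (Kc, tau_I) for the controller Kc * (1 + 1 / (tau_I * s)).
    """
    if min(tau, closed_loop_tau) <= 0 or delay < 0 or gain == 0:
        raise ValueError("invalid process or tuning parameters")
    kc = tau / (gain * (closed_loop_tau + delay))
    return kc, tau


def pi_step(error, integral, kc, tau_i, dt):
    """One update of a discrete PI law; returns (controller output, new integral)."""
    integral += error * dt
    return kc * (error + integral / tau_i), integral


# Hypothetical gain of 2 (temperature units per % power) and lambda = 1 sec.
kc, tau_i = imc_pi_settings(gain=2.0, tau=16.0, delay=0.2, closed_loop_tau=1.0)
print(kc, tau_i)
```

With a delay-to-time-constant ratio of 0.2/16, the dead time barely affects the settings, which is consistent with the remark that derivative action offers no advantage here. Because the process gain falls as temperature rises, a fixed tuning is best matched to one set point.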

Control system tuning 

The algorithm for the tuning of the control system is shown in Figure 8. This provides a uniform temperature profile both during fast heating and at the process temperature, and is divided into three stages:

(a) The PI controller is tuned and the run parameters are defined: open loop power, cut-back low, temperature set point, initial zone ratio, process time, zone ratio switch temperature and controller parameters. After completing this step, the center point of the wafer will be at the set point.

(b) The uniformity algorithm is invoked to ensure temperature uniformity at the process operating temperature (at the temperature steady-state).

(c) The uniformity algorithm is invoked again to find the optimal zone ratio for the open loop part of the process, to further improve the temperature uniformity in the fast ramp and in the transition from open loop to closed loop control.

The following case studies illustrate how the algorithm works at various desired temperature levels.

[Figure 8: flow diagram in three stages, marked (1) to (3). Box labels recovered from the diagram. Inputs: zone ratio switch; initial zone ratio; temperature set point; open loop power; cut back low; process time; controller parameters. Steps and decisions: start simulation; tune the controller; run simulation; is the controller tuned?; is the overshoot too large?; raise the cut back low; define stopping condition on the STD; invoke the uniformity tool on the steady state temperature profile; new steady state zone ratio; is the STD < defined?; is the target temperature profile uniform?; set the s.s. and ramp zone ratio; invoke the uniformity tool on the knee temperature profile; new ramp zone ratio; set the ramp zone ratio; is the knee area temperature profile uniform?; done.]

Figure 8 – Flow diagram for calibrating the control system.

Case study – Process temperature target of 700 °C

[Figure 9: three panels, (1) to (3), each plotting T [°C] against t [sec] from 0 to 100 sec.]

Figure 9 – Base temperature profile for a set point of 700 °C: (1) Base profile; (2) Profile after setting the steady state zone ratio; (3) Profile after setting the ramp zone ratio.

In this case study, the temperature set point for the process is 700 ºC, the transition temperature from open to closed loop is 680 ºC, and the zone ratio powers are set arbitrarily to (zr1=0.0446, zr2=0.827, zr3=0, zr4=0.387, zr5=0.95). The first plot in Figure 9 shows the initial profile of the system which, as can be seen, exhibits uniformity problems at the target temperature as well as in the fast heating zone. An improved temperature distribution can be seen in the second plot of Figure 9, which shows the result of implementing a new zone ratio of (zr1=1, zr2=0.408, zr3=0, zr4=0.795, zr5=1), as calculated by the uniformity algorithm. This new trajectory achieves a uniformity with an STD of less than 2 ºC, as also seen in Figure 10. This solves the uniformity problem in the steady-state temperature region, but a problem remains in the fast heating stage (the transition from open to closed loop). This problem is resolved by re-running the uniformity algorithm for the fast heating stage, setting the zone ratio powers to (zr1=0.17, zr2=0.483, zr3=0, zr4=0, zr5=1) and the transition temperature (from open to closed loop) to 650 ºC. The resulting temperature profile is shown in the third plot of Figure 9, which indicates that the uniformity of the temperature trajectories is indeed significantly improved, but at the cost of a longer start-up time.
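The two quantities tracked by the uniformity algorithm, the temperature STD and the deviation of the mean temperature from the set point, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the population form of the standard deviation is an assumption (the text does not say which form is used), and the 2 ºC limit is taken from the case study above.

```python
from statistics import mean, pstdev


def uniformity_metrics(radial_temps, set_point):
    """Return (STD of the radial profile, set point minus mean temperature).

    radial_temps: temperatures [deg C] sampled along the wafer radius.
    The population standard deviation is an assumption of this sketch.
    """
    std = pstdev(radial_temps)
    delta_t = set_point - mean(radial_temps)
    return std, delta_t


def is_uniform(radial_temps, set_point, std_limit=2.0):
    """Stopping test: True when the profile STD is below the chosen limit."""
    std, _ = uniformity_metrics(radial_temps, set_point)
    return std < std_limit


if __name__ == "__main__":
    profile = [698.0, 700.0, 702.0]  # illustrative values only
    print(uniformity_metrics(profile, 700.0), is_uniform(profile, 700.0))
```

In an iterative calibration, a check like `is_uniform` would be evaluated after each new zone ratio is applied, and the loop would stop once it returns true.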

Figure 10 – Convergence of the algorithm for a set point of 700 ºC: temperature STD (right axis) and mean temperature deviation from the set point, ∆T = T set point − T mean [ºC] (left axis), against the number of iterations.

This control strategy, combining a PI controller that is tuned for a specific temperature set point and a uniformity algorithm, was tested over a wide range of temperature set points with good results. The main advantage is that the operator needs to perform only a few iterations to tune the controller to a desired operating temperature, where the trade-off between uniformity and the time needed to reach the set point is a degree of freedom that depends on the process. Furthermore, this control scheme can be easily implemented on a system that has only one temperature measurement, while still providing acceptable performance.
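The operating sequence behind this strategy (open-loop heating with the ramp zone ratio, then PI control with the steady-state zone ratio) can be sketched as below. The zone ratios, set point and 650 ºC switching temperature are the values from the case study; everything else is an assumption made for illustration: the class name, the PI gains `kc` and `tau_i`, the 0.5 s sample time, the open-loop power level, and the choice to switch the zone ratio at the same instant as the loop is closed.

```python
from dataclasses import dataclass

# Zone ratios reported in the 700 deg C case study (rings 1..5).
RAMP_ZONE_RATIO = (0.17, 0.483, 0.0, 0.0, 1.0)
STEADY_ZONE_RATIO = (1.0, 0.408, 0.0, 0.795, 1.0)


@dataclass
class SwitchedPI:
    set_point: float = 700.0       # deg C
    switch_temp: float = 650.0     # deg C, open-to-closed-loop switch
    open_loop_power: float = 1.0   # assumed open-loop power fraction
    kc: float = 0.02               # hypothetical proportional gain
    tau_i: float = 5.0             # hypothetical integral time [s]
    dt: float = 0.5                # assumed sample time [s]
    integral: float = 0.0
    closed_loop: bool = False

    def update(self, t_center):
        """Return (total power fraction, zone ratio) for one sample."""
        # Latch into closed loop once the switching temperature is reached.
        if not self.closed_loop and t_center >= self.switch_temp:
            self.closed_loop = True
        if not self.closed_loop:
            return self.open_loop_power, RAMP_ZONE_RATIO
        error = self.set_point - t_center
        self.integral += error * self.dt
        power = self.kc * (error + self.integral / self.tau_i)
        power = min(1.0, max(0.0, power))  # clamp to the available power
        return power, STEADY_ZONE_RATIO
```

Lowering `switch_temp` hands control to the PI loop earlier, which is one way to trade a longer start-up time for less overshoot.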

6. MULTIVARIABLE CONTROL 

To improve the control of RTP systems, to meet tighter uniformity specifications, to reduce the time needed to reach the set point, and otherwise to improve the flexibility of the process, there is a need for control systems more advanced than the one developed in the previous section. However, such a system requires more than one temperature measurement. To enable the control system to work at different process temperatures without any need for retuning, we implemented a non-linear model predictive control (NMPC) system based on non-linear models derived using Genetic Programming (GP), as described by Grosman and Lewin (2002). The inputs to the controller are three temperature measurements along the wafer radius. In a real system, these would have to be optical pyrometers, as the temperature measurement should not interfere with the wafer rotation.

Model Formulation: The first step was to formulate GP models for the MPC. In our study, these models were based on the mathematical model of the Steag RTP system, which represents the real system. Investigation of the mathematical model identified a response time on the order of a few seconds, leading to the selection of a sampling time of half a second. This sampling rate provides the information needed to construct the GP models. The system was excited by 12 steps of five seconds each in the inputs, in this case the power ratios for the heating rings, assuming full total power. However, since it is of interest to bring the system to a given temperature range, a specific heat loading is called for. Thus, the perturbation steps are arranged in sets of three, where the first perturbation in each set is free to activate all the heating rings while limiting the fourth and fifth heating rings to 15% of full power. In the second and third perturbations, only one randomly selected heating ring is activated. Figure 11 shows a typical perturbation sequence derived for the GP and Figure 12 presents the resulting temperature profiles.
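A perturbation sequence with this structure could be generated along the following lines. This is a sketch under stated assumptions rather than the procedure actually used: the function name is hypothetical, the power ratios are drawn from uniform distributions, and the 15% cap is applied only in the first step of each set of three.

```python
import random

N_RINGS = 5
N_STEPS = 12          # number of perturbation steps
STEP_SECONDS = 5.0    # duration of each step
SAMPLE_SECONDS = 0.5  # sampling time
OUTER_CAP = 0.15      # limit on rings 4 and 5 in the first step of a set


def perturbation_sequence(seed=0):
    """Return a list of per-sample power ratios, one 5-element list per sample."""
    rng = random.Random(seed)
    samples_per_step = int(round(STEP_SECONDS / SAMPLE_SECONDS))
    sequence = []
    for step in range(N_STEPS):
        if step % 3 == 0:
            # First step of a set: all rings active, rings 4 and 5 capped.
            levels = [rng.random() for _ in range(3)]
            levels += [rng.uniform(0.0, OUTER_CAP) for _ in range(2)]
        else:
            # Second and third steps: a single randomly chosen ring.
            levels = [0.0] * N_RINGS
            levels[rng.randrange(N_RINGS)] = rng.random()
        # Hold the step for its full duration at the sampling rate.
        sequence.extend(list(levels) for _ in range(samples_per_step))
    return sequence
```

With the stated settings the sequence spans 60 seconds, that is, 120 samples at the half-second sampling time.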

Figure 11 – Typical perturbation of the RTP for the GP. (1) – (5) different heating rings, each plotted as power fraction against t [sec].

Figure 12 – Temperature profile resulting from the system perturbation: T [ºC] against t [sec] at radial positions x = 0, 0.01, ..., 0.1.

The GP models and their prediction against data are presented below: 

1. Model for the center of the wafer:

$$T_1(i) = \bigl( zr_4(i-1) + 1.72 \cdot ( zr_2(i-1) + 1.00045 \cdot zr_3(i) ) + 1.0766 \cdot zr_5(i-1) \bigr) \cdot 15.2924 + 0.97707 \cdot T_1(i-1) + 12.9392 \tag{26}$$

Figure 13 – The behavior of the model for the center point against the process data (T [ºC] against t [sec]; curves: Data, Model).

2. Model for the second measurement point (five centimeters from the center):

$$T_2(i) = \bigl( zr_4(i-1) + 1.2417 \cdot zr_5(i-1) + 1.0959 \cdot zr_3(i-2) + 1.1334 \cdot zr_2(i-1) \bigr) \cdot 18.2151 + 0.97785 \cdot T_2(i-1) + 11.8499 \tag{27}$$

Figure 14 – The behavior of the model for the second point against the process data (T [ºC] against t [sec]; curves: Data, Model).

3. Model for the third measurement point (9.5 centimeters from the center):

$$T_3(i) = \bigl( zr_4(i-1) + 0.532122 \cdot zr_2(i-2) + 2.0857 \cdot zr_3(i-1) + 1.7898 \cdot zr_5(i-1) \bigr) \cdot 16.9627 + 0.97737 \cdot T_3(i-1) + 10.4339 \tag{28}$$

Figure 15 – The behavior of the model for the third point (wafer edge) against the process data (T [ºC] against t [sec]; curves: Data, Model).

As can be seen, the GP models are in excellent agreement with the simulated data. It should be noted that, following Grosman and Lewin (2002), the model predicts ten data points ahead from one set of known inputs. Hence, at every tenth data point, the model is reset to fit the simulated data and then used to predict the subsequent ten points. Furthermore, it is interesting to note that the GP eliminates the center heating ring from the model, probably because it has a negligible effect on the response of the system, since the center ring consists of a single lamp.
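The ten-point-ahead prediction with periodic reset can be sketched as follows. The coefficients are those of Eqs. (26) to (28) as far as they can be read from the source; the assignment of ring subscripts to individual terms, the function names, and the handling of the first sample (where no input at i−2 exists) are assumptions of this sketch.

```python
def gp_step(t_prev, zr_now, zr_m1, zr_m2):
    """Advance the three GP models by one sample.

    t_prev: (T1, T2, T3) at sample i-1.
    zr_now, zr_m1, zr_m2: ring ratios at samples i, i-1 and i-2,
    each a 5-element sequence for rings 1..5 (ring 1 is unused).
    """
    r2, r3, r4, r5 = 1, 2, 3, 4  # list positions of rings 2..5
    t1 = ((zr_m1[r4] + 1.72 * (zr_m1[r2] + 1.00045 * zr_now[r3])
           + 1.0766 * zr_m1[r5]) * 15.2924
          + 0.97707 * t_prev[0] + 12.9392)
    t2 = ((zr_m1[r4] + 1.2417 * zr_m1[r5] + 1.0959 * zr_m2[r3]
           + 1.1334 * zr_m1[r2]) * 18.2151
          + 0.97785 * t_prev[1] + 11.8499)
    t3 = ((zr_m1[r4] + 0.532122 * zr_m2[r2] + 2.0857 * zr_m1[r3]
           + 1.7898 * zr_m1[r5]) * 16.9627
          + 0.97737 * t_prev[2] + 10.4339)
    return (t1, t2, t3)


def predict_with_reset(measured, zr, block=10):
    """Free-run the models, resetting to the data every `block` samples.

    measured: list of (T1, T2, T3) data points; zr: list of ring ratios,
    one 5-element sequence per sample, same length as `measured`.
    """
    state = tuple(measured[0])
    predictions = [state]
    for i in range(1, len(measured)):
        if i % block == 0:
            state = tuple(measured[i])  # reset the model to the data
        else:
            zr_m2 = zr[i - 2] if i >= 2 else zr[0]  # assumed start-up rule
            state = gp_step(state, zr[i], zr[i - 1], zr_m2)
        predictions.append(state)
    return predictions
```

Between resets the models run only on their own previous outputs, so the comparison against data tests multi-step rather than one-step-ahead accuracy.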

NMPC for the Steag RTP System: The NMPC objective function for this application is: 

$$
\begin{aligned}
J ={}& S_1 \sum_{i=1}^{n} \bigl( zr_1(i) - zr_1(i-1) \bigr)^2 + S_2 \sum_{i=1}^{n} \bigl( zr_2(i) - zr_2(i-1) \bigr)^2 + S_3 \sum_{i=1}^{n} \bigl( zr_3(i) - zr_3(i-1) \bigr)^2 \\
&+ S_4 \sum_{i=1}^{n} \bigl( zr_4(i) - zr_4(i-1) \bigr)^2 + S_5 \sum_{i=1}^{n} \bigl( zr_5(i) - zr_5(i-1) \bigr)^2 \\
&+ S_6 \sum_{i=1}^{m} \bigl( T_1(i) - T_{1sp}(i) \bigr)^2 + S_7 \sum_{i=1}^{m} \bigl( T_2(i) - T_{2sp}(i) \bigr)^2 + S_8 \sum_{i=1}^{m} \bigl( T_3(i) - T_{3sp}(i) \bigr)^2 \\
&+ S_9 \sum_{i=1}^{m} \Bigl[ \bigl( T_3(i) - T_1(i) \bigr)^2 + \bigl( T_3(i) - T_2(i) \bigr)^2 + \bigl( T_2(i) - T_1(i) \bigr)^2 \Bigr]
\end{aligned}
\tag{29}
$$

where S1-S9 are weights, T1, T2 and T3 are temperature vectors, T1sp, T2sp and T3sp are temperature set point vectors, and the index i denotes the i-th sample point. It is noted that the objective function consists of three main parts: (a) the sum of squared errors (SSE) between the controlled variables and their set points, (b) the SSE of the controller movements, and (c) the SSE of the differences between the three measured temperatures.
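The three parts of the objective function can be evaluated as in the sketch below. It is illustrative only: the function and argument names are hypothetical, the temperature sums are assumed to run over the prediction horizon and the move sums over the control horizon, and the weights are copied from Table 2.

```python
# Weights S1..S9 as listed in Table 2.
WEIGHTS = {1: 0.0, 2: 7e6, 3: 6e6, 4: 9e6, 5: 5e6,
           6: 70.0, 7: 70.0, 8: 70.0, 9: 400.0}


def nmpc_objective(t_pred, t_sp, zr_seq, weights=WEIGHTS):
    """Evaluate J of Eq. (29).

    t_pred, t_sp: predicted temperatures and set points, one (T1, T2, T3)
        tuple per sample of the prediction horizon.
    zr_seq: ring ratios [zr(0), zr(1), ..., zr(n)], each a 5-element
        sequence, where zr(0) is the input applied at the previous sample.
    """
    # (a) tracking error of each controlled temperature, weights S6..S8
    tracking = sum(
        weights[6 + k] * sum((t[k] - sp[k]) ** 2 for t, sp in zip(t_pred, t_sp))
        for k in range(3)
    )
    # (b) controller moves of each heating ring, weights S1..S5
    moves = sum(
        weights[1 + r] * sum((zr_seq[i][r] - zr_seq[i - 1][r]) ** 2
                             for i in range(1, len(zr_seq)))
        for r in range(5)
    )
    # (c) pairwise differences between the three temperatures, weight S9
    spread = weights[9] * sum(
        (t[2] - t[0]) ** 2 + (t[2] - t[1]) ** 2 + (t[1] - t[0]) ** 2
        for t in t_pred
    )
    return tracking + moves + spread
```

An optimizer would minimize this value over the future ring ratios in `zr_seq`, using a process model to generate `t_pred` for each candidate.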

The prediction and control horizons are set to five and two, respectively, and the objective function weights are set empirically to the values listed in Table 2.

Table 2 – Weight values for the NMPC

Weight   S1   S2       S3       S4       S5       S6   S7   S8   S9
Value    0    7×10^6   6×10^6   9×10^6   5×10^6   70   70   70   400

Case study – Process temperature target of 700 ºC

As with the SISO PI control integrated with the uniformity algorithm, the NMPC was tested at three temperature operating points, where the objective was to reach the set point as fast as possible with a uniform temperature profile. Figure 16 shows an example of the NMPC performance for a set point of 700 ºC. The main benefit of NMPC is faster convergence to the temperature target, which has a direct effect on the overall thermal budget of the wafer. Furthermore, the deviation of the radial temperature profile is only of the order of one ºC. The small overshoot of approximately 30 ºC can be overcome by changing the objective function weights. The fact that one can minimize overshoot while keeping the radial temperature profile uniform at different working temperatures, without any need for retuning, is one of the advantages of NMPC.

Figure 16 – GP-NMPC control for set point tracking of 700 ºC: (1) Temperature profile in the wafer, T [ºC] against t [sec]; (2) Controller moves, ∆u against t [sec], for zr(1)=0, zr(2), zr(3), zr(4) and zr(5).

7. CONCLUSIONS 

Two distinct solutions are presented in this work: 

The first one, which could be implemented directly on the Steag CVD RTP system, involves the implementation of the uniformity algorithm and an IMC-tuned PI controller. The operating sequence calls first for a heating stage in open loop mode until a predefined temperature is reached, at which point the feedback controller takes over. It has been observed that significant temperature non-uniformity occurs both at the processing temperature and during the fast ramp, or more precisely, in the knee region between the open loop and the closed loop phases. Our solution utilizes the uniformity algorithm to set different zone ratios for the process region and for the ramp stage. The switch between the zone ratios is made at a predefined temperature, which is shifted to reduce the heating rate if the temperature profile shows unacceptable overshoot. The set point tracking is achieved by the PI controller, which brings the wafer center point to the set point temperature. By using different zone ratios, the overall temperature uniformity is kept within ±2 ºC of the set point. This solution gave acceptable performance at three distinct operating temperatures.

The second solution involves non-linear model predictive control (NMPC) based on 

genetic programming (GP). We have decided to control three points on the wafer: the first is 

at the center point as in the standard control scheme, the second is positioned five centimeters 

from the center point on a radial line, and the last one is at the wafer edge. These three points 

were picked relying on the observation that the highest non-uniformity is located near the 

wafer edge. On the real system, all of these temperature measurements would be executed 

using pyrometers, in such a way that the rotation of the wafer, common in many RTP 

systems, will not affect the measurement. The strength of this approach is that the same set of 

tuning parameters can control the RTP system at a range of operating temperature set points 

with a very short rise time to the set point and a uniform temperature profile. Although we 

have experienced overshoot, it has been observed only for a few seconds and the set point 

was maintained accurately. 

The two approaches have great potential for resolving real engineering problems associated with RTP. The simple SISO technique was developed taking into account the limitations associated with the existing RTP equipment at Steag CVD Systems, and relies on a single on-line temperature measurement and PID control. In contrast, it has been demonstrated that a nonlinear multivariable approach can significantly improve performance, but relies on additional on-line temperature measurements. Together, they provide an RTP control package that represents the state-of-the-art.

REFERENCES 

Balakrishnan, K. S., S. Shooshtarian, N. Acharya, P. J. Timans and R. P. S. Thakur (1999). “Dynamic uniformity control in a rapid thermal processing system,” Advances in Rapid Thermal Processing: Proceedings of the Symposium, Electrochemical Society Proceedings Vol. 99-10, 399-406.

Borisenko, V. E. and P. J. Hesketh (1997). Rapid thermal processing of semiconductors. 

Perseus Publishing, Cambridge. 

Breedijk, T., T. F. Edgar and I. Trachtenberg (1993). “A model predictive controller for multivariable temperature control in rapid thermal processing,” Proceedings of the 1993 American Control Conference, AACC, 3, 2980-4, Evanston, IL, USA.

Breedijk, T., T. F. Edgar and I. Trachtenberg (1994). “Model-based control of rapid thermal processes,” Proceedings of the 1994 American Control Conference, IEEE, 1, 887-91, New York, NY, USA.

De Keyser, R. and J. Donald, III (1999). “Model based predictive control in RTP semiconductor manufacturing,” Proceedings of the 1999 IEEE International Conference on Control Applications, 2, 1636-41.

Edgar, T. F., S. W. Butler, W. J. Campbell, C. Pfeiffer, C. Bode, S. B. Hwang, K. S. 

Balakrishnan and J. Hahn (2000). “Automatic control in microelectronics 

manufacturing: practices, challenges, and possibilities,” Automatica, 36(11), 1567-603. 

Grosman, B. and D. R. Lewin (2002). “Automated Nonlinear Model Predictive Control using Genetic Programming,” Computers & Chemical Engineering, 26(4-5), 631-640.

Haimovich, N. (2000). “Oxidation-Oven Physical Model,” B.Sc. Final Year Project, Technion I.I.T., Haifa.

Jin Young, C. and D. Hyun Min (2001). “A learning approach of wafer temperature control 

in a rapid thermal processing system,” IEEE Transactions on Semiconductor 

Manufacturing, 14(1), 1-10. 

Kailath, T., C. Schaper, Y. Cho, P. Gyugyi, S. Norman, P. Park, S. Boyd, G. Franklin, K. Saraswat, M. Moslehi and C. Davis (1996). “Control for advanced semiconductor device manufacturing: A case history,” The Control Handbook, W. S. Levine (Ed.).

Lewin, D. R. (1996). “Multivariable feedforward control design using disturbance cost maps 

and a genetic algorithm,” Computers & Chemical Engineering, 20(12), 1477-89. 

Lord, H. A. (1988). “Thermal and stress analysis of semiconductor wafers in a rapid thermal 

processing oven,” IEEE Transactions on Semiconductor Manufacturing, 1(3), 105-14. 

Morales, S. and B. Dahhou (1998). “Temperature uniformity in RTP using MIMO adaptive 

control,” International Journal of Adaptive Control & Signal Processing, 12(3), 227- 

245. 

Rivera, D. E., S. Skogestad and M. Morari (1986). “Internal Model Control for PID 

Controller Design,” I&EC Proc. Des. Dev., 25, 252-265. 

Schaper, C. D., M. M. Moslehi, K. C. Saraswat and T. Kailath (1994). “Modeling, 

identification, and control of rapid thermal processing systems,” Journal of the 

Electrochemical Society, 141(11), 3200-9. 

Siegel, R. (1981). Thermal radiation heat transfer. Hemisphere Pub. Corp. Edition: 2nd ed., 

Washington. 

Virzi, A. (1991). “Computer modelling of heat transfer in Czochralski silicon crystal 

growth,” Journal of Crystal Growth, 112(4), 699-722. 

Yangquan, C., X. Jian-Xin and W. Changyun (1997). “A high-order terminal iterative learning control scheme RTP-CVD application,” Proceedings of the 36th IEEE Conference on Decision and Control, 4, 3771-2.
