12.01.2015 Views

System Programming

<strong>System</strong> <strong>Programming</strong><br />

Chapter 5<br />

Compiler<br />

1


Compiler<br />

2


Basic Compiler Functions<br />

Grammars<br />

Lexical Analysis<br />

Syntactic Analysis<br />

Code Generation<br />

3


Terminology<br />

• Statement (敘述)<br />

–Declaration, assignment containing expression (運算式)<br />

• Grammar (文法)<br />

–A set of rules that specifies the form of legal statements<br />

• Syntax (語法) vs. Semantics (語意)<br />

–Example: assuming I, J, K: integer and X,Y: float<br />

–I:=J+K vs. X:= I+Y<br />

• Compilation (編譯)<br />

–Matching statements (written by programmers) to structures<br />

(defined by the grammar) and generating the appropriate<br />

object code<br />

4


Basic Compiler<br />

•Lexical analysis - scanner<br />

–Scanning the source statement, recognizing and<br />

classifying the various tokens<br />

•Syntactic analysis - parser<br />

–Recognizing the statement as some language<br />

construct.<br />

–Construct a parse tree (syntax tree)<br />

•Code generation - code generator<br />

–Generate assembly language code<br />

–Generate machine code (object code)<br />

5


High-Level <strong>Programming</strong> Language<br />

• A high-level programming language is described in terms of a<br />

grammar, which specifies the syntax of legal statements.<br />

– An assignment statement:<br />

• a variable name + an assignment operator + an expression<br />

6


Grammars<br />

•A grammar for a language is a formal<br />

description of the syntax.<br />

–The grammar does not describe the semantics<br />

(meaning) of the various statements.<br />

•Example: I, J, K: integer and X,Y: float<br />

–I:=J+K vs. I:= X+Y<br />

–Identical syntax<br />

–Different semantics<br />

• integer arithmetic operation<br />

• Floating-point addition<br />

–Very different sequences of machine instructions<br />

• Recognized during code generation<br />

8


BNF (Backus-Naur Form)<br />

• A simple and widely used notation for writing grammars<br />

introduced by John Backus and Peter Naur in about 1960.<br />

• A BNF grammar consists of a set of rules, each of which defines<br />

the syntax of some construct in the programming language.<br />

• Meta-symbols of BNF:<br />

– ::= "is defined as"<br />

– | "or"<br />

– < > angle brackets used to surround non-terminal symbols<br />

• Entries not enclosed in angle brackets are terminal symbols of the grammar<br />

(i.e., tokens).<br />

• A BNF rule defining a nonterminal has the form:<br />

– nonterminal ::= sequence_of_alternatives consisting of strings of terminals<br />

(tokens) or nonterminals separated by the meta-symbol |<br />

9


Simplified Pascal Grammar<br />

Recursive rule<br />

10


Parse Tree<br />

(Syntax Tree)<br />

READ(VALUE)<br />

VARIANCE:=SUMSQ DIV 100<br />

–MEAN*MEAN<br />

The multiplication and division<br />

precede the addition and<br />

subtraction<br />

12


Parse Tree<br />

•If there is more than one possible parse tree for a<br />

given statement, the grammar is said to be<br />

ambiguous.<br />

13


Parse Tree<br />

14


Parse Tree<br />

15


Scanner<br />

•Recognize keywords, operators, integers,<br />

floating-point numbers, character strings and<br />

identifiers.<br />

•The exact set of tokens to be recognized depends<br />

on the programming language being compiled.<br />

16


Lexical Analysis<br />

•Function<br />

–Scanning the program to be compiled and<br />

recognizing the tokens that make up the source<br />

statements.<br />

•Tokens<br />

–Tokens can be keywords, operators, identifiers,<br />

integers, floating-point numbers, character strings, etc.<br />

–Each token is usually represented by some fixed-length<br />

code, such as an integer, rather than as a<br />

variable-length character string (see Figure 5.5)<br />

–Token type, Token specifier (value) (see Figure 5.6)<br />

17


Lexical Analysis<br />

• Tokens might be defined by grammar rules<br />

to be recognized by the parser:<br />

• For better efficiency, a scanner can be used<br />

instead to recognize and output the tokens in<br />

a sequence represented by fixed-length<br />

codes (such as integers) and the associated<br />

token specifiers.<br />
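As a rough illustration of what such a scanner hands to the parser, the sketch below turns one source line into (token code, token specifier) pairs. The numeric codes are hypothetical placeholders, not the coding scheme of Figure 5.5, and the names `scan`, `CODES`, `ID_CODE`, and `INT_CODE` are assumptions made for this example.

```python
import re

# Hypothetical fixed-length codes; the actual scheme is the one in Figure 5.5.
CODES = {"READ": 1, "DIV": 2, "(": 3, ")": 4, ":=": 5, "+": 6, "-": 7, "*": 8}
ID_CODE, INT_CODE = 20, 21

TOKEN_RE = re.compile(r"\s*(:=|[()+\-*]|[A-Za-z][A-Za-z0-9]*|[0-9]+)")

def scan(line):
    """Return a list of (code, specifier) pairs for one source line."""
    tokens, pos = [], 0
    line = line.rstrip()
    while pos < len(line):
        m = TOKEN_RE.match(line, pos)
        if m is None:
            raise ValueError(f"unrecognized input at column {pos}")
        text = m.group(1)
        pos = m.end()
        if text in CODES:                  # keyword or operator: no specifier
            tokens.append((CODES[text], None))
        elif text[0].isdigit():            # integer: specifier is its value
            tokens.append((INT_CODE, int(text)))
        else:                              # identifier: specifier is its name
            tokens.append((ID_CODE, text))
    return tokens
```

For example, `scan("READ ( VALUE )")` yields four pairs, with only the identifier carrying a specifier. A real scanner would store a symbol-table pointer rather than the name itself.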

18


Token Specifier<br />

•The scanner is designed to enter identifiers<br />

directly into a symbol table when they are first<br />

recognized.<br />

•A token specifier for an identifier is a pointer to<br />

the corresponding symbol-table entry (e.g.,<br />

^SUM for identifier, #100 for integer).<br />

–Avoids much of the need for table searching during<br />

the rest of the compilation process.<br />

19


Scanner Output<br />

•Token specifier<br />

–Identifier name, integer value, (type)<br />

•Token coding scheme<br />

–Figure 5.5<br />

20


Lexical<br />

Scan<br />

21


Parser vs. Scanner<br />

•The scanner operates as a procedure that is called<br />

by the parser when it needs another token.<br />

•Each call to the scanner would produce the<br />

coding for the next token in the source program.<br />

•The parser would be responsible for saving any tokens<br />

that it might require for later analysis.<br />

22


Languages<br />

•In FORTRAN<br />

–DO 10 I = 1, 100<br />

• DO: keyword<br />

• 10: a statement number<br />

• I: identifier<br />

–DO 10 I = 1<br />

• DO10I: identifier<br />

23


Special Statement of FORTRAN<br />

IF (THEN .EQ. ELSE) THEN<br />

IF = THEN<br />

ELSE<br />

THEN = IF<br />

ENDIF<br />

24


Token Recognizer<br />

•By grammar<br />

<ident> ::= <letter> | <ident><letter> | <ident><digit><br />

<letter> ::= A | B | C | D | … | Z<br />

<digit> ::= 0 | 1 | 2 | 3 | … | 9<br />

•By scanner - modeling as finite automata<br />

–Figure 5.8 (a)<br />

25


Modeling Scanners as Finite<br />

Automata<br />

• Tokens can often be recognized by a finite automaton,<br />

which consists of<br />

–A finite set of states (including a starting state and one or<br />

more final states)<br />

–A set of transitions from one state to another<br />

26


Finite Automata for Scanner<br />

•If the automaton stops in<br />

a final state, we say<br />

that it recognizes (or<br />

accepts) the string<br />

being scanned.<br />

•If it stops in a nonfinal<br />

state, it fails to<br />

recognize (or rejects)<br />

the string.<br />
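A minimal sketch of this accept/reject behaviour, assuming a two-state automaton for identifiers (a letter followed by any number of letters or digits). The state numbers and the names `TRANSITIONS`, `classify`, and `accepts` are assumptions for the example, not taken from Figure 5.8.

```python
import string

# State 1 is the starting state; state 2 is the only final state.
TRANSITIONS = {
    (1, "letter"): 2,
    (2, "letter"): 2,
    (2, "digit"): 2,
}
FINAL_STATES = {2}

def classify(ch):
    """Map a character to the input class used on the transitions."""
    if ch in string.ascii_letters:
        return "letter"
    if ch in string.digits:
        return "digit"
    return "other"

def accepts(s):
    """Run the automaton over s; True only if it ends in a final state."""
    state = 1
    for ch in s:
        state = TRANSITIONS.get((state, classify(ch)))
        if state is None:        # no transition defined: the automaton stops
            return False
    return state in FINAL_STATES
```

Here `accepts("X1")` holds while `accepts("1X")` and `accepts("")` do not, because neither leaves the automaton in state 2.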

27


Finite Automata for Typical Tokens<br />

The finite automata can recognize<br />

all of the tokens in Figure 5.<br />

Underscore character<br />

The notation A-Z specifies any character from A to Z<br />

28


Token<br />

Recognition<br />

Algorithm<br />

A typical algorithm to<br />

recognize identifiers<br />

that may contain underscores.<br />

30


Syntactic Analysis<br />

Operator-Precedence Parsing<br />

Recursive-Descent Parsing<br />

31


Syntactic Analysis<br />

•Syntactic analysis: building the parse tree for the<br />

statements being translated<br />

•Parse tree<br />

–Root: goal grammar rule<br />

–Leaves: terminal symbols<br />

•Methods:<br />

–Bottom-up: operator-precedence parsing<br />

–Top-down: recursive-descent parsing<br />

32


Syntactic Analysis<br />

• Recognize source statements as language constructs or<br />

build the parse tree for the statements.<br />

–Bottom-up<br />

• Operator-precedence parsing<br />

• Shift-reduce parsing<br />

• LR(0) parsing<br />

• LR(1) parsing<br />

• SLR(1) parsing<br />

• LALR(1) parsing<br />

–Top-down<br />

• Recursive-descent parsing<br />

• LL(1) parsing<br />

33


Precedence<br />

•A + B * C –D<br />

•Multiplication and division have higher precedence<br />

than addition and subtraction.<br />

–+ has lower precedence than *, written + < *<br />

•In terms of the parse tree, this means that the *<br />

operation appears at a lower level than does<br />

either + or -.<br />

• > : the previous one has higher precedence than<br />

the later one<br />

• = : the two tokens have equal precedence.<br />

34


Operator-Precedence Parsing<br />

• The operator-precedence method uses the precedence<br />

relations between consecutive operators to guide the<br />

parsing process.<br />

A + B * C - D<br />

<br />

• Subexpression B*C is to be computed first because *<br />

has higher precedence than the surrounding operators;<br />

this means that * appears at a lower level than does +<br />

or - in the parse tree.<br />

• Precedence:<br />

< = ><br />

35


Precedence Matrix<br />

later<br />

previous<br />

36


Precedence<br />

•; END END ;<br />

< ><br />

–When ; is followed by END, the ; has higher<br />

precedence.<br />

–When END is followed by ;, the END has higher<br />

precedence.<br />

•Empty means that these two tokens cannot<br />

appear together in any legal statement.<br />

•; BEGIN and ; BEGIN can not<br />

exist.<br />

< ><br />

37


Operator-Precedence Parsing<br />

•The parser has identified the portion of the<br />

statement delimited by the precedence relations<br />

< and > to be interpreted in terms of the grammar.<br />

•An operator-precedence parser generally uses a<br />

stack to save tokens that have been scanned but not yet<br />

parsed, so it can re-examine them.<br />
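The stack discipline can be sketched for plain arithmetic expressions as follows. This is a simplified illustration, not the full method: it replaces the precedence matrix with a small numeric table (`PREC`), keeps operands and operators on two separate stacks, and assumes well-formed input. All names are hypothetical.

```python
PREC = {"+": 1, "-": 1, "*": 2, "DIV": 2}

def relation(previous, later):
    """'>' if the operator already on the stack must be reduced first, else '<'."""
    return ">" if PREC[previous] >= PREC[later] else "<"

def parse(tokens):
    """Build a parse tree as nested (operator, left, right) tuples."""
    operands, operators = [], []

    def reduce():
        op = operators.pop()
        right = operands.pop()
        left = operands.pop()
        operands.append((op, left, right))

    for tok in tokens:
        if tok in PREC:
            # Reduce while the stacked operator has precedence over the new one.
            while operators and relation(operators[-1], tok) == ">":
                reduce()
            operators.append(tok)      # shift the operator
        else:
            operands.append(tok)       # shift an operand
    while operators:
        reduce()
    return operands[0]
```

For `A + B * C - D` the multiplication is reduced first, so it ends up lowest in the tree: `('-', ('+', 'A', ('*', 'B', 'C')), 'D')`.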

38


Example: READ ( VALUE )<br />

39


Example: VARIANCE:=SUMSQ DIV 100 –MEAN*MEAN<br />

40


Example: VARIANCE:=SUMSQ DIV 100 –MEAN*MEAN<br />

41


Example: VARIANCE:=SUMSQ DIV 100 –MEAN* MEAN*MEAN<br />

42


Example: VARIANCE:=SUMSQ DIV 100 –MEAN* MEAN*MEAN<br />

43


Bottom-up Parsing<br />

•The parse tree is constructed from the<br />

terminal nodes up toward the root.<br />

44


Operator Precedence vs.<br />

Shift-Reduce Parsing<br />

•The ideas behind the operator-precedence<br />

technique were developed into shift-reduce parsing.<br />

45


Shift-Reduce Parsing<br />

• Operator-precedence parsing can deal with the<br />

operator grammars having the property that no<br />

production right side has two adjacent nonterminals.<br />

• Shift-reduce parsing<br />

–It makes use of a stack to store tokens that have not yet been<br />

recognized in terms of grammar.<br />

–Actions:<br />

• Shift: push the current token onto the stack<br />

–Shift roughly corresponds to the action taken by an operator-precedence<br />

parser when it encounters the relations < and =.<br />

• Reduce: recognize symbols on top of the stack according to a<br />

grammar rule.<br />

–Reduce roughly corresponds to the action taken by an operator-precedence<br />

parser when it encounters the relation >.<br />

• The most powerful shift-reduce parsing technique is<br />

called LR(k).<br />

46


Example: READ ( VALUE )<br />

47


Recursive-Descent Parser<br />

•A recursive-descent parser is made up of a<br />

procedure for each nonterminal symbol in the<br />

grammar.<br />

•Each nonterminal symbol in the grammar is<br />

associated with a procedure.<br />

•When a procedure is called, it attempts to find a<br />

substring of the input, beginning with the current<br />

token, that can be interpreted as the nonterminal.<br />

48


Left Recursion<br />

• <dec-list> ::= <dec> | <dec-list> ; <dec><br />

–If the procedure for <dec-list> decides to try the second alternative<br />

(<dec-list> ; <dec>), it would immediately call itself<br />

recursively to find a <dec-list>.<br />

–Results in an unending chain.<br />

•Modification<br />

– <dec-list> ::= <dec> {; <dec>}<br />

49


Recursive-Descent Parsing<br />

• A recursive-descent parser is made up of a procedure<br />

for each nonterminal symbol in the grammar.<br />

–The procedure attempts to find a substring of the input that<br />

can be interpreted as the nonterminal.<br />

–The procedure may call other procedures, or even itself<br />

recursively, to search for other nonterminals.<br />

–The procedure must decide which alternative in the<br />

grammar rule to use by examining the next input token.<br />

• Top-down parsers cannot be directly used with a<br />

grammar containing immediate left recursion.<br />

–An unending chain<br />

• Two grammars<br />

– <id-list> ::= id | <id-list> , id<br />

– <id-list> ::= id { , id }<br />

50


Extension to BNF<br />

•id {, id }<br />

–The terms between { and } may be omitted, or<br />

repeated one or more times.<br />

–With the revised definition, the procedure simply<br />

looks first for an id, and then keeps scanning the<br />

input as long as the next two tokens are a comma (,)<br />

and id.<br />
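The loop described here might look like the sketch below. It assumes tokens arrive as a list of strings and that anything satisfying Python's `str.isidentifier` counts as an id; the name `check_id_list` and its return convention (number of tokens consumed) are choices made for this example.

```python
def check_id_list(tokens):
    """Match id {, id} at the start of tokens.

    Returns the number of tokens consumed, or 0 if the list does not
    start with an id.
    """
    def is_id(i):
        return i < len(tokens) and tokens[i].isidentifier()

    if not is_id(0):
        return 0
    pos = 1
    # Keep going only while the next two tokens are a comma and an id.
    while pos < len(tokens) and tokens[pos] == "," and is_id(pos + 1):
        pos += 2
    return pos
```

Because the loop checks both the comma and the following id before consuming either, a trailing comma is left for the caller to report.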

51


Modified Grammar without Left Recursion<br />

still recursive, but a<br />

chain of calls always<br />

consumes at least one<br />

token<br />

52


Recursive-Descent Parsing of<br />

READ<br />

53


Recursive-Descent Parsing of<br />

IDLIST<br />

54


check_read()<br />

{<br />

if( get_token()==‘READ’&&<br />

get_token()==‘(’&&<br />

check_id-list()==true &&<br />

get_token()==‘)’)<br />

return(true);<br />

else<br />

return(false);<br />

}<br />

55


check_prog()<br />

{<br />

if( get_token()==‘PROGRAM’&&<br />

check_prog-name()==true &&<br />

get_token()==‘VAR’&&<br />

check_dec-list()==true &&<br />

get_token()==‘BEGIN’&&<br />

check_stmt-list()==true &&<br />

get_token()==‘END.’)<br />

return(true);<br />

else<br />

return(false);<br />

}<br />

56


check_for()<br />

{<br />

if( get_token()==‘FOR’&&<br />

check_index-exp()==true &&<br />

get_token()==‘DO’&&<br />

check_body()==true)<br />

return(true);<br />

else<br />

return(false);<br />

}<br />

57


check_stmt()<br />

{<br />

/* Resolve alternatives by look-ahead */<br />

if( next_token()==id )<br />

return check_assign();<br />

if( next_token()==‘READ’)<br />

return check_read();<br />

if( next_token()==‘WRITE’)<br />

return check_write();<br />

if( next_token()==‘FOR’)<br />

return check_for();<br />

return(false); /* no alternative matched */<br />

}<br />

58


Left Recursive<br />

• 3 <dec-list> ::= <dec> | <dec-list> ; <dec><br />

• 3a <dec-list> ::= <dec> {; <dec>}<br />

check_dec-list()<br />

{<br />

flag=true;<br />

if(check_dec()==false)<br />

flag=false;<br />

while(next_token()==‘;’)<br />

{<br />

get_token();<br />

if(check_dec()==false)<br />

flag=false;<br />

}<br />

return flag;<br />

}<br />

59


• 10 <exp> ::= <term> | <exp> + <term> | <exp> - <term><br />

• 10a <exp> ::= <term> {+ <term> | - <term>}<br />

check_exp()<br />

{<br />

flag=true;<br />

if(check_term()==false)<br />

flag=false;<br />

while(next_token()==‘+’ || next_token()==‘-’)<br />

{<br />

get_token();<br />

if(check_term()==false)<br />

flag=false;<br />

}<br />

return flag;<br />

}<br />

60


Example: READ ( VALUE )<br />

61


Recursive-Descent Procedure for Assign<br />

<assign> ::= id := <exp><br />

62


Recursive-Descent Procedure for EXP<br />

<exp> ::= <term> { + <term> | - <term> }<br />

63


Recursive-Descent Procedure for TERM<br />

<term> ::= <factor> { * <factor> | DIV <factor> }<br />

64


Recursive-Descent Procedure for FACTOR<br />

<factor> ::= id | int | ( <exp> )<br />
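The four procedures for Assign, EXP, TERM, and FACTOR can be put together in one small sketch. It assumes tokens are plain strings, builds a tree of nested tuples instead of generating code, and treats `DIV` as the only reserved word; the class and method names are hypothetical.

```python
KEYWORDS = {"DIV"}

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expect(self, tok):
        if self.advance() != tok:
            raise SyntaxError(f"expected {tok!r}")

    def assign(self):        # assign ::= id := exp
        target = self.advance()
        if target is None or not target.isidentifier() or target in KEYWORDS:
            raise SyntaxError("expected identifier")
        self.expect(":=")
        return (":=", target, self.exp())

    def exp(self):           # exp ::= term { + term | - term }
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):          # term ::= factor { * factor | DIV factor }
        node = self.factor()
        while self.peek() in ("*", "DIV"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):        # factor ::= id | int | ( exp )
        tok = self.advance()
        if tok == "(":
            node = self.exp()
            self.expect(")")
            return node
        if tok is not None and tok not in KEYWORDS and (tok.isidentifier() or tok.isdigit()):
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")
```

Each loop consumes a token before calling the next procedure, so no chain of calls can recurse without making progress through the input.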

65


Recursive-Descent Parsing (1/3)<br />

id1 := SUMSQ DIV 100 –MEAN * MEAN<br />

66


Recursive-Descent Parsing (2/3)<br />

67


Recursive-Descent Parsing (3/3)<br />

68


Code Generation<br />

• When the parser recognizes a portion of the source<br />

program according to some rule of the grammar, the<br />

corresponding semantic routine (code generation<br />

routine) is executed.<br />

• As an example, symbolic representation of the object<br />

code for a SIC/XE machine is generated.<br />

• Two data structures are used for working storage:<br />

–A list (associated with a variable LISTCOUNT)<br />

–A stack<br />

69


• SUM,SUMQ,I,VALUE,MEAN,VARIANCE:INTEGER;<br />

– SUM WORD 0<br />

– SUMQ WORD 0<br />

– I WORD 0<br />

– VALUE WORD 0<br />

– MEAN WORD 0<br />

– VARIANCE WORD 0<br />

• SUM:=0;<br />

– LDA #0<br />

– STA SUM<br />

• SUM:=SUM+VALUE;<br />

– LDA SUM<br />

– ADD VALUE<br />

– STA SUM<br />

70


• VARIANCE := SUMQ DIV 100 - MEAN * MEAN;<br />

Version using TEMP1, TEMP2, and TEMP3:<br />

– TEMP1 WORD 0<br />

– TEMP2 WORD 0<br />

– TEMP3 WORD 0<br />

– LDA SUMQ<br />

– DIV #100<br />

– STA TEMP1<br />

– LDA MEAN<br />

– MUL MEAN<br />

– STA TEMP2<br />

– LDA TEMP1<br />

– SUB TEMP2<br />

– STA TEMP3<br />

– LDA TEMP3<br />

– STA VARIANCE<br />

Version using a single TEMP:<br />

– TEMP WORD 0<br />

– LDA MEAN<br />

– MUL MEAN<br />

– STA TEMP<br />

– LDA SUMQ<br />

– DIV #100<br />

– SUB TEMP<br />

– STA VARIANCE<br />

71


Example: READ ( VALUE )<br />

Argument passing<br />

placed in register L<br />

72


Terminology<br />

• Token specifier S(id) is the name of the identifier, or<br />

pointer to the symbol-table entry.<br />

• S(int) is the value of the integer.<br />

• The node specifier S() is set to rA, indicating that<br />

the result of the computation is in register A.<br />

• The variable REGA is used to indicate the highest-level<br />

node of the parse tree whose value is left in register A<br />

by the code generated so far.<br />

• Procedure GETA generates a LDA instruction to load a<br />

value into register A.<br />

73


Example: VARIANCE:=SUMSQ DIV 100 –MEAN*MEAN<br />

74


Example: VARIANCE:=SUMSQ DIV 100 –MEAN*MEAN<br />

75


Other Code-<br />

Generation<br />

Routines<br />

76


Other Code-<br />

Generation<br />

Routines<br />

77


Compiler<br />

•Basic Compiler Functions<br />

•Machine-Dependent Compiler Features (5.2)<br />

•Machine-Independent Compiler Features (5.3)<br />

79


Intermediate Form<br />

• The syntax and semantics of the source statements have<br />

been completely analyzed, but the actual translation into<br />

machine code has not yet been performed.<br />

• It is much easier to analyze and manipulate the<br />

intermediate form of a program than the machine code.<br />

• Operation, op1, op2, result<br />

–Operation: some function to be performed by the object code<br />

–op1 and op2 are the operands for the operation<br />

–Result: where the resulting value is to be placed<br />

80


Example<br />

81


Code Optimization<br />

83


Potential Improvement<br />

84


Intermediate Form of the<br />

Program<br />

• Representation of the executable instructions with a<br />

sequence of quadruples:<br />

operation, op1, op2, result<br />

• For example:<br />

85


Intermediate<br />

Code<br />

86


Quadruple Analysis for Code Optimization<br />

•Intermediate results can be assigned to registers<br />

or to temporary variables to make their use as<br />

efficient as possible.<br />

•Quadruples can be rearranged to eliminate<br />

redundant load and store operations.<br />

87


Assignment and Use of Registers as<br />

Instruction Operands<br />

• We would prefer to keep in registers all variables and<br />

intermediate results that will be used later in the program.<br />

• Consider “VALUE” in quadruples 7 and 9, “MEAN” in<br />

quadruples 16 and 18.<br />

• Register selection for replacement:<br />

– Scan the program for the next point at which each register value would<br />

be used.<br />

– Select the one whose value will not be needed for the longest time.<br />

– Save the value of the selected register to a temporary variable if<br />

necessary.<br />

• Be careful about the control flow of the program when<br />

assigning and using registers:<br />

– Consider “SUM” in quadruples 1 and 7.<br />
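The replacement rule (pick the register whose value is needed furthest in the future) can be sketched as below. It assumes quadruples are tuples `(operation, op1, op2, result)` and that the scan stays inside straight-line code, which is exactly the control-flow caveat above. The function name and the data layout are assumptions.

```python
def select_register(registers, quads, current):
    """Choose a register to reuse.

    registers maps a register name to the value it currently holds.
    Returns the register whose value is next used latest at or after
    quadruple index `current`, or never used again.
    """
    def next_use(value):
        for i in range(current, len(quads)):
            if value in quads[i][1:3]:      # used as op1 or op2
                return i
        return float("inf")                 # not needed again

    return max(registers, key=lambda r: next_use(registers[r]))
```

A value that is never used again wins immediately, and in that case nothing has to be saved to a temporary variable.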

88


Basic Blocks<br />

•One way to deal with the control flow is to divide<br />

the program into basic blocks.<br />

•A basic block is a sequence of quadruples with<br />

one entry point (beginning of the block), one exit<br />

point (end of the block), and no jumps within the<br />

block.<br />

•Assignment and use of registers within a basic<br />

block can follow the method previously<br />

described.<br />
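One common way to find the blocks is to mark "leaders": the first quadruple, every jump target, and every quadruple that follows a jump. The sketch below assumes a hypothetical encoding in which jump quadruples (`J`, `JEQ`, `JLT`, `JGT`) carry the 0-based index of their target in the result field.

```python
JUMPS = {"J", "JEQ", "JLT", "JGT"}

def basic_blocks(quads):
    """Return the basic blocks as lists of quadruple indices."""
    leaders = {0} if quads else set()
    for i, (op, _op1, _op2, result) in enumerate(quads):
        if op in JUMPS:
            if result < len(quads):
                leaders.add(result)      # a jump target starts a block
            if i + 1 < len(quads):
                leaders.add(i + 1)       # so does the quadruple after a jump
    starts = sorted(leaders)
    ends = starts[1:] + [len(quads)]
    return [list(range(s, e)) for s, e in zip(starts, ends)]
```

Within each returned block, control enters at the first index and leaves at the last, so the register bookkeeping never has to consider a value arriving from another path.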

89


Basic Blocks<br />

A<br />

B<br />

C<br />

D<br />

E<br />

90


Rearrangement of Quadruples<br />

DIV SUMSQ #100 i1<br />

* MEAN MEAN i2<br />

- i1 i2 i3<br />

:= i3 VARIANCE<br />

* MEAN MEAN i2<br />

DIV SUMSQ #100 i1<br />

- i1 i2 i3<br />

:= i3 VARIANCE<br />

LDA SUMSQ<br />

DIV #100<br />

STA T1<br />

LDA MEAN<br />

MUL MEAN<br />

STA T2<br />

LDA T1<br />

SUB T2<br />

STA VARIANCE<br />

LDA MEAN<br />

MUL MEAN<br />

STA T1<br />

LDA SUMSQ<br />

DIV #100<br />

SUB T1<br />

STA VARIANCE<br />
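The effect of the rearrangement can be reproduced with a very small code generator. This is a sketch under strong assumptions: a single basic block, register A only, intermediate results assigned once, and temporaries named T1, T2, ... in order of first use. The names `generate`, `MNEMONIC`, and `COMMUTATIVE` are hypothetical.

```python
MNEMONIC = {"+": "ADD", "-": "SUB", "*": "MUL", "DIV": "DIV"}
COMMUTATIVE = {"+", "*"}

def generate(quads):
    """Emit SIC/XE-style instructions for a list of quadruples."""
    code = []
    in_a = None   # intermediate result currently held in register A
    temps = {}    # intermediate result -> temporary it was saved to

    def loc(name):
        return temps.get(name, name)

    def save_a_if_needed(start):
        # Save register A only if its value is an operand of a later quadruple.
        if in_a is not None and any(in_a in q[1:3] for q in quads[start:]):
            temps[in_a] = f"T{len(temps) + 1}"
            code.append(f"STA {temps[in_a]}")

    for i, (op, a, b, result) in enumerate(quads):
        if op == ":=":
            if in_a != a:
                save_a_if_needed(i)
                code.append(f"LDA {loc(a)}")
            code.append(f"STA {result}")
            in_a = None
            continue
        if in_a == b and in_a != a and op in COMMUTATIVE:
            a, b = b, a                    # operand order does not matter
        if in_a != a:
            save_a_if_needed(i)
            code.append(f"LDA {loc(a)}")
        code.append(f"{MNEMONIC[op]} {loc(b)}")
        in_a = result
    return code
```

With the quadruples in their original order this produces nine instructions. With the multiplication moved first, i1 is still in register A when the subtraction needs it, and the result is seven instructions.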

91


Compiler<br />

•Basic Compiler Functions<br />

•Machine-Dependent Compiler Features (5.2)<br />

•Machine-Independent Compiler Features (5.3)<br />

•Compiler Design Options (5.4)<br />

92


Structured Variables<br />

•Array<br />

•Record<br />

•String<br />

•Set<br />

93


Array<br />

•A: ARRAY[1..10] OF INTEGER<br />

•If each INTEGER variable occupies one word of<br />

memory, we must allocate ten words to store this<br />

array.<br />

•General case<br />

–ARRAY[l..u] of INTEGER<br />

–Allocate u-l+1 words of storage for this array<br />

94


Multi-dimensional Array<br />

•B: ARRAY[0..3, 1..6]<br />

–4*6=24 words<br />

•General case<br />

–ARRAY[$l_1$..$u_1$, $l_2$..$u_2$] of INTEGER<br />

• The number of words to be allocated is $(u_1 - l_1 + 1) \times (u_2 - l_2 + 1)$<br />

95


Row-Major vs. Column-Major<br />

•Row-major<br />

–All array elements that have the same value of the<br />

first subscript are stored in contiguous locations<br />

0,1 0,2 0,3 0,4 1,1 1,2 1,3 1,4 2,1 2,2 2,3 2,4 3,1 3,2 3,3 3,4 4,1 4,2 4,3 4,4<br />

Row 0 Row 1 Row 2 Row 3 Row 4<br />

•Column-major<br />

–All array elements that have the same value of the<br />

second subscript are stored in contiguous locations<br />

0,1 1,1 2,1 3,1 4,1 0,2 1,2 2,2 3,2 4,2 0,3 1,3 2,3 3,3 4,3 0,4 1,4 2,4 3,4 4,4<br />

Column 1 Column 2 Column 3 Column 4<br />
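The two orderings listed on this slide (first subscript 0..4, second subscript 1..4) can be generated with a pair of nested loops; only the nesting order differs. The variable names are chosen for this example.

```python
rows = range(0, 5)   # first subscript: 0..4
cols = range(1, 5)   # second subscript: 1..4

# Row-major: the second subscript varies fastest.
row_major = [(i, j) for i in rows for j in cols]

# Column-major: the first subscript varies fastest.
column_major = [(i, j) for j in cols for i in rows]
```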

96


Array Reference<br />

•How to calculate the address of the referenced<br />

element relative to the base address of the array<br />

•A: ARRAY[1..10] OF INTEGER<br />

–A[6]: its address relative to the starting<br />

address of the array is 5*3 = 15 (each word occupies 3 bytes).<br />

•General case:<br />

–ARRAY[l..u] OF INTEGER and each array element<br />

occupies w bytes of storage<br />

–A[s]: the relative address of A[s] is w*(s-l)<br />

97


Two-Dimensional Array<br />

•B: ARRAY[0..3, 1..6]<br />

–B[2, 5]<br />

• 2 * 6 + 4 = 16<br />

Reference<br />

•B: ARRAY[$l_1$..$u_1$, $l_2$..$u_2$] of INTEGER<br />

–The relative address of B[$s_1$, $s_2$] is $w \times [(s_1 - l_1) \times (u_2 - l_2 + 1) + (s_2 - l_2)]$<br />
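The address formula is easy to check numerically. The sketch below assumes row-major storage and an element size of w = 3 bytes (one SIC word); it also extends the same pattern to any number of dimensions, which goes beyond what the slides state. The function name is hypothetical.

```python
def relative_address(subscripts, bounds, w=3):
    """Row-major byte offset of an element from the base of its array.

    bounds is a list of (lower, upper) pairs, one per dimension;
    w is the element size in bytes.
    """
    offset = 0
    for s, (low, high) in zip(subscripts, bounds):
        if not low <= s <= high:
            raise IndexError(f"subscript {s} outside {low}..{high}")
        # Skip over the elements selected by the earlier subscripts.
        offset = offset * (high - low + 1) + (s - low)
    return w * offset
```

For A[6] in ARRAY[1..10] this gives 15, and for B[2, 5] in ARRAY[0..3, 1..6] it gives 3 * 16 = 48, matching the element offset 2 * 6 + 4 = 16.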

98


Code Generation for Array<br />

References (1/2)<br />

A: ARRAY[1..10] of INTEGER<br />

…<br />

A[I] := 5<br />

(1) - I #1 i1<br />

(2) * i1 #3 i2<br />

(3) := #5 A[i2]<br />

99


Code Generation for Array<br />

References (2/2)<br />

B: ARRAY[0..3, 1..6] of INTEGER<br />

…<br />

B[I, J] := 5<br />

(1) * I #6 i1<br />

(2) - J #1 i2<br />

(3) + i1 i2 i3<br />

(4) * i3 #3 i4<br />

(5) := #5 B[i4]<br />

100


Machine-Independent<br />

Code Optimization<br />

• Common subexpression<br />

–These are subexpressions that appear at more than one point in<br />

the program and that compute the same value.<br />

–Common subexpressions are usually detected through the<br />

analysis of the intermediate form of the program.<br />

• Loop invariants<br />

–These are subexpressions within a loop whose values do not<br />

change from one iteration of the loop to the next.<br />

–Their values can be computed once before the loop is entered,<br />

rather than being recalculated for each iteration.<br />

• Reduction in strength of an operation<br />
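Detection of common subexpressions in the intermediate form might be sketched as follows, for a single basic block. The sketch assumes quadruples are tuples `(operation, op1, op2, result)` and that intermediate results such as i1 are assigned only once; the function name is hypothetical.

```python
def eliminate_common_subexpressions(quads):
    """Drop quadruples that recompute a value already available."""
    available = {}   # (operation, op1, op2) -> name that already holds the value
    alias = {}       # eliminated intermediate -> name used in its place
    out = []
    for op, a, b, result in quads:
        a, b = alias.get(a, a), alias.get(b, b)
        key = (op, a, b)
        if op != ":=" and key in available:
            alias[result] = available[key]     # reuse the earlier result
            continue
        out.append((op, a, b, result))
        alias.pop(result, None)
        # Assigning to `result` invalidates entries computed from it or held in it.
        available = {k: v for k, v in available.items()
                     if result not in k[1:] and v != result}
        if op != ":=" and result not in (a, b):
            available[key] = result
    return out
```

The invalidation step matters: once an operand is reassigned, an expression that looks identical no longer computes the same value and must be kept.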

101


102


Common Subexpression Elimination<br />

103


Loop Invariant Elimination<br />

104


Reduction in Strength of Operations<br />

105


Code Optimization<br />

• Some optimization can be obtained by rewriting the<br />

source program, e.g.,<br />

T1 := 2 * J;<br />

T2 := T1 - 1;<br />

FOR I := 1 TO 10 DO<br />

X[I, T2] := Y[I, T1]<br />

• However, this would achieve only a part of the<br />

benefits of code optimization.<br />

• An optimizing compiler should allow the programmer<br />

to write source code that is clear and easy to read, and<br />

it should compile such a program into machine code<br />

that is efficient to execute.<br />

106


Static Storage Allocation<br />

[Figure: statically allocated storage for System, Main, and SUB, each with a return-address slot (RETADR), shown across the calls from Main to SUB and from SUB to itself]<br />

107


Dynamic Storage Allocation<br />

[Figure: dynamic (stack) allocation. Each activation record on the stack holds the variables for Main or SUB together with RETADR, NEXT, and PREV fields, and B points to the current record. Shown for Main alone and after Main calls SUB]<br />

108


[Figure: dynamic (stack) allocation, continued. Shown with two activation records for SUB above the record for Main (after SUB calls itself) and with one record for SUB (after that call returns)]<br />

109<br />


Block-Structured Languages<br />

PROCEDURE A;<br />

VAR X, Y, Z: INTEGER;<br />

PROCEDURE B;<br />

VAR W, X, Y: REAL;<br />

PROCEDURE C;<br />

VAR V, W: INTEGER;<br />

END {C};<br />

END {B};<br />

PROCEDURE D;<br />

VAR X, Z: CHAR;<br />

END {D};<br />

END {A};<br />

Block Name   Block Number   Block Level   Surrounding Block<br />

A            1              1             -<br />

B            2              2             1<br />

C            3              3             2<br />

D            4              2             1<br />
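The block table on this slide is what lets the compiler decide which declaration a name refers to: start in the current block and follow the surrounding-block links outward. A minimal sketch, with the table and declarations transcribed from the slide and all Python names hypothetical:

```python
# block name -> (block number, block level, surrounding block number)
BLOCKS = {"A": (1, 1, None), "B": (2, 2, 1), "C": (3, 3, 2), "D": (4, 2, 1)}

# block number -> names declared directly in that block
DECLARED = {1: {"X", "Y", "Z"}, 2: {"W", "X", "Y"}, 3: {"V", "W"}, 4: {"X", "Z"}}

SURROUNDING = {number: outer for number, _level, outer in BLOCKS.values()}

def resolve(name, block_number):
    """Return the number of the block whose declaration of `name` is visible."""
    current = block_number
    while current is not None:
        if name in DECLARED[current]:
            return current
        current = SURROUNDING[current]     # look in the enclosing block
    raise NameError(f"{name} is not declared in any enclosing block")
```

Inside C, W resolves to C's own declaration, X to the one in B, and Z to the one in A. Inside D, V cannot be resolved at all because C is not an enclosing block.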

110


Compiler<br />

•Basic Compiler Functions<br />

•Machine-Dependent Compiler Features (5.2)<br />

•Machine-Independent Compiler Features (5.3)<br />

•Compiler Design Options (5.4)<br />

111


Compiler Design Options<br />

•Division into passes<br />

•Interpreter<br />

•P-Code compiler<br />

•Compiler-Compilers<br />

112


Division into Passes<br />

•In some languages, the declaration of an<br />

identifier may appear after it has been used in the<br />

program. (forward reference)<br />

113


Interpreters<br />

•The interpreters execute a version of the source<br />

program directly, instead of translating it into<br />

machine code.<br />

•The advantage is in the debugging facilities.<br />

114


P-Code Compilers<br />

•The main advantage is portability.<br />

Source Program -> P-code Compiler (compile) -> Object Program (P-code) -> P-code Interpreter (execute)<br />

115


Compiler-Compilers<br />

Lexical rules, grammar, and semantic routines -> Compiler-compiler -> Compiler (scanner, parser, code generator)<br />

116


Summary<br />

•Basic Compiler Functions<br />

•Machine-Dependent Compiler Features<br />

•Machine-Independent Compiler Features<br />

•Compiler Design Options<br />

117


Another Example<br />

118


Three-Address Code<br />

119


Flow Graph<br />

120


Local Common Subexpression<br />

Elimination<br />

121


Global Common Subexpression<br />

Elimination<br />

122


Copy Propagation<br />

• Improve the code in B5 by eliminating x:<br />

x := t3<br />

a[t2] := t5<br />

a[t4] := t3<br />

goto B2<br />

• The idea is to use g for f, wherever possible after the<br />

copy statement<br />

f:=g<br />

• This may not appear to be an improvement, but it gives<br />

us the opportunity to eliminate the assignment to x.<br />

123


Dead-Code Elimination<br />

•A variable is live at a point in a program if its<br />

value can be used subsequently; otherwise, it is<br />

dead (or useless) at that point.<br />

•Copy propagation followed by dead-code<br />

elimination removes the assignment to x:<br />

a[t2] := t5<br />

a[t4] := t3<br />

goto B2<br />
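The two steps can be sketched on block B5. The sketch assumes each statement is a tuple `(dest, uses, fmt)`, that B5 originally ends with `a[t4] := x`, and that x is not live on exit from the block; these are assumptions for illustration, as are all the function names.

```python
COPY = "{dest} := {0}"

def propagate_copies(block):
    """After a copy f := g, use g in place of f in later statements."""
    copies, out = {}, []
    for dest, uses, fmt in block:
        uses = tuple(copies.get(u, u) for u in uses)
        if dest is not None:
            # A new definition of dest invalidates copies that involve it.
            copies = {f: g for f, g in copies.items() if dest not in (f, g)}
            if fmt == COPY:
                copies[dest] = uses[0]
        out.append((dest, uses, fmt))
    return out

def eliminate_dead_code(block, live_out):
    """Drop assignments to variables that are dead after the assignment."""
    live, out = set(live_out), []
    for dest, uses, fmt in reversed(block):
        if dest is not None and dest not in live:
            continue                       # the value is never used
        live.discard(dest)
        live.update(uses)
        out.append((dest, uses, fmt))
    return out[::-1]

def render(block):
    return [fmt.format(*uses, dest=dest) for dest, uses, fmt in block]

B5 = [("x", ("t3",), COPY),
      (None, ("t2", "t5"), "a[{0}] := {1}"),
      (None, ("t4", "x"), "a[{0}] := {1}"),
      (None, (), "goto B2")]
```

Copy propagation alone leaves `x := t3` in place but removes the last use of x. The backward liveness pass then finds x dead at that point and drops the assignment.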

124


Loop Optimizations<br />

• The running time of a program may be improved if we<br />

decrease the number of instructions in an inner loop.<br />

• Three techniques are important for loop optimization:<br />

–Code motion<br />

• Moves code outside a loop<br />

–Reduction in strength<br />

• Replaces an expensive operation by a cheaper one<br />

–Induction-variable elimination<br />

• Eliminates variable from the inner loop<br />

125


Strength Reduction<br />

126


Induction-Variable Elimination<br />

induction variables<br />

induction variables<br />

127


Code Optimization Result<br />

128
