Phases of Compiler Design
In this class, we discuss the phases of compiler design.
The reader should have prior knowledge of the introduction to the compiler.
We take an example to understand the phases of a compiler.
Example:
float a = 2.2, b = 5.5, c = 6;
a = b + c * 60;
The compiler converts the high-level language program to machine-level code.
The program passes through different phases during the conversion.
The diagram below shows all the phases and the symbol table.
First Phase:
The first phase is the lexical analysis phase.
In the lexical analysis phase, the source program is read character by character and separated into keywords, identifiers, symbols, etc.
We call the separated keywords, identifiers, etc., tokens.
The output of the lexical analysis phase is tokens.
During the conversion, the lexical analysis phase uses a symbol table.
In the symbol table, we maintain identifiers, the identifier type, and other attributes related to identifiers.
We discuss the complete details of the symbol table later.
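As a rough illustration of the idea, a symbol table can be sketched as a dictionary keyed by identifier name. The function names `declare` and `lookup` and the `entry` attribute are assumptions made for this sketch; a real compiler stores more attributes (scope, memory location, and so on).

```python
# Hypothetical symbol table for the declaration: float a = 2.2, b = 5.5, c = 6;
symbol_table = {}

def declare(name, type_name):
    """Record a newly declared identifier; reject a second declaration."""
    if name in symbol_table:
        raise NameError(f"{name!r} is already declared")
    # "entry" is an assumed attribute: the position of the identifier in the table.
    symbol_table[name] = {"type": type_name, "entry": len(symbol_table) + 1}

def lookup(name):
    """Return the attributes of an identifier; fail if it was never declared."""
    if name not in symbol_table:
        raise NameError(f"{name!r} is not declared")
    return symbol_table[name]

for identifier in ("a", "b", "c"):
    declare(identifier, "float")

print(lookup("c"))   # {'type': 'float', 'entry': 3}
```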
In our example, we use a single line a = b + c * 60.
The above diagram shows the tokens identified in the lexical analysis phase.
The lexical analysis phase is also called the scanning phase.
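The scanning step can be sketched with regular expressions. This is a minimal sketch, not a complete lexer: the token category names (`NUMBER`, `ID`, `ASSIGN`, `OP`) are assumptions chosen for this example, and keywords, comments, and punctuation such as `;` are left out.

```python
import re

# Hypothetical token categories for this sketch; a real lexer has many more.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),   # integer or decimal literal
    ("ID",     r"[A-Za-z_]\w*"),    # identifier
    ("ASSIGN", r"="),
    ("OP",     r"[+\-*/]"),
    ("SKIP",   r"\s+"),             # whitespace separates tokens but is not one
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Scan the source left to right and return (kind, text) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

for kind, text in tokenize("a = b + c * 60"):
    print(kind, text)
```

For the example line, the sketch yields seven tokens: three identifiers, one assignment symbol, two operators, and one number.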
Second Phase:
The second phase is the syntax analysis phase.
We give the tokens as input to the syntax analysis phase.
The syntax analysis phase is also called the parsing phase.
The output generated by the syntax analysis phase is the syntax tree or parse tree.
The parse tree helps detect syntax errors.
The expression a = b + c * 60 will evaluate c * 60 first.
In this expression, the * operator has higher precedence than + and =.
The syntax tree reflects the order of evaluation given by the precedence and associativity rules.
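One way to see how precedence shapes the tree is a small recursive-descent parser, sketched below. The grammar, the function names, and the tuple representation `(operator, left, right)` are assumptions for this illustration; the tokens are plain strings to keep the sketch short.

```python
def parse_assignment(tokens):
    """assignment -> ID '=' expr"""
    if len(tokens) < 3 or tokens[1] != "=":
        raise SyntaxError("expected: identifier = expression")
    tree, pos = parse_expr(tokens, 2)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return ("=", tokens[0], tree)

def parse_expr(tokens, pos):
    """expr -> term (('+' | '-') term)*   (lower precedence, left-associative)"""
    left, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = parse_term(tokens, pos + 1)
        left = (op, left, right)
    return left, pos

def parse_term(tokens, pos):
    """term -> factor (('*' | '/') factor)*   (higher precedence, left-associative)"""
    left, pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("*", "/"):
        op = tokens[pos]
        right, pos = parse_factor(tokens, pos + 1)
        left = (op, left, right)
    return left, pos

def parse_factor(tokens, pos):
    """factor -> ID | NUMBER"""
    if pos >= len(tokens) or tokens[pos] in ("+", "-", "*", "/", "="):
        raise SyntaxError("expected an identifier or a number")
    return tokens[pos], pos + 1

print(parse_assignment("a = b + c * 60".split()))
# ('=', 'a', ('+', 'b', ('*', 'c', '60')))
```

Because `parse_term` handles * below `parse_expr`, the subtree for c * 60 sits deeper in the tree than the +, so it is evaluated first. A missing operand, as in a = b +, raises a syntax error.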
Third Phase:
Semantic Analysis:
The semantic analysis phase checks the meaning of the program.
It checks that the program satisfies the conditions imposed by the language.
Example:
1) The identifiers used in the expression are declared before use.
2) Automatic type conversion is applied where needed.
In our example, in the semantic analysis phase, the integer value 60 is automatically converted to float.
The input is the syntax tree generated in the second phase.
The output is the syntax tree with the required modifications.
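Both checks can be sketched as one walk over the syntax tree. This is a simplified illustration: the tree is written as nested tuples, `SYMBOLS` stands in for the symbol table, and `inttofloat` is an assumed name for the conversion node. Only int-to-float widening is handled.

```python
# Assumed declarations from the example: float a, b, c.
SYMBOLS = {"a": "float", "b": "float", "c": "float"}

def analyze(node):
    """Return (annotated_node, type); raise NameError for undeclared identifiers."""
    if isinstance(node, str):
        if node[0].isdigit():                       # numeric literal
            return node, ("float" if "." in node else "int")
        if node not in SYMBOLS:
            raise NameError(f"identifier {node!r} is not declared")
        return node, SYMBOLS[node]
    op, left, right = node
    left, left_type = analyze(left)
    right, right_type = analyze(right)
    if left_type == "float" and right_type == "int":
        # Widen the int operand so both sides have the same type.
        right, right_type = ("inttofloat", right), "float"
    elif left_type == "int" and right_type == "float":
        if op == "=":
            raise TypeError("cannot assign float to int without a cast")
        left, left_type = ("inttofloat", left), "float"
    return (op, left, right), left_type

tree = ("=", "a", ("+", "b", ("*", "c", "60")))
annotated, result_type = analyze(tree)
print(annotated)
# ('=', 'a', ('+', 'b', ('*', 'c', ('inttofloat', '60'))))
```

The only change to the tree is the new conversion node wrapped around 60.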
Fourth Phase:
Intermediate Code Generator:
The syntax tree produced by the semantic analysis phase is the input to the intermediate code generator.
We generate the intermediate code in this phase.
The intermediate code is not the final code.
It resembles machine instructions but is not tied to any particular machine.
There are many formats for intermediate code.
We use three-address code, the format most compilers follow.
The above diagram shows the intermediate code.
The intermediate code is machine-independent; here, machine means the processor.
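Three-address code can be produced by a post-order walk of the tree, creating one temporary per operation. The sketch below assumes the tree is given as nested tuples with an `inttofloat` conversion node; the temporary names t1, t2, ... and the function name `generate_tac` are illustrative.

```python
def generate_tac(tree):
    """Return a list of three-address instructions for an assignment tree."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def visit(node):
        if isinstance(node, str):            # identifier or literal: already an address
            return node
        if node[0] == "inttofloat":
            operand = visit(node[1])
            temp = new_temp()
            code.append(f"{temp} = inttofloat({operand})")
            return temp
        op, left, right = node
        left_addr = visit(left)
        right_addr = visit(right)
        if op == "=":
            code.append(f"{left_addr} = {right_addr}")
            return left_addr
        temp = new_temp()
        code.append(f"{temp} = {left_addr} {op} {right_addr}")
        return temp

    visit(tree)
    return code

tree = ("=", "a", ("+", "b", ("*", "c", ("inttofloat", "60"))))
for line in generate_tac(tree):
    print(line)
# t1 = inttofloat(60)
# t2 = c * t1
# t3 = b + t2
# a = t3
```

Each instruction has at most one operator on the right-hand side, which is what gives three-address code its name.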
Fifth Phase:
Machine-Independent Code Optimization:
In our example, optimization reduces the four lines of machine-independent code to two lines.
Optimization means transforming the code into an equivalent but more efficient form.
The optimization phase is optional.
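The reduction from four lines to two can be sketched with two simple passes. The instruction format `(dest, op, arg1, arg2)`, the four input instructions, and the helper names are assumptions for this illustration. The second pass also assumes each temporary is used only once, which holds for this example but not in general.

```python
# Assumed intermediate code for a = b + c * 60, as (dest, op, arg1, arg2).
code = [
    ("t1", "inttofloat", "60", None),
    ("t2", "*", "c", "t1"),
    ("t3", "+", "b", "t2"),
    ("a", "copy", "t3", None),
]

def is_temp(name):
    """Assumed naming convention: temporaries look like t1, t2, ..."""
    return name[0] == "t" and name[1:].isdigit()

def optimize(code):
    # Pass 1: do inttofloat on integer literals at compile time and
    # substitute the resulting constant into later instructions.
    constants = {}
    folded = []
    for dest, op, arg1, arg2 in code:
        arg1 = constants.get(arg1, arg1)
        arg2 = constants.get(arg2, arg2)
        if op == "inttofloat" and arg1.isdigit():
            constants[dest] = str(float(int(arg1)))   # "60" -> "60.0"
            continue
        folded.append((dest, op, arg1, arg2))
    # Pass 2: merge "t = x op y" followed directly by "d = t" into "d = x op y".
    result = []
    for dest, op, arg1, arg2 in folded:
        if op == "copy" and result and result[-1][0] == arg1 and is_temp(arg1):
            _, prev_op, prev1, prev2 = result.pop()
            result.append((dest, prev_op, prev1, prev2))
        else:
            result.append((dest, op, arg1, arg2))
    return result

for dest, op, arg1, arg2 in optimize(code):
    print(f"{dest} = {arg1} {op} {arg2}")
# t2 = c * 60.0
# a = b + t2
```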
Sixth Phase:
Code Generator:
The code generator phase will generate the machine-dependent code.
If we are using the x86 architecture, we have a specific set of instructions.
We generate the code according to that instruction set.
The above code shows the machine-level code.
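The mapping from intermediate code to target instructions can be sketched as follows. The instruction names (LDF, MULF, ADDF, STF) and registers R1, R2, ... belong to a hypothetical register machine used only for illustration; they are not real x86 instructions. The input is an assumed two-instruction optimized form of the example, and the sketch assumes an unlimited supply of registers and binary operations only.

```python
OPCODES = {"+": "ADDF", "-": "SUBF", "*": "MULF", "/": "DIVF"}   # hypothetical mnemonics

def is_temp(name):
    """Assumed naming convention: temporaries look like t1, t2, ..."""
    return name[0] == "t" and name[1:].isdigit()

def generate_target(code):
    """Translate (dest, op, arg1, arg2) instructions into target instructions."""
    asm = []
    in_register = {}      # temporary name -> register that holds its value
    count = 0

    def load(name):
        nonlocal count
        if name in in_register:
            return in_register[name]
        count += 1
        reg = f"R{count}"
        source = f"#{name}" if name[0].isdigit() else name   # constant or variable
        asm.append(f"LDF {reg}, {source}")
        return reg

    for dest, op, arg1, arg2 in code:
        target = load(arg1)
        if arg2 in in_register:
            source = in_register[arg2]
        elif arg2[0].isdigit():
            source = f"#{arg2}"               # constants are used as immediates
        else:
            source = load(arg2)
        asm.append(f"{OPCODES[op]} {target}, {target}, {source}")
        if is_temp(dest):
            in_register[dest] = target        # keep temporaries in registers
        else:
            asm.append(f"STF {dest}, {target}")   # store program variables to memory
    return asm

optimized = [("t1", "*", "c", "60.0"), ("a", "+", "b", "t1")]
for line in generate_target(optimized):
    print(line)
# LDF R1, c
# MULF R1, R1, #60.0
# LDF R2, b
# ADDF R2, R2, R1
# STF a, R2
```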
Seventh Phase:
Machine-Dependent Optimizer:
This phase optimizes the machine-dependent code.
The optimizer phases are optional.
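One common technique at this stage is peephole optimization, which inspects a few adjacent instructions at a time. The sketch below removes a load that immediately follows a store of the same variable from the same register. The LDF and STF mnemonics are from a hypothetical instruction set, and the sketch assumes no jump targets the removed load.

```python
def peephole(asm):
    """Drop 'LDF R, x' when the previous instruction is 'STF x, R'."""
    result = []
    for line in asm:
        if result and line.startswith("LDF ") and result[-1].startswith("STF "):
            store_var, store_reg = [part.strip() for part in result[-1][4:].split(",")]
            load_reg, load_var = [part.strip() for part in line[4:].split(",")]
            if store_var == load_var and store_reg == load_reg:
                continue          # the register already holds the value
        result.append(line)
    return result

before = ["STF a, R1", "LDF R1, a", "ADDF R1, R1, R2"]
print(peephole(before))
# ['STF a, R1', 'ADDF R1, R1, R2']
```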
Note: Every phase uses the symbol table.