(club 75)|| SIZ-EDUCATION|| HOW COMPUTER PROGRAMS (CODE) ARE COMPILED by @christnenye
Hello everyone, welcome back to my blog. It's really nice to write today.
We are all familiar with programming, the activity from which computer programs are created, but many of us are not familiar with how these programs run to give the desired output. Most computer programs are written in a high-level language (English-like), but the computer only understands machine language, which is binary (0 and 1).
So today I will be writing about the compilation of computer programs from a high-level programming language to machine language.
A compiler is software that converts source code, written in a high-level language, into object code, that is, machine language. Compilation is therefore the process by which the compiler converts a high-level programming language into a low-level one (machine language).
During compilation, the compiler passes the source code through a sequence of phases or stages. Each phase does a specific job on the code on the way to the desired machine code (0 and 1).
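HOW DOES THE COMPILER COMPILE COMPUTER PROGRAMS?
The compiler works in phases, where the output of one phase becomes the input of the next. The diagram below illustrates this.
[Diagram: the phases of a compiler, from lexical analyzer through syntax analyzer, semantic analyzer, intermediate code generator and code optimizer to code generator. Designed with Pixel Lab]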
The compilation process starts when a user types in a set of code (the source code). The code then passes through a sequence of phases, starting with:
LEXICAL ANALYZER
Here the compiler scans the source code, much like scanning a document. It scans the source to discover what are called lexemes (a lexeme is a single unit of consecutive source characters with no spaces in it). It then places each lexeme in the token it belongs to.
Suppose the source code declared is Sum = X + Y + 1;
The individual terms here are the lexemes, and each one belongs to a token: "Sum" belongs to the token "identifier", "=" belongs to "assignment operator", "X" and "Y" belong to "identifier", "+" belongs to "operator", "1" belongs to "constant", and ";" marks the end of the statement.
The output of this phase is the list of tokens, which will be the input of the next phase.
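The scanning step described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the regular expressions, the function name tokenize and the token names "operator" and "end of statement" are made up for this example, and a real compiler's lexer handles far more cases.

```python
import re

# Token names follow the article; the patterns themselves are assumptions.
TOKEN_SPEC = [
    ("identifier", r"[A-Za-z_]\w*"),
    ("constant", r"\d+(?:\.\d+)?"),
    ("assignment operator", r"="),
    ("operator", r"\+"),
    ("end of statement", r";"),
    ("skip", r"\s+"),
]
# One named group per token class; group g0 maps to TOKEN_SPEC[0], and so on.
MASTER = re.compile("|".join(f"(?P<g{i}>{p})" for i, (_, p) in enumerate(TOKEN_SPEC)))


def tokenize(source):
    """Split the source text into (token, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        kind = TOKEN_SPEC[int(match.lastgroup[1:])][0]
        if kind != "skip":  # spaces separate lexemes but are not tokens
            tokens.append((kind, match.group()))
        pos = match.end()
    return tokens


for token, lexeme in tokenize("Sum = X + Y + 1;"):
    print(f"{lexeme!r:6} -> {token}")
```

Each lexeme is printed next to the token it was placed in, and a character that fits no token (for example "$") raises an error instead of being silently skipped.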
SYNTAX ANALYZER
The compiler takes the tokens produced by lexical analysis and checks their arrangement against the grammar of the programming language, token by token. If the arrangement is correct it generates what is called a parse tree, and if it is wrong it generates a syntax error.
The source code given is Sum = X + Y + 1; and the token sequence generated is:
identifier = identifier + identifier + constant ;
If the tokens are not arranged the way the grammar requires, for example two operators in a row, a syntax error is generated.
The output of the syntax analyzer, the parse tree, is the input of the next phase, semantic analysis.
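To make the grammar check concrete, here is a small Python sketch of a parser for this one kind of statement. The grammar (identifier = operand + operand ... ;), the function name parse and the nested-tuple shape of the parse tree are all assumptions I chose for illustration, not how any particular compiler does it.

```python
def parse(tokens):
    """Check (token, lexeme) pairs against: identifier = operand (+ operand)* ;

    Returns a parse tree as nested tuples, or raises SyntaxError.
    """
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else "end of input"

    def take(*kinds):
        nonlocal pos
        if peek() not in kinds:
            raise SyntaxError(f"expected {' or '.join(kinds)}, found {peek()}")
        token = tokens[pos]
        pos += 1
        return token

    target = take("identifier")
    take("assignment operator")
    tree = take("identifier", "constant")
    while peek() == "operator":
        take("operator")
        right = take("identifier", "constant")
        tree = ("+", tree, right)  # additions group from left to right
    take("end of statement")
    if pos != len(tokens):
        raise SyntaxError("unexpected tokens after the end of the statement")
    return ("=", target, tree)


tokens = [
    ("identifier", "Sum"), ("assignment operator", "="),
    ("identifier", "X"), ("operator", "+"),
    ("identifier", "Y"), ("operator", "+"),
    ("constant", "1"), ("end of statement", ";"),
]
print(parse(tokens))
```

With the tokens in the right order the function returns the tree; if, say, an operator directly follows "=", it raises a SyntaxError, which mirrors the syntax error described above.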
SEMANTIC ANALYZER
The compiler takes the parse tree and checks that it follows the meaning rules of the programming language, such as the rule that the values being combined and assigned must be of compatible data types. If the rules are followed, it generates an annotated parse tree, which is the syntax analyzer's parse tree with type information attached.
Suppose the source code Sum = X + Y + 1 is given the values Sum = 2 + 2 + 1;
Sum = 2 (integer) + 2 (integer) + 1 (integer);
The compiler checks whether the numbers to be summed are of the same data type; here they are all integers. It then generates the annotated parse tree, which is the input of the next phase, the intermediate code generator.
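The type check can be sketched in Python as a function that walks the parse tree and attaches a type to every node. This is a simplified illustration: the tuple tree format, the symbols dictionary of declared types and the strict "types must match exactly" rule are my assumptions, and real languages usually allow some automatic conversions.

```python
def annotate(node, symbols):
    """Return a copy of the parse tree with a data type appended to each node."""
    kind = node[0]
    if kind == "identifier":
        name = node[1]
        if name not in symbols:
            raise TypeError(f"undeclared identifier {name!r}")
        return ("identifier", name, symbols[name])
    if kind == "constant":
        return ("constant", node[1], "float" if "." in node[1] else "integer")
    if kind in ("+", "="):
        left = annotate(node[1], symbols)
        right = annotate(node[2], symbols)
        if left[-1] != right[-1]:  # the type is always the last element
            raise TypeError(f"type mismatch: {left[-1]} {kind} {right[-1]}")
        return (kind, left, right, left[-1])
    raise ValueError(f"unknown node kind {kind!r}")


tree = (
    "=",
    ("identifier", "Sum"),
    ("+", ("+", ("identifier", "X"), ("identifier", "Y")), ("constant", "1")),
)
declared = {"Sum": "integer", "X": "integer", "Y": "integer"}
print(annotate(tree, declared))
```

When every operand is an integer the tree comes back annotated with "integer" at each node; declaring X as "float" instead makes the function raise a TypeError.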
INTERMEDIATE CODE GENERATOR
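Here, the compiler transforms the annotated parse tree from the semantic analyzer into an intermediate representation known as three-address code, in which each instruction uses at most three addresses (two operands and one result).
If a user assigns the values "2.44 + 2 + 1" to the source code Sum = X + Y + 1, the operands are no longer of the same data type, so the compiler has to insert a type conversion before it can sum them. In many languages (C, for example) the integers are converted to floating point for the addition, and the result is converted to the declared type of Sum when it is stored.
In this phase the three-address code is generated, which is the input of the next phase, the code optimizer.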
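As a rough sketch of what three-address code looks like for Sum = X + Y + 1, the Python below walks a parse tree and writes one instruction per addition, storing partial results in temporaries. The tree format, the function name to_three_address and the temporary names t1, t2 are assumptions made for this example.

```python
def to_three_address(tree):
    """Turn ("=", target, expression) into a list of three-address instructions."""
    code = []
    counter = 0

    def emit(node):
        nonlocal counter
        if node[0] in ("identifier", "constant"):
            return node[1]  # leaves are used directly as operands
        left = emit(node[1])
        right = emit(node[2])
        counter += 1
        temp = f"t{counter}"
        code.append(f"{temp} = {left} + {right}")
        return temp

    _, target, expression = tree
    result = emit(expression)
    code.append(f"{target[1]} = {result}")
    return code


tree = (
    "=",
    ("identifier", "Sum"),
    ("+", ("+", ("identifier", "X"), ("identifier", "Y")), ("constant", "1")),
)
for line in to_three_address(tree):
    print(line)
# t1 = X + Y
# t2 = t1 + 1
# Sum = t2
```

Notice that no line has more than one operator on its right-hand side, which is the defining property of three-address code.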
CODE OPTIMIZER
Here the compiler performs what is called optimization on the three-address code, that is, it removes unnecessary lines of code so that the program runs faster and uses less CPU time. The optimized code is the input of the final phase, the code generator.
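One simple example of removing an unnecessary line is getting rid of a temporary that is only copied into another variable. The Python sketch below does just that; the function name and the string format of the instructions are my assumptions, and it also assumes each temporary (t1, t2, ...) is used only once, which real optimizers have to verify.

```python
def remove_redundant_copies(code):
    """Merge 'tN = expr' followed by 'x = tN' into the single line 'x = expr'."""
    optimized = []
    for line in code:
        dest, expr = [part.strip() for part in line.split("=", 1)]
        if optimized:
            prev_dest, prev_expr = [p.strip() for p in optimized[-1].split("=", 1)]
            is_temp = prev_dest.startswith("t") and prev_dest[1:].isdigit()
            if expr == prev_dest and is_temp:
                # The previous line only fed this copy, so fold the two together.
                optimized[-1] = f"{dest} = {prev_expr}"
                continue
        optimized.append(line)
    return optimized


before = ["t1 = X + Y", "t2 = t1 + 1", "Sum = t2"]
for line in remove_redundant_copies(before):
    print(line)
# t1 = X + Y
# Sum = t1 + 1
```

The program still computes the same value for Sum, but with two instructions instead of three.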
CODE GENERATOR
The three-address code is converted to assembly language. In a simplified assembly language it looks like this:
LOAD R1, X --------- load X into register 1
LOAD R2, Y --------- load Y into register 2
LOAD R3, 1 --------- load 1 into register 3
ADD R1, R1, R2 ----- add register 1 and register 2 and keep the result in register 1
ADD R1, R1, R3 ----- add register 1 and register 3 and keep the result in register 1
STORE Sum, R1 ------ store the value in register 1 into Sum
The assembly code above is then translated by the assembler into its equivalent binary representation, which is machine language (0 and 1).
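The step from three-address code to assembly can also be sketched in Python. This is a deliberately naive version under my own assumptions: it uses only two registers, stores every result back to memory, and the LOAD, ADD and STORE mnemonics belong to the simplified assembly used in this post, not to a real processor.

```python
def generate_assembly(code):
    """Translate 'dest = a + b' style instructions into simplified assembly."""
    assembly = []
    for line in code:
        dest, expr = [part.strip() for part in line.split("=", 1)]
        operands = [part.strip() for part in expr.split("+")]
        assembly.append(f"LOAD R1, {operands[0]}")
        for operand in operands[1:]:
            assembly.append(f"LOAD R2, {operand}")
            assembly.append("ADD R1, R1, R2")  # R1 = R1 + R2
        assembly.append(f"STORE {dest}, R1")
    return assembly


for instruction in generate_assembly(["t1 = X + Y", "Sum = t1 + 1"]):
    print(instruction)
```

A real code generator would keep t1 in a register instead of storing and reloading it, which is one reason register allocation is a big part of this phase.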
Humans write in high-level languages because they are less complex than machine code and easier to understand. The only language the computer understands is binary digits, so it helps to understand what happens behind the scenes. Every phase has its own output, which is the input of the next phase.
I am @christnenye, thanks for your time.