Writing a simple Compiler on my own - Revisit Queue and Parameter Checking (part 4) [C][Flex][Bison]

in utopian-io •  11 months ago 

[Custom Thumbnail]
All the Code of the series can be found at the Github repository: https://github.com/drifter1/compiler

Introduction

    Hello it's a me again @drifter1! Today we continue with my Compiler Series, a series where we implement a complete compiler for a simple C-like language by using the C-tools Flex and Bison. In this article we will get into the final part of Parameter checking, where we will be using the code/functions/changes of the previous parts in the action rules of the parser!

The topics that we will cover today are:

  1. Function call rule
  2. Function declaration rules
  3. Managing the print function

Requirements:

    Actually you need to read and understand all the topics that I covered in the series as a whole, as these articles will give you access to knowledge about:
  • What Compiler Design is (mainly the steps)
  • For which exact Language the Compiler is build for (Tokens and Grammar)
  • How to use Flex and Bison
  • How to implement a lexer and parser for the language using those tools
  • What the Symbol Table is and how we implement it
  • How we combine Flex and Bison together
  • How we can pass information from the Lexer to the Parser
  • How we define operator priorities, precedencies and associativity
  • What Semantic Analysis is (Attributes, SDT etc.)
  • How we do the so called "Scope Resolution"
  • How we declare types and check the type inter-compatibility for different cases, including function parameters
  • How we check function calls later on using a "Revisit queue", as function declarations mostly happen after functions get used (in our Language)
  • Intermediate Code Representations, including AST's
  • How we implement an AST (structure and management)
  • Action Rules for AST nodes and more

Difficulty:

Talking about the series in general this series can be rated:
  • Intermediate to Advanced
Today's topic(s) can be rated:
  • Medium
So, without further ado, let's now finally start with the actual Tutorial...

Actual Tutorial Content

Function call rule

    Creating the needed code and functions for the parameter checking is not enough. This code also has to be used in the respective parts of the parser, and so inside of the correct action rules. Let's start with the most important part, which is managing the revisit queue entry from the "function-call"-rule.

The revisit queue information is represented by the diagram:

Parameter information

    In this rule we also manage the AST node of type "Func_Call", which stores the function identifier (symbol table entry) and the calling parameters in form of expression nodes (params), together with their count (num_of_pars). To create this node we use the information stored inside of the "call_params"-rule, or better said it's "AST_Node_Call_Params" node, which get's type-casted to a variable "temp" of that type. The needed parameter information is stored inside of that variable, on so we don't have to re-type-cast the AST node of the "function_call"-rule (stored in "$$"). So, getting the parameter information is a piece of cake!

Revisit Queue entry

    Getting the revisit queue entry (that might exist), where the information of the function calls get's stored to be revisited later on, is also quite easy! We implemented a search queue function, that returns such an entry or NULL if it doesn't exist. So, the first important part is distinguishing these two cases:
  1. Entry exists => Add parameter information to revisit queue entry
  2. Entry doesn't exist => This means that the function was declared before and so perform parameter check
    When the entry exists, we will also have to distinguish two cases for the memory allocation. We will perform a initial allocation (malloc) when the number of calls is zero and a reallocation (realloc) when the number of calls is not zero. The rest is the same, as we just have to add the info, using the "expression_data_type" function to get the datatype of each parameter-expression.

    When the entry doesn't exist, meaning that the function was declared previously, we should also check if the function was REALLY declared! This can be done easily checking if the st_type of the entry is "FUNCTION_TYPE", or by checking if the inf_type is not "UNDEF". Let's do the first of them :)

The action code

    Following all that was mentioned previously, the action code of the "function_call"-rule becomes:
function_call: ID LPAREN call_params RPAREN
{
	AST_Node_Call_Params *temp = (AST_Node_Call_Params*) $3;
	$$ = new_ast_func_call_node($1, temp->params, temp->num_of_pars);
/* add information to revisit queue entry (if one exists) */ revisit_queue *q = search_queue($1->st_name); if(q != NULL){ /* setup structures */ if(q->num_of_calls == 0){ /* first call */ q->par_types = (int**) malloc(sizeof(int*)); q->num_of_pars = (int*) malloc(sizeof(int)); } else{ /* general case */ q->par_types = (int**) realloc(q->par_types, (q->num_of_calls + 1) * sizeof(int*)); q->num_of_pars = (int*) realloc(q->num_of_pars, (q->num_of_calls + 1) * sizeof(int)); }
/* add info of function call */ q->num_of_pars[q->num_of_calls] = temp->num_of_pars; q->par_types[q->num_of_calls] = (int*) malloc(temp->num_of_pars * sizeof(int)); /* get the types of the parameters */ int i; for(i = 0; i < temp->num_of_pars; i++){ /* get datatype of parameter-expression */ q->par_types[q->num_of_calls][i] = expression_data_type(temp->params[i]); }
/* increment number of calls */ q->num_of_calls++; } else{ /* function declared before call */ if($1->st_type == FUNCTION_TYPE){ /* check number of parameters */ if($1->num_of_pars != temp->num_of_pars){ fprintf(stderr, "Function call of %s has wrong num of parameters!\n", $1->st_name); exit(1); } /* check if parameters are compatible */ int i; for(i = 0; i < temp->num_of_pars; i++){ /* type of parameter in function declaration */ int type_1 = expression_data_type(temp->params[i]);
/* type of parameter in function call*/ int type_2 = $1->parameters[i].par_type;
/* check compatibility for function call */ get_result_type(type_1, type_2, NONE); /* error occurs automatically in the function */ } } } } ;

Function declaration rules

    The function declaration rule "function" is split into some parts. The actual changes needed for parameter checking and the revisit queue will be done in the "function" and "function_head" rules. In the first we will just perform the revisit using the "revisit" function. In the other rule we will just use the flag variable "function_decl" instead of "declare" to distinguish the cases of function declarations and "normal" declarations, which is what we covered in the previous part. This is also the place where we should set the datatype of the entry, but this is something that we already did in previous articles! So, let's directly dive into the action rules:
function: { incr_scope(); } function_head function_tail
{ 
	/* perform revisit */
	revisit(temp_function->entry->st_name);
hide_scope(); $$ = (AST_Node *) temp_function; } ;
function_head: { function_decl = 1; } return_type ID LPAREN { function_decl = 0;
AST_Node_Ret_Type *temp = (AST_Node_Ret_Type *) $2; temp_function = (AST_Node_Func_Decl *) new_ast_func_decl_node(temp->ret_type, temp->pointer, $3); temp_function->entry->st_type = FUNCTION_TYPE; temp_function->entry->inf_type = temp->ret_type; } parameters_optional RPAREN { if($6 != NULL){ AST_Node_Decl_Params *temp = (AST_Node_Decl_Params *) $6;
temp_function->entry->parameters = temp->parameters; temp_function->entry->num_of_pars = temp->num_of_pars; } else{ temp_function->entry->parameters = NULL; temp_function->entry->num_of_pars = 0; } } ;

Managing the print function

    Doing all that is enough for the most functions, but we still have to manage our "custom" print function. Without the management of this specific function, we would end up with an undefined function in the symbol table entry and a unchecked entry of the revisit queue. The parameters of the print function can be a variable, string or whole expression, and we are just printing out the value each time, and so there is clearly no need to do any parameter check. :) So, we just have to remove the print entry from the revisit queue, using "search_prev_queue" function, and also declare the function type of the function "print", which is clearly "VOID_TYPE".

The code inside of the "main" function looks like this:
...
/* remove print from revisit queue */ revisit_queue *q = search_prev_queue("print"); if(q == NULL){ /* special case: first entry */ if(queue != NULL){ /* check if queue not empty */ queue = queue->next; } } else{ q->next = q->next->next; }

/* if still not empty -> Warning */ if(queue != NULL){ printf("Warning: Something has not been checked in the revisit queue!\n"); }

/* declare function type of "print" */ func_declare("print", VOID_TYPE, 1, NULL);
...


    Finishing off, the symbol table dump file will now mention that the function is of type "void" like that:


Having no revisit queue error, the revisit queue dump file should now be empty:

RESOURCES

References:

No references, just using code that I implemented in my previous articles.

Images:

All of the images are custom-made!

Previous parts of the series

General Knowledge and Lexical Analysis

Syntax Analysis

  • Syntax Analysis Theory
  • Bison basics
  • Creating a grammar for our Language
  • Combine Flex and Bison
  • Passing information from Lexer to Parser
  • Finishing Off The Grammar/Parser (part 1)
  • Finishing Off The Grammar/Parser (part 2)
  • Semantic Analysis (1)

  • Semantic Analysis Theory
  • Semantics Examples
  • Scope Resolution using the Symbol Table
  • Type Declaration and Checking
  • Function Semantics (part 1)
  • Function Semantics (part 2)
  • Intermediate Code Generation (AST)

  • Abstract Syntax Tree Principle
  • Abstract Syntax Tree Structure
  • Abstract Syntax Tree Management
  • Action Rules for Declarations and Initializations
  • Action Rules for Expressions
  • Action Rules for Assignments and Simple Statements
  • Action Rules for If-Else Statements
  • Action Rules for Loop Statements and some Fixes
  • Action Rules for Function Declarations (part 1)
  • Action Rules for Function Declarations (part 2)
  • Action Rules for Function Calls
  • Semantic Analysis (2)

  • Datatype attribute for Expressions
  • Type Checking for Assignments
  • Revisit Queue and Parameter Checking (part 1)
  • Revisit Queue and Parameter Checking (part 2)
  • Revisit Queue and Parameter Checking (part 3)

  • Final words | Next up on the project

         And this is actually it for today's post! I hope that I explained everything as much as I needed to, meaning that you learned something out of it.
    Next up on this series are:
    • Semantic analysis (using even more action rules in Bison)
    • Machine Code generation (MIPS Assembly)
         Which are all topics that will need more than one article to complete. Also, note that we might also get into Optimizations later on, or could even extend the Language by adding complex datatypes (structs and unions), more rules etc.
    So, see ya next time!

    GitHub Account:

    https://github.com/drifter1

    Keep on drifting! ;)
    Authors get paid when people like you upvote their post.
    If you enjoyed what you read here, create your account today and start earning FREE STEEM!
    Sort Order:  

    Thank you for your contribution.
    Aside from few minor spelling mistakes, overall job well done.

    Your contribution has been evaluated according to Utopian policies and guidelines, as well as a predefined set of questions pertaining to the category.

    To view those questions and the relevant answers related to your post, click here.


    Need help? Write a ticket on https://support.utopian.io/.
    Chat with us on Discord.
    [utopian-moderator]

    Thank you for your review, @mcfarhat! Keep up the good work!

    Congratulations @drifter1! You have completed the following achievement on the Steem blockchain and have been rewarded with new badge(s) :

    You made more than 10000 upvotes. Your next target is to reach 11000 upvotes.

    Click here to view your Board
    If you no longer want to receive notifications, reply to this comment with the word STOP

    Do not miss the last post from @steemitboard:

    Christmas Challenge - The party continues

    Support SteemitBoard's project! Vote for its witness and get one more award!

    Awesome article buddy, need to check out your other stuff also it seems. Upvoted and resteemed!

    Hey, @drifter1!

    Thanks for contributing on Utopian.
    We’re already looking forward to your next contribution!

    Get higher incentives and support Utopian.io!
    Simply set @utopian.pay as a 5% (or higher) payout beneficiary on your contribution post (via SteemPlus or Steeditor).

    Want to chat? Join us on Discord https://discord.gg/h52nFrV.

    Vote for Utopian Witness!

    Hi @drifter1!

    Your post was upvoted by @steem-ua, new Steem dApp, using UserAuthority for algorithmic post curation!
    Your post is eligible for our upvote, thanks to our collaboration with @utopian-io!
    Feel free to join our @steem-ua Discord server