Writing a simple Compiler on my own - Revisit Queue and Parameter Checking (part 1) [C][Flex][Bison]
[Custom Thumbnail]
All the Code of the series can be found at the Github repository: https://github.com/drifter1/compiler
Introduction
Hello it's a me again @drifter1! Today we continue with my Compiler Series, a series where we implement a complete compiler for a simple C-like language by using the C-tools Flex and Bison. In this article we will visualize function calls, and how we use the Revisit Queue concept to solve the problem of knowing the return type and if the parameters are actually compatible with the parameter declaration, that happens much later on! The actual implementation will come in the next parts...The topics that we will cover today are:
- Revisit Queue Concept
- Visualizing the Problem
Requirements:
Actually you need to read and understand all the topics that I covered in the series as a whole, as these articles will give you access to knowledge about:- What Compiler Design is (mainly the steps)
- For which exact Language the Compiler is build for (Tokens and Grammar)
- How to use Flex and Bison
- How to implement a lexer and parser for the language using those tools
- What the Symbol Table is and how we implement it
- How we combine Flex and Bison together
- How we can pass information from the Lexer to the Parser
- How we define operator priorities, precedencies and associativity
- What Semantic Analysis is (Attributes, SDT etc.)
- How we do the so called "Scope Resolution"
- How we declare types and check the type inter-compatibility for different cases, including function parameters
- How we check function calls later on using a "Revisit queue", as function declarations mostly happen after functions get used (in our Language)
- Intermediate Code Representations, including AST's
- How we implement an AST (structure and management)
- Action Rules for AST nodes and more
Difficulty:
Talking about the series in general this series can be rated:- Intermediate to Advanced
- Medium
Actual Tutorial Content
Revisit Queue Concept
One of the most difficult aspects of the language that we are trying to implement a compiler for is that we are allowed to declare the functions after using them in function calls! In the actual C language you have to declare the function before-hand, and you can define the actual code of the function later on. So, how should we check different kinds of semantics later on? Well, in the function semantics articles, I talked about a concept called "revisit queue". This concept is all about adding elements that should be revisited later on to a queue (or list), and revisiting the queue right after the information that was needed is accessible.In functions, the topics that need to be revisited are:
- parameter compatibility checking
- return datatype for assignment statements
What we want to cover right now is the parameter check!
Visualizing the Problem
After this "recap" of the revisit queue concept, let's now get into a visualization! Functions are described by identifiers (IDs) and so stored inside of a symbol table entry. A function call is one of statement types (when returning void) and can also be used as a part of a larger or smaller expression (when used in assignments). But, either way a function call will occur inside of the statements and maybe even inside of the function declarations part. The function declarations are the final part of each program. In general, we can say that a program looks like this:As you can see, the parameter check takes it's information from the function call and function declaration. But, what exactly get's stored to the entries of the symbol table and revisit queue, while the whole process is in effect? The following diagram shows that:
- The first occurrence (mostly function call) creates the symbol table entry
- A revisit queue entry will be created, as we are not declaring (declare = 0)
- Each function call adds it's information (parameter datatypes and number of parameters) to the rq entry
- The function declaration will add the datatype to the common symbol table entry (can be quite difficult to implement) and trigger the revisit
- During the revisit we will perform the parameter check
Symbol table entry
The symbol table entry stores the information around the function declaration. This information defines the semantics of the function and is used for semantic checks in anything that has to do with functions and mainly function calls. The important pieces of information, that have to do with functions, can be visualized as:Revisit queue entry
The revisit queue entry stores the information around function calls, which means that we store the datatype of each parameter and the actual count of them, for each function call that has to do with a specific function. At first the entry get's created, when creating the symbol table entry, by knowing that we will have to perform a revisit (not currently declaring). That way the whole structure can be visualized as:A complete diagram
The whole problem can be summarized like this:I just added some more specific information to the diagram of the entries. This information includes some of the needed functions and their parameters, to give you a better understanding of how we will implement it later on. I hope that this whole article gave you a great insight to the problem :)
RESOURCES
References:
No references, just using code that I implemented in my previous articles.Images:
All of the images are custom-made!Previous parts of the series
- Introduction -> What is a compiler, what you have to know and what you will learn
- A simple C Language -> Simplified C, comparison with C, tokens, basic structure
- Lexical Analysis using Flex -> Theory, Regular Expressions, Flex, Lexer
- Symbol Table (basic structure) ->Why Symbol Tables, Basic Implementation
- Using Symbol Table in the Lexer -> Flex and Symbol Table combination
- Syntax Analysis Theory -> Syntax Analysis, grammar types and parsing
- Bison basics -> Bison tutorial actually
- Creating a grammar for our Language -> Grammar and first Parser
- Combine Flex and Bison -> lexer and parser combined
- Passing information from Lexer to Parser -> Bug Fix, yylval variable, YYSTYPE union, Setting up the Parser, Passing information "directly" and through the symbol table (special case) with examples.
- Finishing Off The Grammar/Parser (part 1) -> Adding the missing grammar rules, small fixes
- Finishing Off The Grammar/Parser (part 2) -> Operator priorities, precedencies and associativity, complete example file (for testing), grammar rule visualization, finish off grammar/parser
- Semantic Analysis Theory -> What is Semantic Analysis about, CSG and CSL, Attribute Grammars, Syntax Directed Translations (SDT)
- Semantics Examples -> Visualization of Semantics for different rules/cases, needed attributes, what we have to implement from now on
- Scope Resolution using the Symbol Table -> What we have now, scope resolution, integration, compiler output
- Type Declaration and Checking -> Type declaration and type checking code, "simple" code integration
- Function Semantics (part 1) -> Code for parameter definitions, function declarations and parameter compatibility checking
- Function Semantics (part 2) -> Revisit queue concept, basic implementation, inserting undeclared identifiers
- Abstract Syntax Tree Principle -> Intermediate code generation and representations, how to design an AST
- Abstract Syntax Tree Structure -> Node, operator and value types, AST node structures, some management functions, integration with the rest of the compiler
- Abstract Syntax Tree Management -> Tweaking Nodes, Node creation and AST traversal functions, "manual" testing of the AST
- Action Rules for Declarations and Initializations -> Replacing many values with a Value union, Other smaller changes, Action rules for declarations, Action rules for initializations
- Action Rules for Expressions -> Separating operator tokens, Action rules for expressions, running the compiler
- Action Rules for Assignments and Simple Statements -> Action rules for simple statements, Action rules for assignments, running the compiler
- Action Rules for If-Else Statements -> Statements-node, Action rules for “simple” and “complicated” if-else statements, running the compiler
- Action Rules for Loop Statements and some Fixes -> Some fixes, Action rules of while statements, Action rules of for statements, running the compiler
- Action Rules for Function Declarations (part 1) -> Small Tweaks and Fixes, Visualizing the Problem, The new AST nodes and creation functions, Action Rules for Declarations
- Action Rules for Function Declarations (part 2) -> Action Rules for Function declarations, Action Rules for a Function declaration Action Rules for Parameters, Traversing after parsing is done using "program", Compiler validation using examples
- Action Rules for Function Calls -> Visualizing the Problem,AST nodes and creation functions, Action Rules for Function Calls, Running the compiler
- Datatype attribute for Expressions -> Visualizing the Problem, Parameter datatype, Changes in AST Nodes and creation functions, running the compiler
- Type Checking for Assignments -> Simplify pointer and array rules, advanced type error message, assignment type check, some stuff around parameters
Final words | Next up on the project
And this is actually it for today's post! I hope that I explained everything as much as I needed to, meaning that you learned something out of it.Next up on this series are:
- Semantic analysis (using even more action rules in Bison)
- Machine Code generation (MIPS Assembly)
So, see ya next time!
GitHub Account:
https://github.com/drifter1Keep on drifting! ;)
Thank you for your contribution @drifter1.
The section where you put links from previous tutorials, it may be ideal to decrease the amount of links.
Their print screens are quite original and well exemplified. Good job!!!
As usual to bring good tutorials and very well explained. Thanks for your hard work building this tutorial.
Your contribution has been evaluated according to Utopian policies and guidelines, as well as a predefined set of questions pertaining to the category.
To view those questions and the relevant answers related to your post, click here.
Need help? Write a ticket on https://support.utopian.io/.
Chat with us on Discord.
[utopian-moderator]
Thank you for your review, @portugalcoin! Keep up the good work!
Hey, @drifter1!
Thanks for contributing on Utopian.
We’re already looking forward to your next contribution!
Get higher incentives and support Utopian.io!
Simply set @utopian.pay as a 5% (or higher) payout beneficiary on your contribution post (via SteemPlus or Steeditor).
Want to chat? Join us on Discord https://discord.gg/h52nFrV.
Vote for Utopian Witness!
Hi @drifter1!
Your post was upvoted by @steem-ua, new Steem dApp, using UserAuthority for algorithmic post curation!
Your UA account score is currently 3.502 which ranks you at #6383 across all Steem accounts.
Your rank has improved 54 places in the last three days (old rank 6437).
In our last Algorithmic Curation Round, consisting of 214 contributions, your post is ranked at #187.
Evaluation of your UA score:
Feel free to join our @steem-ua Discord server