
C4 compiler

Disclaimer: I worked on this project as part of a group, and the work was split evenly among the team members.

Motivation

One thing led to another, and I realized that I had never dabbled in anything related to compilers. Then, during the first semester of my Master’s studies, I was lucky enough to take a course called “Compiler Construction.”

The Project

A compiler needs a lexer, a parser, an abstract syntax tree (AST), some intermediate representation, and a translation of that representation into code for some backend. The parsing phase performs the syntactic analysis, so most syntax errors are caught there. Since the parser is also responsible for constructing the tree, the semantic analysis of the code has to be carried out at this stage as well.
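To make the front-end stages concrete, here is a minimal sketch of a lexer and a recursive-descent parser that builds an abstract syntax tree and reports syntax errors. It is not our project's code: the toy grammar (integers, identifiers, `+`, `*`, parentheses), the tuple-based tree, and every name in it (`lex`, `Parser`, `parse`) are assumptions made up for illustration.

```python
import re


def lex(source):
    """Split the source text into (kind, value) tokens."""
    tokens = []
    for text in re.findall(r"\d+|[A-Za-z_]\w*|\S", source):
        if text.isdecimal():
            tokens.append(("num", int(text)))
        elif text[0].isalpha() or text[0] == "_":
            tokens.append(("id", text))
        else:
            tokens.append(("op", text))
    tokens.append(("eof", None))  # sentinel so the parser never runs off the end
    return tokens


class Parser:
    """Recursive-descent parser: one method per grammar rule."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def parse_expr(self):
        # expr := term ('+' term)*
        node = self.parse_term()
        while self.peek() == ("op", "+"):
            self.advance()
            node = ("add", node, self.parse_term())
        return node

    def parse_term(self):
        # term := atom ('*' atom)*
        node = self.parse_atom()
        while self.peek() == ("op", "*"):
            self.advance()
            node = ("mul", node, self.parse_atom())
        return node

    def parse_atom(self):
        # atom := number | identifier | '(' expr ')'
        kind, value = self.advance()
        if kind == "num":
            return ("num", value)
        if kind == "id":
            return ("var", value)
        if (kind, value) == ("op", "("):
            node = self.parse_expr()
            if self.advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")


def parse(source):
    """Lex and parse a whole expression, rejecting trailing input."""
    parser = Parser(lex(source))
    tree = parser.parse_expr()
    if parser.peek()[0] != "eof":
        raise SyntaxError(f"unexpected token {parser.peek()[1]!r}")
    return tree


tree = parse("1 + 2 * x")
```

Because `parse_term` is called from inside `parse_expr`, multiplication binds tighter than addition without any explicit precedence table. A real C front end would also attach source positions to each node so that errors can point at the offending line.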

These are the most tedious parts of a compiler: a lot of grunt work and testing. The real fun starts once we have the AST and convert it into an intermediate form such as static single assignment (SSA) form. From there, we can run many different optimization passes. As part of the project, we had to implement the SCCP (sparse conditional constant propagation) optimization on LLVM IR. Since LLVM IR is already in SSA form, it was relatively easy to map our types to LLVM types and run the pass there.
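The idea behind SCCP can be sketched on a toy SSA program. This is an illustration under loud assumptions, not our LLVM pass: the tuple-based instruction format, the block names, and the function `sccp` are invented here, and the sketch uses a naive repeat-until-stable loop instead of the sparse worklists of the real algorithm. What it does keep is the core of the technique: a three-level lattice (unknown, constant, varying) combined with tracking which control-flow edges are actually reachable.

```python
TOP = "top"        # no evidence yet about the value
BOTTOM = "bottom"  # value may differ from run to run


def meet(a, b):
    """Combine two lattice values flowing into the same phi node."""
    if a == TOP:
        return b
    if b == TOP:
        return a
    return a if a == b else BOTTOM


def sccp(blocks, entry):
    """blocks maps a block name to (instructions, terminator)."""
    value = {}             # SSA name -> TOP, BOTTOM, or an int constant
    exec_edges = set()     # control-flow edges proven reachable so far
    exec_blocks = {entry}  # blocks proven reachable so far

    def get(name):
        return value.get(name, TOP)

    changed = True
    while changed:
        changed = False
        for name, (instrs, term) in blocks.items():
            if name not in exec_blocks:
                continue  # never evaluate code that is not known reachable
            for op, dest, *args in instrs:
                if op == "const":
                    new = args[0]
                elif op == "param":
                    new = BOTTOM  # unknown run-time input
                elif op == "phi":
                    new = TOP
                    for pred, src in args[0].items():
                        if (pred, name) in exec_edges:  # ignore dead edges
                            new = meet(new, get(src))
                elif op == "add":
                    a, b = get(args[0]), get(args[1])
                    if BOTTOM in (a, b):
                        new = BOTTOM
                    elif TOP in (a, b):
                        new = TOP
                    else:
                        new = a + b
                else:
                    raise ValueError(f"unknown instruction {op!r}")
                if new != get(dest):
                    value[dest] = new
                    changed = True
            # Follow only the branch targets that can actually be taken.
            if term[0] == "br":
                succs = [term[1]]
            elif term[0] == "cbr":
                cond = get(term[1])
                if cond == TOP:
                    succs = []
                elif cond == BOTTOM:
                    succs = [term[2], term[3]]
                else:
                    succs = [term[2] if cond != 0 else term[3]]
            else:  # "ret"
                succs = []
            for succ in succs:
                if (name, succ) not in exec_edges:
                    exec_edges.add((name, succ))
                    exec_blocks.add(succ)
                    changed = True
    return value, exec_blocks


# Hypothetical program: the branch condition is the constant 1, so only
# then_bb can run, and the phi in join sees a single incoming value.
PROGRAM = {
    "entry": ([("const", "x", 1), ("const", "c", 1)],
              ("cbr", "c", "then_bb", "else_bb")),
    "then_bb": ([("const", "y1", 2)], ("br", "join")),
    "else_bb": ([("param", "y2")], ("br", "join")),
    "join": ([("phi", "y", {"then_bb": "y1", "else_bb": "y2"}),
              ("add", "z", "x", "y")],
             ("ret", "z")),
}

value, reachable = sccp(PROGRAM, "entry")
```

On this program the analysis concludes that `z` is the constant 3 and that `else_bb` is never reached. A constant propagation that ignored branch conditions would have to merge `y1` with the unknown `y2` and give up on `y` and `z`, which is exactly the case the "conditional" part of SCCP is meant to handle.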


Writing a first optimization pass and getting it to work is quite thrilling. It felt like seeing all the hard work pay off! But implementing a compiler is a delicate process. The golden rule is that a compiler may shrink and optimize the code, for either size or speed, but only without changing its semantics. Thus, most of the reasoning in compiler design goes into making sure the user's code still behaves as written. As a result, behind each compiler pass is a rigorous mathematical argument that the transformation preserves the behavior of the user's program.

In the end, we had a working toy compiler. As a bonus, I sharpened my technical and problem-solving skills while writing, debugging, and designing different parts of the project.