
Type checking of infix operators in compiler

I'm writing a compiler (in Haskell) and in the grammar of the language there are rules to add infix operators (addition is used as an example):

EAdd . Expr ::= Expr "+" Expr

which means that EAdd is an expression consisting of an expression, the string "+", and another expression.

Parser returns abstract syntax tree (AST):

data Expr = ... | EAdd Expr Expr

I want to write a typechecker that checks that function calls are given arguments of the correct types.

Note that "+" is a function that takes two integers and returns an integer. Other operators are similar.

So far I have come up with three approaches to typechecking EAdd; all of them involve adding "+" as a function to the initial symbol table:

  1. Declare that infix plus is syntactic sugar for calling the function "+" with two arguments. Put a desugarer, which converts the parser's AST into another data type (one without EAdd), between the parser and the typechecker.

  2. (Similar to the first.) Declare that infix plus is syntactic sugar, but have the desugarer use the same AST data type. The typechecker returns an error when it is given an EAdd.

  3. Inline the desugarer into the typechecker, similar to this:

     ... typecheck (EAdd a b) = typecheck (ECall infixPlus [a, b]) ...
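As a rough illustration, the first approach might look like the sketch below. Everything except EAdd and ECall is an assumption made for the example: the other constructors, the Core type, the Type representation, and the association-list symbol table are hypothetical, not taken from the actual compiler.

```haskell
-- Surface AST as produced by the parser. Only EAdd and ECall come from the
-- question; the remaining constructors are assumed for illustration.
data Expr
  = EInt Integer
  | EBool Bool
  | EVar String
  | ECall String [Expr]
  | EAdd Expr Expr
  deriving (Show, Eq)

-- Desugared AST (hypothetical): same shape, but with no EAdd constructor,
-- so the typechecker cannot be handed an infix node by mistake.
data Core
  = CInt Integer
  | CBool Bool
  | CVar String
  | CCall String [Core]
  deriving (Show, Eq)

-- Infix plus becomes an ordinary call to the function named "+".
desugar :: Expr -> Core
desugar (EInt n) = CInt n
desugar (EBool b) = CBool b
desugar (EVar x) = CVar x
desugar (ECall f args) = CCall f (map desugar args)
desugar (EAdd a b) = CCall "+" [desugar a, desugar b]

data Type = TInt | TBool | TFun [Type] Type
  deriving (Show, Eq)

-- Symbol table as a simple association list (an assumption; a Map would
-- be the usual choice in a real compiler).
type Env = [(String, Type)]

-- "+" is registered like any other function.
initialEnv :: Env
initialEnv = [("+", TFun [TInt, TInt] TInt)]

typecheck :: Env -> Core -> Either String Type
typecheck _ (CInt _) = Right TInt
typecheck _ (CBool _) = Right TBool
typecheck env (CVar x) =
  maybe (Left ("unbound variable: " ++ x)) Right (lookup x env)
typecheck env (CCall f args) =
  case lookup f env of
    Just (TFun params result) -> do
      argTypes <- mapM (typecheck env) args
      if argTypes == params
        then Right result
        else Left ("wrong argument types in call to " ++ f)
    Just _ -> Left (f ++ " is not a function")
    Nothing -> Left ("unbound function: " ++ f)
```

With this split, typecheck has a single rule for calls, and the operator's type lives only in the symbol table.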

Note that this applies to all binary infix operators (other arithmetic, boolean, and comparison operators).

It seems that the first approach is the correct one. But it means that later in the compiler pipeline, particularly in the code generator, those ECalls have to be handled as special cases, because in the compiler's output (LLVM, in my case) these functions are supposed to be inlined, unlike ordinary function calls. In other words, the codegen needs a list of functions whose calls are handled differently from other function calls.
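To make the concern concrete, here is a minimal sketch of what such a special-case list could look like in the code generator. The Core type, the builtins table, the fixed i32 type, and the %tN temporary naming are all assumptions for the example; the output is LLVM-style instruction text, not a complete module.

```haskell
import Data.List (intercalate)

-- Assumed shape of the desugared AST reaching the code generator.
data Core = CInt Integer | CVar String | CCall String [Core]

-- Hypothetical table of functions that are emitted as a single LLVM
-- instruction instead of a call.
builtins :: [(String, String)]
builtins = [("+", "add"), ("-", "sub"), ("*", "mul")]

-- Takes the next free temporary index and returns the updated index,
-- the emitted instructions, and the operand that holds the result.
gen :: Int -> Core -> (Int, [String], String)
gen n (CInt i) = (n, [], show i)
gen n (CVar x) = (n, [], "%" ++ x)
gen n (CCall f args) =
  let (n', code, ops) = genArgs n args
      tmp = "%t" ++ show n'
      instr = case lookup f builtins of
        -- Built-in binary operator: emit the instruction inline.
        Just op | [a, b] <- ops -> tmp ++ " = " ++ op ++ " i32 " ++ a ++ ", " ++ b
        -- Anything else: emit an ordinary call.
        _ -> tmp ++ " = call i32 @" ++ f ++ "("
               ++ intercalate ", " (map ("i32 " ++) ops) ++ ")"
   in (n' + 1, code ++ [instr], tmp)

-- Generates code for each argument in order, threading the counter.
genArgs :: Int -> [Core] -> (Int, [String], [String])
genArgs n [] = (n, [], [])
genArgs n (a : as) =
  let (n1, c1, o1) = gen n a
      (n2, c2, os) = genArgs n1 as
   in (n2, c1 ++ c2, o1 : os)
```

The special casing is confined to one table lookup, so adding another operator means adding one entry to builtins rather than another codegen rule.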

What is the best approach to this issue?

UPD

Here is how a similar issue is handled in Haskell (from https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/Renamer ):

... renamer does the following things:

  • Sort out fixities. The parser parses all infix applications as left-associative, regardless of fixity. For example "a + b * c" is parsed as "(a + b) * c". The renamer re-associates such nested operator applications, using the fixities declared in the module.
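The re-association step described in that quote can be sketched roughly as follows. This is not GHC's implementation, just a small precedence-climbing pass under simplifying assumptions: the Expr type and the fixity table are hypothetical, parentheses are not modelled, and the parser is assumed to produce a purely left-nested tree.

```haskell
data Expr = Var String | BinOp String Expr Expr
  deriving (Show, Eq)

data Assoc = L | R
  deriving (Eq)

-- Hypothetical fixity table; a real compiler would read declared fixities.
fixity :: String -> (Int, Assoc)
fixity "^" = (8, R)
fixity "*" = (7, L)
fixity "+" = (6, L)
fixity _ = (9, L)

-- Turn the left-nested tree into its first operand followed by
-- (operator, operand) pairs in source order.
flatten :: Expr -> (Expr, [(String, Expr)])
flatten (BinOp op l r) =
  let (h, rest) = flatten l
   in (h, rest ++ [(op, r)])
flatten e = (e, [])

-- Precedence climbing: consume operators whose precedence is at least
-- minPrec, building the right operand recursively.
climb :: Int -> Expr -> [(String, Expr)] -> (Expr, [(String, Expr)])
climb minPrec lhs ((op, rhs) : rest)
  | prec >= minPrec =
      let nextMin = if assoc == L then prec + 1 else prec
          (rhs', rest') = climb nextMin rhs rest
       in climb minPrec (BinOp op lhs rhs') rest'
  where
    (prec, assoc) = fixity op
climb _ lhs rest = (lhs, rest)

-- Rebuild the tree according to the fixity table.
reassociate :: Expr -> Expr
reassociate e =
  let (h, ops) = flatten e
   in fst (climb 0 h ops)
```

Applied to the tree for "(a + b) * c" as the parser built it, reassociate yields the tree for "a + (b * c)".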

LLVM supports function attributes that control inlining, e.g.

define void @f() alwaysinline { ... }

so one option is to treat + as a normal function call and let LLVM do its optimization job.
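For instance, the compiler could emit a small prelude of operator functions marked alwaysinline and then generate plain calls to them. The helper below is a hypothetical sketch that renders one such definition as text; the function name, the i32 type, and the register names are assumptions for the example, and the call only disappears if LLVM's inlining pass actually runs.

```haskell
-- Renders an LLVM definition of a binary i32 function marked alwaysinline.
-- `name` is the LLVM function name and `instr` the instruction to wrap,
-- e.g. "plus" and "add". Both are chosen by the caller.
inlineBinOp :: String -> String -> String
inlineBinOp name instr =
  unlines
    [ "define i32 @" ++ name ++ "(i32 %a, i32 %b) alwaysinline {"
    , "  %r = " ++ instr ++ " i32 %a, %b"
    , "  ret i32 %r"
    , "}"
    ]
```

With this in place the code generator would need no special list: a call to the addition function is emitted like any other call.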
