Type checking of infix operators in a compiler

I'm writing a compiler (in Haskell), and the grammar of the language has rules for infix operators (addition is used as an example):

EAdd . Expr ::= Expr "+" Expr

This means that EAdd is an expression consisting of an expression, the string "+", and another expression.

The parser returns an abstract syntax tree (AST):

data Expr = ... | EAdd Expr Expr

I want to write a typechecker that checks that function calls are given arguments of the correct types.

Note that "+" is a function that takes two integers and returns an integer. Other operators are similar.

So far I have come up with three approaches to typechecking EAdd; all of them involve adding "+" as a function to the initial symbol table:

  1. Declare that infix plus is syntactic sugar for calling the function "+" with two arguments. Put a "desugarizer" between the parser and the typechecker; it converts the parser's AST into another data type (one without EAdd).

  2. (Similar to the first.) Declare that infix plus is syntactic sugar, but have the desugarizer use the same AST data type. The typechecker returns an error when it is given EAdd.

  3. Inline the "desugarizer" into the typechecker, similar to this:

     ... typecheck (EAdd a b) = typecheck (ECall infixPlus [a, b]) ...

Note that all binary infix operators are subject to this (other arithmetic operations, boolean operations, comparison operators).
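To make the first approach concrete, here is a minimal sketch under several assumptions: the constructors EInt, EBool and ELt, the Core data type, the Sig record of parameter and result types, and the association-list symbol table are all hypothetical and not taken from the actual compiler. Only EAdd and ECall come from the description above.

```haskell
-- Surface AST as produced by the parser (only EAdd and ECall are from the post).
data Expr
  = EInt Integer
  | EBool Bool
  | EAdd Expr Expr
  | ELt Expr Expr
  | ECall String [Expr]
  deriving (Show, Eq)

-- Core AST: infix constructors are gone, only calls remain.
data Core
  = CInt Integer
  | CBool Bool
  | CCall String [Core]
  deriving (Show, Eq)

data Type = TInt | TBool deriving (Show, Eq)

-- A function signature: parameter types and result type.
data Sig = Sig [Type] Type

-- Symbol table as a simple association list (an assumption for brevity).
type Env = [(String, Sig)]

-- Operators are ordinary entries in the initial symbol table.
initialEnv :: Env
initialEnv =
  [ ("+", Sig [TInt, TInt] TInt)
  , ("<", Sig [TInt, TInt] TBool)
  ]

-- The desugarizer turns every infix node into a call.
desugar :: Expr -> Core
desugar (EInt n)       = CInt n
desugar (EBool b)      = CBool b
desugar (EAdd a b)     = CCall "+" [desugar a, desugar b]
desugar (ELt a b)      = CCall "<" [desugar a, desugar b]
desugar (ECall f args) = CCall f (map desugar args)

-- The typechecker has a single rule for calls and knows nothing about operators.
typecheck :: Env -> Core -> Either String Type
typecheck _ (CInt _)  = Right TInt
typecheck _ (CBool _) = Right TBool
typecheck env (CCall f args) =
  case lookup f env of
    Nothing -> Left ("unknown function: " ++ f)
    Just (Sig params ret) -> do
      argTys <- mapM (typecheck env) args
      if argTys == params
        then Right ret
        else Left ("bad arguments in call to " ++ f)
```

With this layout, `typecheck initialEnv (desugar (EAdd (EInt 1) (EInt 2)))` goes through the same code path as any user-defined function call, and the type system guarantees that no EAdd can reach the typechecker.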

It seems that the first approach is the correct one. But it means that later in the compiler pipeline, particularly in the code generator, those ECalls have to be handled as special cases, because in the compiler's output (in my case, LLVM) these functions are supposed to be inlined (unlike usual function calls). It means that codegen has a list of functions whose calls are handled differently from other function calls.
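The "list of functions handled differently" could look roughly like the sketch below. It is only an illustration: the Core type, the builtins table, the %tN register naming, and the assumption that every value is an i32 are all hypothetical. The generator emits LLVM IR as plain text, using a single instruction for known operators and a regular call for everything else.

```haskell
import Data.List (intercalate)

-- Desugared AST (hypothetical): integer literals and calls only.
data Core = CInt Integer | CCall String [Core]

-- Built-in operators and the LLVM instruction that implements each one.
builtins :: [(String, String)]
builtins = [("+", "add"), ("-", "sub"), ("*", "mul")]

-- gen n e returns (next free register number, emitted instructions,
-- operand that holds the value of e).
gen :: Int -> Core -> (Int, [String], String)
gen n (CInt i) = (n, [], show i)
gen n (CCall f args) =
  let (n', code, ops) = genArgs n args
      reg = "%t" ++ show n'
      instr = case (lookup f builtins, ops) of
        -- Special case: a binary built-in becomes a single instruction.
        (Just op, [a, b]) -> reg ++ " = " ++ op ++ " i32 " ++ a ++ ", " ++ b
        -- General case: an ordinary function call.
        _ -> reg ++ " = call i32 @" ++ f ++ "("
               ++ intercalate ", " (map ("i32 " ++) ops) ++ ")"
  in (n' + 1, code ++ [instr], reg)

-- Generate code for the arguments left to right, threading the counter.
genArgs :: Int -> [Core] -> (Int, [String], [String])
genArgs n [] = (n, [], [])
genArgs n (a : as) =
  let (n1, c1, o1) = gen n a
      (n2, c2, os) = genArgs n1 as
  in (n2, c1 ++ c2, o1 : os)
```

For `CCall "f" [CCall "+" [CInt 1, CInt 2], CInt 3]` this produces `%t0 = add i32 1, 2` followed by `%t1 = call i32 @f(i32 %t0, i32 3)`. The special-casing is confined to one table lookup, so adding an operator means adding one table entry.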

What is the best approach to this issue?

UPD

How a similar issue is handled in Haskell (from https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/Renamer ):

... renamer does the following things:

  • Sort out fixities. The parser parses all infix applications as left-associative, regardless of fixity. For example "a + b * c" is parsed as "(a + b) * c". The renamer re-associates such nested operator applications, using the fixities declared in the module.
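The re-association step described in that quote can be illustrated with a small sketch. This is not GHC's actual implementation; it uses ordinary precedence climbing over a flattened operator chain. The Expr type, the fixity table, and the function names are hypothetical, and the sketch assumes the input has no parenthesised sub-expressions (a real AST would need a separate node for those). Unlike GHC, it also does not reject chains that mix associativities at the same precedence.

```haskell
data Expr = Var String | BinOp String Expr Expr deriving (Show, Eq)

data Assoc = AssocL | AssocR deriving (Eq)

-- Hypothetical fixity table; the values mirror Haskell's standard fixities.
fixity :: String -> (Int, Assoc)
fixity "+" = (6, AssocL)
fixity "*" = (7, AssocL)
fixity "^" = (8, AssocR)
fixity _   = (9, AssocL)

-- Flatten the left-nested tree from the parser into a head operand
-- followed by (operator, operand) pairs in source order.
flatten :: Expr -> (Expr, [(String, Expr)])
flatten (BinOp op l r) =
  let (h, rest) = flatten l
  in (h, rest ++ [(op, r)])
flatten e = (e, [])

-- Rebuild the tree according to the fixity table.
reassociate :: Expr -> Expr
reassociate e =
  let (h, rest) = flatten e
  in fst (climb 0 h rest)

-- climb minPrec lhs rest consumes operators whose precedence is at
-- least minPrec and returns the resulting tree plus the unused input.
climb :: Int -> Expr -> [(String, Expr)] -> (Expr, [(String, Expr)])
climb minPrec lhs ((op, rhs) : rest)
  | prec >= minPrec =
      let nextMin = if assoc == AssocL then prec + 1 else prec
          (rhs', rest') = climb nextMin rhs rest
      in climb minPrec (BinOp op lhs rhs') rest'
  where
    (prec, assoc) = fixity op
climb _ lhs rest = (lhs, rest)
```

Given the parser's output for "a + b * c", that is `BinOp "*" (BinOp "+" (Var "a") (Var "b")) (Var "c")`, `reassociate` returns `BinOp "+" (Var "a") (BinOp "*" (Var "b") (Var "c"))`.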

LLVM supports inline attributes, e.g.

define void @f() alwaysinline { ... }

so one option is to treat + as a normal function call and let LLVM do its optimization job.
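As a rough sketch of that option, the compiler could emit one small alwaysinline wrapper per operator and then generate ordinary calls to it. The helper name defineBuiltin, the mangled function name plus, and the choice of i32 operands are assumptions made for illustration only.

```haskell
-- Emit LLVM IR text for a binary i32 operator wrapped in a function
-- marked alwaysinline. The function name and the instruction mnemonic
-- are supplied by the caller.
defineBuiltin :: String -> String -> String
defineBuiltin name instr = unlines
  [ "define i32 @" ++ name ++ "(i32 %a, i32 %b) alwaysinline {"
  , "  %r = " ++ instr ++ " i32 %a, %b"
  , "  ret i32 %r"
  , "}"
  ]
```

For example, `putStr (defineBuiltin "plus" "add")` prints a definition of `@plus` whose body is a single `add` instruction. The code generator then needs no special cases for operator calls, at the cost of relying on LLVM's inlining pass being run.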
