Compiling tail-call optimization in mutual recursion across C and Haskell

Question

I'm experimenting with the foreign-function interface in Haskell. I wanted to implement a simple test to see whether I could do mutual recursion between the two languages. So I wrote the following Haskell code:

module MutualRecursion where
import Data.Int

foreign import ccall countdownC::Int32->IO ()
foreign export ccall countdownHaskell::Int32->IO()

countdownHaskell::Int32->IO()
countdownHaskell n = print n >> if n > 0 then countdownC (pred n) else return ()

Note that the recursive case is a call to countdownC in tail position, so this should be tail-recursive.

In my C code, I have

#include <stdio.h>

#include "MutualRecursionHaskell_stub.h"

void countdownC(int count)
{
    printf("%d\n", count);
    if(count > 0)
        return countdownHaskell(count-1);
}

int main(int argc, char* argv[])
{
    hs_init(&argc, &argv);

    countdownHaskell(10000);

    hs_exit();
    return 0;
}

This is likewise tail recursive. I then wrote a Makefile:

MutualRecursion: MutualRecursionHaskell_stub
    ghc -O2 -no-hs-main MutualRecursionC.c MutualRecursionHaskell.o -o MutualRecursion
MutualRecursionHaskell_stub:
    ghc -O2 -c MutualRecursionHaskell.hs

and compiled with make MutualRecursion.

And... upon running, it segfaults after printing 8991. Just as a test to make sure gcc itself can handle TCO in mutual recursion, I tried

void countdownC2(int);

void countdownC(int count)
{
    printf("%d\n", count);
    if(count > 0)
        return countdownC2(count-1);
}

void countdownC2(int count)
{
    printf("%d\n", count);
    if(count > 0)
        return countdownC(count-1);
}

and that worked fine. It also works in the single-recursion case, both in C alone and in Haskell alone.

So my question is: is there a way to indicate to GHC that the call to the external C function is a tail call? I'm assuming that the leftover stack frame comes from the call from Haskell to C and not the other way around, since the C code is very clearly a return of a function call.

Answer 1

I believe cross-language C-Haskell tail calls are very, very hard to achieve.

I do not know the exact details, but the C runtime and the Haskell runtime are vastly different. The main factors behind this difference, as far as I can see, are:

different paradigm: purely functional vs imperative
garbage collection vs manual memory management
lazy semantics vs strict semantics

Given such differences, next to no optimizations are likely to survive across the language boundary. Perhaps, in theory, one could design an ad hoc C runtime together with a Haskell runtime so that some optimizations become feasible, but GHC and GCC were not designed in this way.

Just to show an example of the potential differences, assume we have the following Haskell code

p :: Int -> Bool
p x = x==42

main = if p 42
       then putStrLn "A"     -- A
       else putStrLn "B"     -- B

A possible implementation of main could be the following:

push the address of A on the stack
push the address of B on the stack
push 42 on the stack
jump to p
A: print "A", jump to end
B: print "B", jump to end

while p is implemented as follows:

p: pop x from the stack
pop b from the stack
pop a from the stack
test x against 42
if equal, jump to a
jump to b

Note how p is invoked with two return addresses, one for each possible result. This is different from C, whose standard implementations use only one return address. When crossing the language boundary, the compiler must account for this difference and compensate.

To keep things simple, the above also ignores the case where the argument of p is a thunk. The GHC allocator can also trigger garbage collection.

Note that the fictional implementation above was actually used in the past by GHC (the so-called "push/enter" STG machine). Even though it is no longer in use, the "eval/apply" STG machine is only marginally closer to the C runtime. I'm not even sure GHC uses the regular C stack: I think it does not, and uses its own instead.

You can check the GHC developer wiki to see the gory details.

Answer 2

While I am no expert in Haskell-C interop, I do not imagine a call from C to Haskell can be a straight function invocation. It most likely has to go through an intermediary that sets up the environment. As a result, your call to Haskell would actually consist of a call to this intermediary. That call was likely optimized by gcc. But the call from the intermediary to the actual Haskell routine was not necessarily optimized, so I assume this is what you are dealing with. You can check the assembly output to make sure.

Answer 1 (score 3, accepted), posted 2015-11-03 19:39:40
Answer 2 (score 0), posted 2015-11-03 18:40:35