Compiling Tail-Call Optimization In Mutual Recursion Across C and Haskell
I'm experimenting with the foreign-function interface in Haskell. I wanted to implement a simple test to see if I could do mutual recursion. So, I created the following Haskell code:
module MutualRecursion where
import Data.Int
foreign import ccall countdownC::Int32->IO ()
foreign export ccall countdownHaskell::Int32->IO()
countdownHaskell::Int32->IO()
countdownHaskell n = print n >> if n > 0 then countdownC (pred n) else return ()
Note that the recursive case is a call to countdownC, so this should be tail-recursive.
In my C code, I have
#include <stdio.h>
#include "MutualRecursionHaskell_stub.h"
void countdownC(int count)
{
    printf("%d\n", count);
    if(count > 0)
        return countdownHaskell(count-1);
}
int main(int argc, char* argv[])
{
    hs_init(&argc, &argv);
    countdownHaskell(10000);
    hs_exit();
    return 0;
}
Which is likewise tail-recursive. So then I wrote a Makefile:
MutualRecursion: MutualRecursionHaskell_stub
	ghc -O2 -no-hs-main MutualRecursionC.c MutualRecursionHaskell.o -o MutualRecursion
MutualRecursionHaskell_stub:
	ghc -O2 -c MutualRecursionHaskell.hs
and compile with make MutualRecursion.
And... upon running, it segfaults after printing 8991. Just as a test to make sure gcc itself can handle TCO in mutual recursion, I did
void countdownC2(int);
void countdownC(int count)
{
    printf("%d\n", count);
    if(count > 0)
        return countdownC2(count-1);
}
void countdownC2(int count)
{
    printf("%d\n", count);
    if(count > 0)
        return countdownC(count-1);
}
and that worked quite fine. It also works in the single-recursion case, in C alone and in Haskell alone.
So my question is: is there a way to indicate to GHC that the call to the external C function is a tail call? I'm assuming that the stack frame comes from the call from Haskell to C and not the other way around, since the C code is very clearly a return of a function call.
I believe cross-language C-Haskell tail calls are very, very hard to achieve.
I do not know the exact details, but the C runtime and the Haskell runtime are vastly different. The main factors for this difference, as far as I can see, are:
Given such differences, next to no optimizations are likely to survive across language boundaries. Perhaps, in theory, one could invent an ad hoc C runtime together with a Haskell runtime so that some optimizations are feasible, but GHC and GCC were not designed in this way.
Just to show an example of the potential differences, assume we have the following Haskell code
p :: Int -> Bool
p x = x==42
main = if p 42
then putStrLn "A" -- A
else putStrLn "B" -- B
A possible implementation of main could be the following:
    push the address of A on the stack
    push the address of B on the stack
    push 42 on the stack
    jump to p
    A: print "A", jump to end
    B: print "B", jump to end
while p is implemented as follows:
    p: pop x from the stack
    pop b from the stack
    pop a from the stack
    test x against 42
    if equal, jump to a
    jump to b
Note how p is invoked with two return addresses, one for each possible result. This is different from C, whose standard implementations use only one return address. When crossing boundaries the compiler must account for this difference and compensate.
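To make the two-return-address idea concrete, here is a minimal C sketch that models the fictional convention above with an explicit stack. Everything in it (the label enum, push, pop, run_example) is hypothetical and only for illustration; it is not how GHC or GCC actually generate code, and the "jump" is modeled by returning the chosen label.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the convention described above: a tiny explicit
   stack that holds either code labels or integer arguments. */
enum label { LABEL_A, LABEL_B };

#define STACK_MAX 16
static intptr_t stack[STACK_MAX];
static int sp = 0;

static void push(intptr_t v) { assert(sp < STACK_MAX); stack[sp++] = v; }
static intptr_t pop(void)    { assert(sp > 0); return stack[--sp]; }

/* p pops its argument and BOTH return addresses, then picks one of them.
   The jump is modeled by returning the selected label to the caller. */
static enum label p(void)
{
    intptr_t x = pop();
    enum label b = (enum label)pop();
    enum label a = (enum label)pop();
    return (x == 42) ? a : b;
}

/* The body of main from the example: push A, push B, push the argument,
   then enter p and continue at whichever label it selected. */
static char run_example(int arg)
{
    push(LABEL_A);
    push(LABEL_B);
    push(arg);
    switch (p()) {
    case LABEL_A: return 'A';
    case LABEL_B: return 'B';
    }
    return '?'; /* not reached for the two labels above */
}
```

Calling run_example(42) selects the A branch and any other argument selects the B branch. A plain C call, by contrast, has exactly one place to return to, so a boundary layer has to translate between the two shapes.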
Above I also did not account for the case when the argument of p is a thunk, to keep it simple. The GHC allocator can also trigger garbage collection.
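As a rough illustration of what a thunk argument would add, the following C sketch models a thunk as a suspended computation with a cached result. The struct layout and the names force, forty_two and p_lazy are assumptions made up for this example; GHC's real heap objects look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical thunk: a suspended computation plus a cache for its result. */
struct thunk {
    bool evaluated;        /* has compute() already been run? */
    int value;             /* cached result, valid once evaluated is true */
    int (*compute)(void);  /* the suspended computation */
};

/* Evaluate the thunk at most once and return its value. */
static int force(struct thunk *t)
{
    if (!t->evaluated) {
        assert(t->compute != NULL);
        t->value = t->compute();
        t->evaluated = true;
    }
    return t->value;
}

static int evaluations = 0;  /* counts how often the computation really runs */

static int forty_two(void)
{
    evaluations++;
    return 42;
}

/* A lazy variant of p: it has to force its argument before comparing. */
static bool p_lazy(struct thunk *x)
{
    return force(x) == 42;
}
```

In this model, p has to run arbitrary suspended code before it can even compare its argument, which is one more thing a plain C caller knows nothing about.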
Note that the above fictional implementation was actually used in the past by GHC (the so-called "push/enter" STG machine). Even though it is no longer in use, the "eval/apply" STG machine is only marginally closer to the C runtime. I'm not even sure GHC uses the regular C stack: I think it does not, and uses its own instead.
You can check the GHC developer wiki to see the gory details.
While I am no expert in Haskell-C interop, I do not imagine a call from C to Haskell can be a straight function invocation: it most likely has to go through an intermediary to set up the environment. As a result, your call to Haskell would actually consist of a call to this intermediary. That call was likely optimized by gcc. But the call from the intermediary to the actual Haskell routine was not necessarily optimized, so I assume this is what you are dealing with. You can check the assembly output to make sure.
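The effect of such an intermediary can be sketched in pure C. In the hypothetical model below, wrapper_enter_other_runtime stands in for whatever glue sits between the two languages (the name and the depth counters are invented for this example, and the Haskell side is collapsed into the wrapper for brevity). Because the wrapper has work to do after the inner call returns, that call is not in tail position and every round trip leaves one wrapper active.

```c
#include <assert.h>

static int depth = 0;      /* wrappers currently active */
static int max_depth = 0;  /* deepest nesting observed so far */

static void count_c(int n);

/* Hypothetical stand-in for the intermediary: it does work both before
   and after the inner call, so the inner call cannot be a tail call. */
static void wrapper_enter_other_runtime(int n)
{
    depth++;                       /* "set up the environment" */
    if (depth > max_depth)
        max_depth = depth;
    count_c(n);                    /* the other side calls back into C */
    depth--;                       /* "tear down the environment" */
}

/* From C's point of view this call is in tail position. */
static void count_c(int n)
{
    if (n > 0)
        wrapper_enter_other_runtime(n - 1);
}

/* Run the countdown from n and report how many wrappers were nested. */
static int measure_nesting(int n)
{
    depth = 0;
    max_depth = 0;
    count_c(n);
    assert(depth == 0);  /* every wrapper has unwound again */
    return max_depth;
}
```

In this model the nesting grows linearly with the starting count even though each call in count_c looks like a tail call, which is consistent with the stack overflow seen in the question.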