
Calling a Go string function from Python ctypes results in segfault

I have a module called test.go that contains two simple Go functions which accept string types:

package main

import (
  "fmt"
  "C"
)

//export TestConcat
func TestConcat(testArg string, testArg2 string) (string) {
  retval := testArg + testArg2
  return retval
}

//export TestHello
func TestHello(testArg string) {
  fmt.Println("%v\n", testArg)
}


func main(){}

I compile it as a shared library with go build -o test.so -buildmode=c-shared test.go

Then I have a Python module called test.py:

import ctypes

from ctypes import cdll


test_strings = [
    "teststring1",
    "teststring2"
]

if __name__ == '__main__':
    lib = cdll.LoadLibrary("./test.so")
    lib.TestConcat.argtypes = [ctypes.c_wchar_p, ctypes.c_wchar_p]
    lib.TestHello.argtypes = [ctypes.c_wchar_p]
    for test_string in test_strings:
        print(
            lib.TestConcat("hello", test_string)
        )
        lib.TestHello(test_string)

Then I run test.py and get a nasty segfault:

runtime: out of memory: cannot allocate 279362762964992-byte block (66781184 in use)
fatal error: out of memory

I've tried wrapping the arguments in ctypes.c_wchar_p to no avail.

What am I doing wrong here? And specifically, how does one call Go functions that accept string arguments from Python?

Go's string type is actually something like

type string struct {
    ptr  *byte // pointer to the first byte of the data
    size int   // length in bytes
}

so that is what Test{Hello|Concat} actually expect: not pointers, but struct-typed values passed by value.
In other words, cgo performs just enough magic to gateway calls from Go to C and back, but it does not perform automatic conversions of values.
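As a rough illustration of that layout from the Python side, here is a minimal sketch of a ctypes structure mirroring the `GoString` definition that cgo writes into the generated header (`typedef struct { const char *p; ptrdiff_t n; } _GoString_;`). The helper name `to_go_string` is made up for this example, and using `c_ssize_t` for `ptrdiff_t` is an assumption that holds on common platforms.

```python
import ctypes


class GoString(ctypes.Structure):
    # Mirrors the struct in the cgo-generated header:
    #   typedef struct { const char *p; ptrdiff_t n; } _GoString_;
    # Assumption: c_ssize_t has the same width as ptrdiff_t here.
    _fields_ = [("p", ctypes.c_char_p), ("n", ctypes.c_ssize_t)]


def to_go_string(text: str) -> GoString:
    """Hypothetical helper: encode text as UTF-8 and wrap it by value."""
    data = text.encode("utf-8")
    # The length is counted in bytes, not in characters.
    return GoString(data, len(data))


if __name__ == "__main__":
    s = to_go_string("teststring1")
    print(s.p, s.n)
```

With the library from the question built, one would presumably set `lib.TestHello.argtypes = [GoString]` and `lib.TestHello.restype = None` before calling `lib.TestHello(to_go_string("teststring1"))`. Returning a Go string from an exported function is a different matter, because the cgo pointer-passing rules restrict handing Go-allocated memory back to C.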

You have two options:

  • Explicitly work with this from your ctypes bindings, if possible.
    When compiling your package, cgo generates a header file which contains a C definition for the struct representing a Go string; you could use it right away.

  • Make the functions exported to C compatible with C's "type system".
    For this, cgo offers the helper functions C.CString and C.GoString.
    Basically, you can define your API like this:

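     //export TestConcat
     func TestConcat(a, b *C.char) *C.char {
         // C.GoString copies each NUL-terminated C string into a Go string.
         testArg1, testArg2 := C.GoString(a), C.GoString(b)
         // C.CString copies the result into C memory; the caller must free it.
         return C.CString(testArg1 + testArg2)
     }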

    Note a few caveats here:

    • Both of these helpers copy the memory of their argument, so the silly example above would work just fine, but it would first duplicate the memory blocks pointed to by a and b, then eat up twice as much memory to produce the concatenated string, and then copy the memory of the resulting string once again to produce the returned pointer.
      IOW, this approach is fine if you're trying to export some big chunk of Go code to C, so that these allocations are dwarfed by whatever that chunk does.
    • *C.char is the same as char* in C, so the string is expected to be NUL-terminated; if it's not, use C.GoStringN.
    • Every memory block allocated by C.CString has to be freed by a call to C.free. And here's a twist: C.free is basically a thin shim that calls free() from the linked-in libc, so if you can guarantee that the complete product (the code fully loaded into memory and (inter)linked by the dynamic linker) has only a single copy of libc linked in, you can call free() from the non-Go code on the memory blocks produced by calls to C.CString in the Go code.
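To make the ownership rule for the second option concrete, here is a hedged sketch of the Python side. It does not load the Go library: libc's `strdup` stands in for a Go export that returns `C.CString(...)`, since both hand back a malloc'ed, NUL-terminated `char*`. It assumes a POSIX system where `ctypes.CDLL(None)` exposes the process's libc, and the function name `call_and_free` is invented for the example.

```python
import ctypes

# Assumption: POSIX, where CDLL(None) gives access to the already loaded libc.
libc = ctypes.CDLL(None)

# strdup is a stand-in for a Go function returning C.CString(...).
libc.strdup.argtypes = [ctypes.c_char_p]
# c_void_p keeps the raw address; a c_char_p restype would convert the
# result to a Python bytes object and lose the pointer needed for free().
libc.strdup.restype = ctypes.c_void_p
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None


def call_and_free(text: str) -> str:
    """Pass UTF-8 bytes in, copy the returned C string out, then free it."""
    ptr = libc.strdup(text.encode("utf-8"))
    if not ptr:
        raise MemoryError("the C side returned NULL")
    try:
        # string_at reads up to the terminating NUL and copies the bytes.
        return ctypes.string_at(ptr).decode("utf-8")
    finally:
        libc.free(ptr)


if __name__ == "__main__":
    print(call_and_free("hello teststring1"))
```

Against the real library, the same pattern would presumably apply with `argtypes = [ctypes.c_char_p, ctypes.c_char_p]` and `restype = ctypes.c_void_p` set on the exported function, subject to the single-libc condition described above.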

A few more random pointers:

  • I'm not well-versed in Python's ctypes, but I'd speculate that using ctypes.c_wchar_p is not correct: in C (and C++, FWIW) wchar_t is a type denoting a single fixed-size "wide character", typically a 16-bit UTF-16 code unit on Windows or a 32-bit UTF-32 code point on Linux and macOS. Go's strings are not composed of these: they may contain arbitrary bytes, and when they are used to hold Unicode text, they are encoded using UTF-8, which is a multi-byte encoding (a single Unicode code point may be represented by 1 to 4 bytes in the string).
    In either case, wchar_t cannot be used for UTF-8 (and actually many seasoned devs believe it's an abomination).
  • Please read the docs on cmd/cgo completely before embarking on this project. Really, please do!
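The difference between wide characters and UTF-8 can be seen without Go at all. The sketch below only inspects the buffers ctypes would hand to C; the variable names are illustrative. The width of `wchar_t` depends on the platform, so the wide buffer's size is computed rather than hard-coded.

```python
import ctypes

text = "héllo"
utf8 = text.encode("utf-8")  # 'é' takes two bytes in UTF-8

# char[] buffer: UTF-8 bytes plus a terminating NUL, which is what
# C.GoString expects to find behind a *C.char.
narrow = ctypes.create_string_buffer(utf8)

# wchar_t[] buffer: what a c_wchar_p argument points at.
wide = ctypes.create_unicode_buffer(text)
wide_bytes = ctypes.string_at(ctypes.addressof(wide), ctypes.sizeof(wide))

if __name__ == "__main__":
    print("code points:", len(text), "UTF-8 bytes:", len(utf8))
    print("sizeof(wchar_t):", ctypes.sizeof(ctypes.c_wchar))
    print("narrow:", narrow.raw)
    print("wide:  ", wide_bytes)
```

For ASCII text the wide buffer contains zero bytes right after the first character, so code that treats it as a NUL-terminated `char*` would see a truncated string.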
