
Is it safe to create a Rust String using a malloced string?

I am working with a C API that returns a malloc'ed string:

char *foo(int arg);

Can I reuse that memory in Rust code without O(n) copying?

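let p: *mut libc::c_char = unsafe { foo(42) };
let len = unsafe { libc::strlen(p) };
// The buffer comes from C's malloc, not from Rust's allocator.
let s: String = unsafe { String::from_raw_parts(p as *mut u8, len, len) };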

The documentation says:

The memory at ptr needs to have been previously allocated by the same allocator the standard library uses.

I could not find out which allocator the standard library uses.

In general, it's not safe to create a String from a string that was not allocated by Rust.

Rust 0.11.0 through 1.31.1 used jemalloc. Rust 1.32.0 changed to use the system's default allocator.

Additionally, Rust 1.28.0 introduced a mechanism that applications can use to replace the global allocator with one of their choosing.
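As a rough illustration of that mechanism, here is a minimal sketch using the `#[global_allocator]` attribute and the `GlobalAlloc` trait. The names `CountingAlloc` and `ALLOCATIONS` are made up for this example; the allocator simply forwards to `std::alloc::System` and counts calls. Once a program does something like this, "the allocator the standard library uses" is whatever the application picked, which is one more reason not to assume it matches the C library's malloc.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical allocator: forwards to the system allocator and counts calls.
pub struct CountingAlloc;

/// Number of allocation requests seen so far (illustrative only).
pub static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Every Box, Vec and String in this program now goes through CountingAlloc.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;
```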

It's important to note that, although Rust now uses the system's allocator by default, that doesn't mean that C libraries use the same allocator, even if it's literally malloc. For example, on Windows, if you use a C library that was compiled with Visual C++ 2008 while your Rust binary was compiled with Visual Studio 2019 Build Tools, there will be two C runtime libraries loaded in your process: the C library will use msvcr90.dll while your Rust binary will use ucrtbase.dll. Each C runtime library manages its own heap, so memory allocated by one cannot be freed by the other.

A well-designed C library ought to provide a function to free each type of resource that the library may allocate itself. Functions that return pointers or handles to such allocations ought to document which function should be called to free the resource(s). See this other question regarding usage of LLVM's C API for an example of a well-designed API.
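If an owned String really is required, one safe option is to accept the O(n) copy and then hand the original buffer back to the library. The sketch below assumes the library exposes a free function, here called `libfoo_free` (a hypothetical name), and that `foo` may return null. To keep the example runnable on its own, `foo` and `libfoo_free` are stand-ins written in Rust; in real code they would be `extern "C"` declarations.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Stand-ins for the C library so the sketch runs on its own.
// In real code these would be `extern "C"` declarations.
unsafe fn foo(_arg: i32) -> *mut c_char {
    CString::new("hello from libfoo").unwrap().into_raw()
}

unsafe fn libfoo_free(p: *mut c_char) {
    drop(unsafe { CString::from_raw(p) });
}

/// Copies the library's string into a Rust-owned `String`, then releases
/// the original buffer with the library's own free function.
pub fn foo_to_string(arg: i32) -> Option<String> {
    unsafe {
        let p = foo(arg);
        if p.is_null() {
            return None; // assumption: null signals failure
        }
        // Invalid UTF-8 is replaced rather than rejected here.
        let owned = CStr::from_ptr(p).to_string_lossy().into_owned();
        libfoo_free(p);
        Some(owned)
    }
}
```

The copy costs O(n), but the resulting String is allocated by Rust's global allocator, so dropping it is sound.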

Perhaps you don't actually need a String? Consider using CStr instead, if that's possible. A CStr is akin to a str, so it's just a view into memory and it doesn't care how that memory was allocated, but it's more permissive than str. You can convert a CStr to a str using CStr::to_str (the CStr must contain a UTF-8 string for the conversion to succeed).
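A minimal sketch of that borrowing approach follows. The helper name `borrow_c_string` is made up for this example. Nothing is copied and nothing is freed: `CStr::from_ptr` scans for the terminating NUL and `to_str` validates UTF-8, so there is still an O(n) pass over the bytes, but no allocation.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;
use std::str::Utf8Error;

/// Borrows a NUL-terminated C string as `&str` without copying it or
/// taking ownership of it.
///
/// # Safety
/// `p` must be non-null, point to a NUL-terminated buffer, and remain
/// valid (neither freed nor mutated) for the whole lifetime `'a`.
pub unsafe fn borrow_c_string<'a>(p: *const c_char) -> Result<&'a str, Utf8Error> {
    unsafe { CStr::from_ptr(p) }.to_str()
}
```

The caller still has to release the buffer through whatever function the C library documents, after the borrowed `&str` is no longer in use.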

If there is indeed a function in the library to free the string, you might also want to write a wrapper struct that handles deallocation automatically and derefs to CStr. This struct would represent an owned string, akin to String or CString, but with memory managed by the library instead of Rust's global allocator. For example:

extern crate libc; // 0.2.62

use std::ffi::CStr;
use std::ops::Deref;

// Free function assumed to be exported by the C library.
extern "C" {
    fn libfoo_free(string: *mut libc::c_char);
}

struct LibfooString(*mut libc::c_char);

impl Drop for LibfooString {
    fn drop(&mut self) {
        unsafe {
            libfoo_free(self.0);
        }
    }
}

impl Deref for LibfooString {
    type Target = CStr;

    fn deref(&self) -> &Self::Target {
        unsafe {
            CStr::from_ptr(self.0)
        }
    }
}
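To show how such a wrapper might be used end to end, here is a self-contained sketch. It repeats the wrapper so that it compiles on its own and adds a constructor, `LibfooString::new`, plus a helper, `describe`; both are hypothetical additions, as is the null check. `foo` and `libfoo_free` are again Rust stand-ins for the library's real `extern "C"` functions.

```rust
use std::ffi::{CStr, CString};
use std::ops::Deref;
use std::os::raw::c_char;

// Stand-ins for the C library so the sketch runs on its own.
unsafe fn foo(_arg: i32) -> *mut c_char {
    CString::new("hello").unwrap().into_raw()
}

unsafe fn libfoo_free(p: *mut c_char) {
    drop(unsafe { CString::from_raw(p) });
}

/// Owned string whose memory belongs to the library, not to Rust.
pub struct LibfooString(*mut c_char);

impl LibfooString {
    /// Calls `foo` and takes ownership of the result; `None` if it is null.
    pub fn new(arg: i32) -> Option<Self> {
        let p = unsafe { foo(arg) };
        if p.is_null() { None } else { Some(LibfooString(p)) }
    }
}

impl Drop for LibfooString {
    fn drop(&mut self) {
        // Release through the library's allocator, never Rust's.
        unsafe { libfoo_free(self.0) }
    }
}

impl Deref for LibfooString {
    type Target = CStr;

    fn deref(&self) -> &CStr {
        unsafe { CStr::from_ptr(self.0) }
    }
}

/// Uses the wrapper like any other string; the buffer is freed when `s` drops.
pub fn describe(arg: i32) -> Option<String> {
    let s = LibfooString::new(arg)?;
    Some(format!("{} ({} bytes)", s.to_str().ok()?, s.to_bytes().len()))
}
```

Because the wrapper derefs to CStr, methods such as `to_str` and `to_bytes` are available directly on it, and the library's free function runs exactly once, when the wrapper goes out of scope.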
