How do I return a vector of dynamic length in a pub extern "C" fn?

I want to return a vector in a pub extern "C" fn. Since a vector has an arbitrary length, I guess I need to return a struct with

  1. the pointer to the vector, and

  2. the number of elements in the vector

My current code is:

extern crate libc;
use self::libc::{size_t, int32_t, int64_t};

// struct to represent an array and its size
#[repr(C)]
pub struct array_and_size {
    values: int64_t, // this is probably not how you denote a pointer, right?
    size: int32_t,
}

// The vector I want to return the address of is already in a Boxed struct, 
// which I have a pointer to, so I guess the vector is on the heap already. 
// Dunno if this changes/simplifies anything?
#[no_mangle]
pub extern "C" fn rle_show_values(ptr: *mut Rle) -> array_and_size {
    let rle = unsafe {
        assert!(!ptr.is_null());
        &mut *ptr
    };

    // this is the Vec<i32> I want to return 
    // the address and length of
    let values = rle.values; 
    let length = values.len();

    array_and_size {
       values: Box::into_raw(Box::new(values)),
       size: length as i32,
       }
}

#[derive(Debug, PartialEq)]
pub struct Rle {
    pub values: Vec<i32>,
}

The error I get is

$ cargo test
   Compiling ranges v0.1.0 (file:///Users/users/havpryd/code/rust-ranges)
error[E0308]: mismatched types
  --> src/rle.rs:52:17
   |
52 |         values: Box::into_raw(Box::new(values)),
   |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected i64, found *-ptr
   |
   = note: expected type `i64`
   = note:    found type `*mut std::vec::Vec<i32>`

error: aborting due to previous error

error: Could not compile `ranges`.

To learn more, run the command again with --verbose.
-> exit code: 101

I posted the whole thing because I could not find an example of returning arrays/vectors in the eminently useful Rust FFI Omnibus.

Is this the best way to return a vector of unknown size from Rust? How do I fix my remaining compile error? Thanks!

Bonus q: if the fact that my vector is in a struct changes the answer, perhaps you could also show how to do this if the vector was not in a Boxed struct already (which I think means the vector it owns is on the heap too)? I guess many people looking up this q will not have their vectors boxed already.

Bonus q2: I only return the vector to view its values (in Python), but I do not want to let the calling code change the vector. But I guess there is no way to make the memory read-only and ensure the calling code does not fudge with the vector? const is just for showing intent, right?

Ps: I do not know C or Rust well, so my attempt might be completely WTF.

pub struct array_and_size {
    values: int64_t, // this is probably not how you denote a pointer, right?
    size: int32_t,
}

First of all, you're correct: int64_t is not how you denote a pointer. The type you want for values is *mut int32_t.
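
As a rough sketch of what the corrected struct and function from the question could look like: the struct is renamed to ArrayAndSize here purely for Rust naming conventions, and usize is used for the length (size_t on the C side); both are assumptions, not something the question prescribes. Instead of moving the Vec out of the Rle and boxing it, this version only borrows the Rle and hands out a pointer into its existing buffer.

```rust
#[derive(Debug, PartialEq)]
pub struct Rle {
    pub values: Vec<i32>,
}

// Pointer + length pair handed to the caller. The Rle keeps owning the buffer.
// (ArrayAndSize is a hypothetical rename of array_and_size.)
#[repr(C)]
pub struct ArrayAndSize {
    pub values: *mut i32, // raw pointer to the first element
    pub size: usize,      // number of elements (size_t in C)
}

#[no_mangle]
pub extern "C" fn rle_show_values(ptr: *mut Rle) -> ArrayAndSize {
    assert!(!ptr.is_null());
    // Borrow the Rle rather than moving its Vec out of it.
    let rle = unsafe { &mut *ptr };
    ArrayAndSize {
        values: rle.values.as_mut_ptr(),
        size: rle.values.len(),
    }
}
```

The returned pointer is only usable while the Rle is alive and its Vec has not been resized, so the caller should copy the values out if it needs them for longer.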

In general (and note that there are a variety of C coding styles), C often doesn't "like" returning ad-hoc sized array structs like this. The more common C API would be

int32_t rle_values_size(RLE *rle);
int32_t *rle_values(RLE *rle);

(Note: many internal programs do in fact use sized array structs, but the separate size-and-pointer pair is by far the most common for user-facing libraries because it's automatically compatible with the most basic way of representing arrays in C.)

In Rust, this would translate to:

extern "C" fn rle_values_size(rle: *mut RLE) -> int32_t
extern "C" fn rle_values(rle: *mut RLE) -> *mut int32_t

The size function is straightforward. To return the array, simply do

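extern "C" fn rle_values(rle: *mut RLE) -> *mut int32_t {
    // as_mut_ptr() also works for an empty Vec, where indexing values[0] would panic
    let rle = unsafe { &mut *rle };
    rle.values.as_mut_ptr()
}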

This gives a raw pointer to the first element of the Vec's underlying buffer, which is all C-style arrays really are. The pointer is only valid while the RLE is alive and its Vec has not been reallocated.
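
Put together, the two-function API might look roughly like the sketch below. The struct is spelled Rle as in the question, and read_back is a hypothetical helper that only simulates what a C (or Python) caller would do with the two calls; neither the helper nor the null checks come from the original answer.

```rust
pub struct Rle {
    pub values: Vec<i32>,
}

#[no_mangle]
pub extern "C" fn rle_values_size(rle: *const Rle) -> i32 {
    assert!(!rle.is_null());
    let rle = unsafe { &*rle };
    rle.values.len() as i32
}

#[no_mangle]
pub extern "C" fn rle_values(rle: *mut Rle) -> *mut i32 {
    assert!(!rle.is_null());
    let rle = unsafe { &mut *rle };
    // Pointer into the Vec's buffer; the Rle keeps ownership.
    rle.values.as_mut_ptr()
}

// Hypothetical caller-side helper: ask for the size, then the pointer,
// and copy the elements out while the Rle is still alive.
pub fn read_back(rle: *mut Rle) -> Vec<i32> {
    let len = rle_values_size(rle) as usize;
    let ptr = rle_values(rle);
    unsafe { std::slice::from_raw_parts(ptr, len) }.to_vec()
}
```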

If, instead of giving C a reference to your data, you want to give C the data itself, the most common option is to let the user pass in a buffer that you copy the data into:

extern "C" fn rle_values_buf(rle: *mut RLE, buf: *mut int32_t, len: int32_t) {
    use std::{cmp, ptr};
    unsafe {
        let values = &(*rle).values;
        // Make sure we overrun neither the caller's buffer nor our own Vec;
        // a negative len copies nothing
        let count = cmp::min(len.max(0) as usize, values.len());
        ptr::copy_nonoverlapping(values.as_ptr(), buf, count);
    }
}

Which, from C, looks like

void rle_values_buf(RLE *rle, int32_t *buf, int32_t len);

This (shallowly) copies your data into the presumably C-allocated buffer, which the C user is then responsible for destroying. It also prevents multiple mutable copies of your array from floating around at the same time (assuming you don't implement the version that returns a pointer).

Note that you could sort of "move" the array into C as well, but it's not particularly recommended: it involves using mem::forget and expecting the C user to explicitly call a destruction function, and it requires both you and the user to obey some discipline that may be difficult to structure the program around.

If you want to receive an array from C, you essentially just ask for both a *mut i32 and an i32 corresponding to the buffer start and length. You can assemble these into a slice using the from_raw_parts function, and then use the to_vec function to create an owned Vec containing the values, allocated from the Rust side. If you don't plan on needing to own the values, you can simply pass around the slice you produced via from_raw_parts.
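
A minimal sketch of the receiving direction follows. The function name sum_from_c and its summing behaviour are hypothetical, chosen only to have something observable, and the pointer is taken as *const i32 rather than *mut i32 on the assumption that the data is only read.

```rust
use std::slice;

// Hypothetical example function: sums the elements of a C-provided array.
#[no_mangle]
pub extern "C" fn sum_from_c(data: *const i32, len: i32) -> i64 {
    if data.is_null() || len <= 0 {
        return 0;
    }
    // Borrowed view over the caller's buffer; nothing is copied here.
    let view: &[i32] = unsafe { slice::from_raw_parts(data, len as usize) };
    // Owned copy allocated on the Rust side, independent of the C buffer.
    let owned: Vec<i32> = view.to_vec();
    owned.iter().map(|&x| i64::from(x)).sum()
}
```

If the values never need to outlive the call, the to_vec step can be dropped and the slice used directly.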

However, it is imperative that all values be initialized from either side, typically to zero. Otherwise you invoke genuinely undefined behavior, which often results in segmentation faults (which tend to frustratingly disappear when inspected with GDB).

There are multiple ways to pass an array to C.


First of all, while C has the concept of fixed-size arrays (int a[5] has type int[5] and sizeof(a) will return 5 * sizeof(int)), it is not possible to directly pass an array to a function or return an array from it.

On the other hand, it is possible to wrap a fixed-size array in a struct and return that struct.

Furthermore, when using an array, all elements must be initialized; otherwise a memcpy technically has undefined behavior (as it is reading from uninitialized values) and valgrind will definitely report the issue.


Using a dynamic array

A dynamic array is an array whose length is unknown at compile-time.

One may choose to return a dynamic array if no reasonable upper bound is known, or if this bound is deemed too large for passing by value.

There are two ways to handle this situation:

  • ask C to pass a suitably sized buffer
  • allocate a buffer and return it to C

They differ in who allocates the memory: the former is simpler, but may require either a way to hint at a suitable size or the ability to "rewind" if the size proves unsuitable.

Ask C to pass a suitably sized buffer

// file.h
int rust_func(int32_t* buffer, size_t buffer_length);

// file.rs
#[no_mangle]
pub extern fn rust_func(buffer: *mut libc::int32_t, buffer_length: libc::size_t) -> libc::c_int {
    // your code here
}

Note the existence of std::slice::from_raw_parts_mut to transform this pointer + length into a mutable slice (do initialize it with 0s before making it a slice, or ask the client to).
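
One way the body could be filled in is sketched below. The source data and the return convention (number of elements written, or -1 for a null buffer) are assumptions made for illustration, and plain i32/usize stand in for the libc aliases so the block compiles without the libc crate.

```rust
use std::slice;

#[no_mangle]
pub extern "C" fn rust_func(buffer: *mut i32, buffer_length: usize) -> i32 {
    if buffer.is_null() {
        return -1; // assumed error convention
    }
    // Assumption: the caller has already initialized the buffer (e.g. zeroed it).
    let out = unsafe { slice::from_raw_parts_mut(buffer, buffer_length) };

    let source: [i32; 3] = [10, 20, 30]; // hypothetical data to hand over
    // Never write past the end of the caller's buffer.
    let n = source.len().min(out.len());
    out[..n].copy_from_slice(&source[..n]);
    n as i32
}
```

A caller whose buffer turned out to be too small can detect that from the return value only if it knows the full size in advance, which is where a separate size query function becomes useful.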

Allocate a buffer and return it to C

// file.h
struct DynArray {
    int32_t* array;
    size_t length;
};

struct DynArray rust_alloc(void);
void rust_free(struct DynArray);

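// file.rs
#[repr(C)]
pub struct DynArray {
    array: *mut libc::int32_t,
    length: libc::size_t,
}

#[no_mangle]
pub extern fn rust_alloc() -> DynArray {
    let v: Vec<i32> = vec!(...);

    // Convert to a boxed slice so that capacity == length and the
    // allocation can later be rebuilt from pointer + length alone.
    let mut boxed = v.into_boxed_slice();

    let result = DynArray {
        array: boxed.as_mut_ptr(),
        length: boxed.len() as _,
    };

    // Give up ownership; rust_free is now responsible for the buffer.
    std::mem::forget(boxed);

    result
}

#[no_mangle]
pub extern fn rust_free(array: DynArray) {
    if !array.array.is_null() {
        unsafe {
            // Rebuild the boxed slice and drop it, which frees the whole buffer.
            let slice = std::slice::from_raw_parts_mut(array.array, array.length);
            drop(Box::from_raw(slice as *mut [i32]));
        }
    }
}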
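
Freeing a Vec's buffer correctly requires its capacity as well as its length. One possible variant, sketched here under the assumption that the struct may carry an extra capacity field, rebuilds the Vec with Vec::from_raw_parts on the way back. The names DynArrayCap, rust_alloc_cap and rust_free_cap, and the sample contents, are hypothetical.

```rust
use std::mem;

// Hypothetical variant of DynArray that also records the Vec's capacity.
#[repr(C)]
pub struct DynArrayCap {
    pub array: *mut i32,
    pub length: usize,
    pub capacity: usize,
}

#[no_mangle]
pub extern "C" fn rust_alloc_cap() -> DynArrayCap {
    let mut v: Vec<i32> = vec![1, 2, 3]; // placeholder contents

    let result = DynArrayCap {
        array: v.as_mut_ptr(),
        length: v.len(),
        capacity: v.capacity(),
    };

    // Give up ownership without running Vec's destructor.
    mem::forget(v);
    result
}

#[no_mangle]
pub extern "C" fn rust_free_cap(array: DynArrayCap) {
    if !array.array.is_null() {
        // Rebuild the Vec from exactly the parts it was split into, then drop it.
        unsafe { drop(Vec::from_raw_parts(array.array, array.length, array.capacity)) };
    }
}
```

The C side must pass the struct back unchanged and call the free function exactly once; calling C's free() on the pointer instead would mix allocators.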

Using a fixed-size array

Similarly, a struct containing a fixed-size array can be used. Note that both in Rust and C all elements should be initialized, even if unused; zeroing them works well.

Similarly to the dynamic case, it can be either passed by mutable pointer or returned by value.

// file.h
struct FixedArray {
    int32_t array[32];
};

// file.rs
#[repr(C)]
struct FixedArray {
    array: [libc::int32_t; 32],
}
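
Both calling styles could be sketched roughly as follows. The function names fixed_array_new and fixed_array_fill and the sample values are hypothetical, and i32 stands in for libc::int32_t so the block has no external dependencies.

```rust
#[repr(C)]
pub struct FixedArray {
    pub array: [i32; 32],
}

// Returned by value: every element is zeroed first, then the used slots are set.
#[no_mangle]
pub extern "C" fn fixed_array_new() -> FixedArray {
    let mut result = FixedArray { array: [0; 32] };
    result.array[0] = 7; // placeholder values
    result.array[1] = 11;
    result
}

// Passed by mutable pointer: the caller owns the storage, Rust overwrites it whole.
#[no_mangle]
pub extern "C" fn fixed_array_fill(out: *mut FixedArray) {
    if out.is_null() {
        return;
    }
    let mut value = FixedArray { array: [0; 32] };
    value.array[0] = 7;
    // ptr::write does not read the old contents, so the destination
    // may be uninitialized memory on the C side.
    unsafe { std::ptr::write(out, value) };
}
```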
