Can you turn an Iterator<T> into an Iterator<&T> efficiently?
I've run into two frustrating problems. Both come from wanting to send messages on a Unix socket using sendmmsg from the nix crate.
I have some given messages which may or may not contain fds. Nix does most things zero-copy, which sometimes makes it tricky to work with: you end up battling the borrow checker and the type system. Both problems come from this function signature:
pub fn sendmmsg<'a, I, C, S>(
fd: RawFd,
data: impl std::iter::IntoIterator<Item=&'a SendMmsgData<'a, I, C, S>>,
flags: MsgFlags
) -> Result<Vec<usize>>
where
I: AsRef<[IoSlice<'a>]> + 'a,
C: AsRef<[ControlMessage<'a>]> + 'a,
S: SockaddrLike + 'a
Where SendMmsgData is defined as:
pub struct SendMmsgData<'a, I, C, S>
where
I: AsRef<[IoSlice<'a>]>,
C: AsRef<[ControlMessage<'a>]>,
S: SockaddrLike + 'a
{
pub iov: I,
pub cmsgs: C,
pub addr: Option<S>,
pub _lt: std::marker::PhantomData<&'a I>,
}
And here's the code interfacing with it:
...
#[inline]
fn exec_write_many<M>(&mut self, messages: Vec<M>) -> Result<usize, Error>
where
M: SocketMessage,
{
let mut sent = 0;
let mut send_message_data = vec![];
for msg in messages.iter() {
let mmsg_data = if msg.borrow_fds().is_empty() {
SendMmsgData {
iov: [IoSlice::new(msg.borrow_buf()); 1],
cmsgs: vec![],
addr: NONE_ADDR,
_lt: std::marker::PhantomData::default(),
}
} else {
SendMmsgData {
iov: [IoSlice::new(msg.borrow_buf()); 1],
cmsgs: vec![ControlMessage::ScmRights(msg.borrow_fds())],
addr: NONE_ADDR,
_lt: std::marker::PhantomData::default(),
}
};
send_message_data.push(mmsg_data);
}
match nix::sys::socket::sendmmsg(self.sock_fd, &send_message_data, MsgFlags::MSG_DONTWAIT) {
...
Both problems are manageable, but the workarounds come at a performance cost. Starting with the major one: I want to provide sendmmsg with an iterator created like this instead:
...
#[inline]
fn exec_write_many<M>(&mut self, messages: Vec<M>) -> Result<usize, Error>
where
M: SocketMessage,
{
let mut sent = 0;
let sendmmsgs = messages.iter()
.map(|msg| {
if msg.borrow_fds().is_empty() {
SendMmsgData {
iov: [IoSlice::new(msg.borrow_buf()); 1],
cmsgs: vec![],
addr: NONE_ADDR,
_lt: std::marker::PhantomData::default(),
}
} else {
SendMmsgData {
iov: [IoSlice::new(msg.borrow_buf()); 1],
cmsgs: vec![ControlMessage::ScmRights(msg.borrow_fds())],
addr: NONE_ADDR,
_lt: std::marker::PhantomData::default(),
}
}
});
match nix::sys::socket::sendmmsg(self.sock_fd, sendmmsgs, MsgFlags::MSG_DONTWAIT) {
...
But since the iterator yields owned SendMmsgData values rather than references, I get this:
error[E0271]: type mismatch resolving `<[closure@socks/src/buffered_writer.rs:146:18: 162:14] as FnOnce<(&M,)>>::Output == &SendMmsgData<'_, _, _, _>`
--> socks/src/buffered_writer.rs:163:56
|
163 | match nix::sys::socket::sendmmsg(self.sock_fd, sendmmsgs, MsgFlags::MSG_DONTWAIT) {
| -------------------------- ^^^^^^^^^ expected reference, found struct `SendMmsgData`
| |
| required by a bound introduced by this call
|
= note: expected reference `&SendMmsgData<'_, _, _, _>`
found struct `SendMmsgData<'_, [IoSlice<'_>; 1], Vec<ControlMessage<'_>>, ()>`
= note: required because of the requirements on the impl of `Iterator` for `Map<std::slice::Iter<'_, M>, [closure@socks/src/buffered_writer.rs:146:18: 162:14]>`
note: required by a bound in `nix::sys::socket::sendmmsg`
--> /home/gramar/.cargo/registry/src/github.com-1ecc6299db9ec823/nix-0.24.2/src/sys/socket/mod.rs:1456:40
|
1456 | data: impl std::iter::IntoIterator<Item=&'a SendMmsgData<'a, I, C, S>>,
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ required by this bound in `nix::sys::socket::sendmmsg`
It's pretty frustrating, because for an Option, for example, I can just call as_ref() to turn the inner T into &T, but I can't figure out how to do the same with an iterator. That forces me to allocate another vector and loop through all the messages to be sent.
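To make the comparison concrete, here is a small sketch using a stand-in `Item` type and a stand-in `sum_borrowed` function (both hypothetical, not part of nix). `Option::as_ref` can hand out `&T` because the `Option` keeps owning the value. A `map` adaptor produces each value on the fly, so nothing owns it once `next` returns, and the only approach I have found is to collect first:

```rust
/// Stand-in for an owned value such as SendMmsgData (hypothetical type).
#[derive(Debug, PartialEq)]
struct Item(u32);

/// Stand-in for an API that, like sendmmsg, insists on borrowed items.
fn sum_borrowed<'a>(items: impl IntoIterator<Item = &'a Item>) -> u32 {
    items.into_iter().map(|item| item.0).sum()
}

/// Option keeps owning the value, so as_ref can lend it out as &Item.
fn option_case(value: &Option<Item>) -> Option<&Item> {
    value.as_ref()
}

/// A map adaptor yields owned Items; passing it straight to sum_borrowed
/// does not compile, so the items are parked in a Vec first.
fn iterator_case(raw: &[u32]) -> u32 {
    let owned: Vec<Item> = raw.iter().map(|&n| Item(n)).collect();
    sum_borrowed(&owned)
}
```

The `Vec` in `iterator_case` is exactly the extra allocation and extra loop I would like to avoid.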
The second, smaller issue is cmsgs. The type system disallows using an array, since one branch will have the type [_; 1] and the other [_; 0]. The empty Vec causes no allocation, while the Vec with one item does.
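A reduced sketch of the mismatch, with `u8` standing in for ControlMessage and a hypothetical `count` function standing in for a generic API bounded like the `C` parameter of sendmmsg. None of these names come from nix:

```rust
/// Stand-in for a generic API bounded like sendmmsg's `C` parameter.
fn count<C: AsRef<[u8]>>(cmsgs: C) -> usize {
    cmsgs.as_ref().len()
}

/// `if has { [1u8] } else { [] }` is rejected: [u8; 1] and [u8; 0] differ.
/// Vec unifies the branches; only the non-empty branch allocates.
fn with_vec(has: bool) -> usize {
    let cmsgs: Vec<u8> = if has { vec![1] } else { Vec::new() };
    count(cmsgs)
}

/// Borrowed slices also unify, without allocating, but the backing array
/// must outlive the call, which is the hard part in the real code.
fn with_slice(has: bool) -> usize {
    let one = [1u8];
    let cmsgs: &[u8] = if has { &one } else { &[] };
    count(cmsgs)
}
```

The slice variant only works here because `one` lives in the same function as the call. Inside the `map` closure the array would be a local that is dropped before sendmmsg ever sees it.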
Both problems have me running into the same issue. I can't figure out how to create a wrapper struct and just implement IntoIterator<Item=&'a SendMmsgData<'a, I, C, S>> and AsRef<[ControlMessage<'a>]> respectively, because both require me to return a reference to something created in the function body. My data structure is not in the form of SendMmsgData or ControlMessage, and both of those reference some other piece of memory. In my case I would have to create a struct with an owned buffer and an internal reference to that buffer (a self-reference), which creates other problems.
Any ideas on how I can do this without the extra loop and allocations?
P.S. In my measurements, doing the following costs around 10% in performance when messages do not have fds, because of how the syscall works, and gains around 2% when all messages have fds:
if msg.borrow_fds().is_empty() {
SendMmsgData {
iov: [IoSlice::new(msg.borrow_buf()); 1],
cmsgs: [ControlMessage::ScmRights(&[])],
addr: NONE_ADDR,
_lt: std::marker::PhantomData::default(),
}
} else {
SendMmsgData {
iov: [IoSlice::new(msg.borrow_buf()); 1],
cmsgs: [ControlMessage::ScmRights(msg.borrow_fds())],
addr: NONE_ADDR,
_lt: std::marker::PhantomData::default(),
}
};
I could just use the libc crate directly, and if I can't solve this in a better way I'll have to do that instead.
The trouble with your first problem is that sendmmsg could call next on the iterator n times, getting references to n SendMmsgData values that all have to live somewhere. Because you can't know what n is, you'll have to buffer them in a Vec. This could be fixed by changing the API of sendmmsg to take either owned or borrowed SendMmsgData values, but you obviously don't have control over that.
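For illustration only, an API that takes either owned or borrowed items could be bounded on `std::borrow::Borrow`. The `Data` type and `send_all` function below are hypothetical stand-ins and not the actual nix signature:

```rust
use std::borrow::Borrow;

/// Stand-in for SendMmsgData (hypothetical, much simplified).
struct Data {
    len: usize,
}

/// Hypothetical signature: items may be `Data` or `&Data`, since both
/// implement Borrow<Data>. This is not the actual nix API.
fn send_all<D: Borrow<Data>>(items: impl IntoIterator<Item = D>) -> Vec<usize> {
    // All items must be alive at once for a batched call, so the callee
    // buffers them here instead of forcing the caller to do it.
    let held: Vec<D> = items.into_iter().collect();
    held.iter()
        .map(|d| <D as Borrow<Data>>::borrow(d).len)
        .collect()
}

/// Owned items straight from a map adaptor.
fn owned_demo() -> Vec<usize> {
    send_all((1..=3).map(|len| Data { len }))
}

/// Borrowed items from storage the caller already has.
fn borrowed_demo() -> Vec<usize> {
    let stored = vec![Data { len: 4 }, Data { len: 5 }];
    send_all(&stored)
}
```

Note that the buffer does not disappear in this sketch. It only moves into the callee, which needs one anyway to keep every item alive for the duration of the batched call.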
The cmsgs issue, I think, can be helped though. You can create your own Option-like wrapper that lives purely on the stack and that implements AsRef based on whether it contains a value or not:
struct ControlMessage<'a>(std::marker::PhantomData<&'a ()>);
enum CMsgWrapper<'a> {
Empty,
Msg(ControlMessage<'a>),
}
impl<'a> AsRef<[ControlMessage<'a>]> for CMsgWrapper<'a> {
fn as_ref(&self) -> &[ControlMessage<'a>] {
match self {
CMsgWrapper::Empty => &[],
CMsgWrapper::Msg(cmsg) => std::slice::from_ref(cmsg),
}
}
}
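The same idea can be written generically and used in both branches of the `if`. This is a sketch: `ZeroOrOne` and `pick` are hypothetical names, and a slice of `i32` stands in for the fds passed to ControlMessage::ScmRights:

```rust
/// Holds zero or one value inline; generic variant of the wrapper above.
/// `ZeroOrOne` is a hypothetical name, not a nix or std type.
enum ZeroOrOne<T> {
    Empty,
    One(T),
}

impl<T> AsRef<[T]> for ZeroOrOne<T> {
    fn as_ref(&self) -> &[T] {
        match self {
            ZeroOrOne::Empty => &[],
            ZeroOrOne::One(value) => std::slice::from_ref(value),
        }
    }
}

/// Both branches have the same type, so no Vec is needed.
fn pick(fds: &[i32]) -> ZeroOrOne<&[i32]> {
    if fds.is_empty() {
        ZeroOrOne::Empty
    } else {
        ZeroOrOne::One(fds)
    }
}
```

Because the enum stores its value inline, neither branch allocates, and the slice returned by `as_ref` borrows from the wrapper itself.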
As the accepted answer said, your temporary structs have to live somewhere. As far as I can see, you accept the input parameter by value, so you can probably modify the messages. I admit that what I propose is a somewhat "dirty" solution that doesn't look good even to me, but when performance really matters, it may be worth a try.
So the idea is to put the SendMmsgData structs into your messages as Option<SendMmsgData> and give the SocketMessage trait a method fn get_send_mmsg_data(&mut self) -> &SendMmsgData.
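A rough sketch of that shape, with stand-in types. `Prepared`, `Message`, `get_prepared` and `send_all` are all hypothetical names. Here `Prepared` owns its contents. The real SendMmsgData borrows the message buffer, so storing it inside the message would run into the self-reference problem mentioned in the question:

```rust
/// Stand-in for the data handed to the send call (hypothetical type).
/// It owns its contents, unlike SendMmsgData, which borrows the buffer.
struct Prepared {
    len: usize,
}

/// Stand-in for a type implementing the question's SocketMessage trait.
struct Message {
    buf: Vec<u8>,
    prepared: Option<Prepared>,
}

impl Message {
    fn new(buf: Vec<u8>) -> Self {
        Message { buf, prepared: None }
    }

    /// Builds the value on first use and stores it inside the message.
    fn get_prepared(&mut self) -> &Prepared {
        let len = self.buf.len();
        self.prepared.get_or_insert_with(|| Prepared { len })
    }
}

/// Stand-in for an API that wants borrowed items, like sendmmsg.
fn send_all<'a>(items: impl IntoIterator<Item = &'a Prepared>) -> usize {
    items.into_iter().map(|p| p.len).sum()
}

/// The messages themselves act as the buffer; no second Vec is built.
fn demo() -> usize {
    let mut messages = vec![Message::new(vec![1, 2]), Message::new(vec![3])];
    let total = send_all(messages.iter_mut().map(Message::get_prepared));
    total
}
```

Each reference yielded by the iterator points into a message that stays alive in `messages`, which is why no extra collection is required.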
And yes, the better fix would be to open a PR against the nix crate (sendmmsg is not part of the Rust standard library) to change the sendmmsg interface so that it accepts owned values as well as references.