How to implement a blocking iterator over stdin?

Question

I need to implement a long-running program that receives messages via stdin. The protocol defines each message as a length indicator (for simplicity, a 1-byte integer) followed by a string of that length. Messages are NOT separated by any whitespace. The program is expected to consume all messages from stdin and then wait for further messages.
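To make the wire format concrete, here is a minimal sketch of the framing described above. The helper names write_message and read_message are hypothetical (they do not appear in the original code), and the sketch assumes the payload is UTF-8 and at most 255 bytes, since the length indicator is a single byte.

```rust
use std::io::{self, ErrorKind, Read, Write};

/// Hypothetical helper: writes one message as a 1-byte length
/// followed by the UTF-8 bytes of `msg`.
pub fn write_message<W: Write>(out: &mut W, msg: &str) -> io::Result<()> {
    // A 1-byte length indicator cannot describe more than 255 bytes.
    if msg.len() > u8::MAX as usize {
        return Err(io::Error::new(
            ErrorKind::InvalidInput,
            "message longer than 255 bytes",
        ));
    }
    out.write_all(&[msg.len() as u8])?;
    out.write_all(msg.as_bytes())
}

/// Hypothetical helper: reads one message framed by `write_message`.
pub fn read_message<R: Read>(input: &mut R) -> io::Result<String> {
    // Read the 1-byte length indicator.
    let mut len = [0u8; 1];
    input.read_exact(&mut len)?;
    // Read exactly that many payload bytes.
    let mut payload = vec![0u8; len[0] as usize];
    input.read_exact(&mut payload)?;
    String::from_utf8(payload).map_err(|e| io::Error::new(ErrorKind::InvalidData, e))
}
```

Because no separator follows a message, the reader must rely on the length byte alone to find where the next message starts.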

How do I implement such waiting on stdin?

I implemented the iterator so that it tries to read from stdin and retries on error. It works, but it is very inefficient. I would like the iterator to read a message when new data arrives.

My implementation uses read_exact:

use std::io::{Read, stdin, Error as IOError, ErrorKind};

pub struct In<R>(R) where R: Read;

pub trait InStream{
    fn read_one(&mut self) -> Result<String, IOError>;
}

impl <R>In<R> where R: Read{
    pub fn new(stdin: R) -> In<R> {
        In(stdin)
    }
}

impl <R>InStream for In<R> where R: Read{
    /// Read one message from stdin and return it as string
    fn read_one(&mut self) -> Result<String, IOError>{

        const LENGTH_INDICATOR: usize = 1;
        let stdin = &mut self.0;

        // Read the 1-byte length prefix.
        let mut size: [u8; LENGTH_INDICATOR] = [0; LENGTH_INDICATOR];
        stdin.read_exact(&mut size)?;
        let size = u8::from_be_bytes(size) as usize;

        // Read the payload; propagate a read error instead of ignoring it.
        let mut buffer = vec![0u8; size];
        stdin.read_exact(&mut buffer)?;
        String::from_utf8(buffer).map_err(|_| IOError::new(ErrorKind::InvalidData, "not utf8"))
    }
}
impl <R>Iterator for In<R> where R:Read{
    type Item = String;
    fn next(&mut self) -> Option<String>{
        self.read_one()
            .ok()
    }
}

fn main(){
    let mut in_stream = In::new(stdin());
    loop{
        match in_stream.next(){
            Some(x) => println!("x: {:?}", x),
            None => (),
        }
    }
}

I went through the Read and BufReader documentation, but no method seems to solve my problem, as the read documentation contains the following text:

This function does not provide any guarantees about whether it blocks waiting for data, but if an object needs to block for a read and cannot, it will typically signal this via an Err return value.

How do I implement waiting for data on stdin?

=== ===

Edit: a minimal use case that does not block and instead loops, giving an UnexpectedEof error rather than waiting for data:

use std::io::{Read, stdin};
fn main(){
    let stdin = stdin();
    let mut stdin_handle = stdin.lock();
    loop{
        // Try to fill a fresh 4-byte buffer on every iteration.
        let mut buffer = vec![0u8; 4];
        let res = stdin_handle.read_exact(&mut buffer);
        println!("res: {:?}", res);
        println!("buffer: {:?}", buffer);
    }
}

I run it on OSX with cargo run < in, where in is a named pipe. I fill the pipe with echo -n "1234" > in.

It waits for the first input and then it loops.

res: Ok(())
buffer: [49, 50, 51, 52]
res: Err(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" })
buffer: [0, 0, 0, 0]
res: Err(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" })
buffer: [0, 0, 0, 0]
res: Err(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" })
buffer: [0, 0, 0, 0]
res: Err(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" })
buffer: [0, 0, 0, 0]
res: Err(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" })
...

I would like the program to wait until there is sufficient data to fill the buffer.

Answer 1

As others explained, the docs on Read are written very generally and don't apply to standard input, which is blocking. In other words, your code with the buffering added is fine.
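As a rough illustration of "the code with buffering added", the sketch below wraps any reader in a BufReader and relies on read_exact blocking until data arrives. The type name Messages is hypothetical, and the choice to end the iteration on EOF (instead of retrying in a loop) is an assumption about the desired behaviour, not something taken from the question's code.

```rust
use std::io::{self, BufReader, ErrorKind, Read};

/// Hypothetical iterator over 1-byte-length-prefixed UTF-8 messages.
pub struct Messages<R: Read> {
    input: BufReader<R>,
}

impl<R: Read> Messages<R> {
    pub fn new(input: R) -> Self {
        Messages { input: BufReader::new(input) }
    }
}

impl<R: Read> Iterator for Messages<R> {
    type Item = io::Result<String>;

    fn next(&mut self) -> Option<Self::Item> {
        let mut len = [0u8; 1];
        // On a blocking source this waits for the next length byte.
        // EOF at a message boundary is treated as a clean end of stream.
        match self.input.read_exact(&mut len) {
            Ok(()) => {}
            Err(e) if e.kind() == ErrorKind::UnexpectedEof => return None,
            Err(e) => return Some(Err(e)),
        }
        let mut payload = vec![0u8; len[0] as usize];
        // EOF in the middle of a payload means a truncated message: report it.
        if let Err(e) = self.input.read_exact(&mut payload) {
            return Some(Err(e));
        }
        Some(String::from_utf8(payload).map_err(|e| io::Error::new(ErrorKind::InvalidData, e)))
    }
}
```

With this shape, a loop such as for msg in Messages::new(std::io::stdin()) sleeps inside the read while the input is idle and stops once every writer has closed, rather than spinning on repeated UnexpectedEof errors.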

The problem is how you use the pipe. For example, if you run mkfifo foo; cat <foo in one shell and echo -n bla >foo in another, you'll see that the cat in the first shell displays bla and exits. That is because closing the last writer of the pipe sends EOF to the reader, rendering your program's stdin useless.

You can work around the issue by starting another program in the background that opens the pipe in write mode and never exits, for example tail -f /dev/null >pipe-filename. Then echo -n bla >pipe-filename will be observed by your program, but won't cause its stdin to close. The "holding" of the write end of the pipe could probably be achieved from Rust as well.
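One possible way to hold the write end from Rust is sketched below. It assumes a Unix system and that the program is given the FIFO's path and opens it itself, which differs from the cargo run < in setup in the question, where the program only sees stdin. The function name open_fifo_with_keepalive is hypothetical.

```rust
use std::fs::{File, OpenOptions};
use std::io;
use std::path::Path;

/// Hypothetical helper: opens the FIFO at `path` for reading and also
/// returns a write handle to the same FIFO. While the caller keeps that
/// second handle alive, the pipe always has at least one writer, so the
/// reader does not see EOF when an external writer such as `echo` exits.
pub fn open_fifo_with_keepalive(path: &Path) -> io::Result<(File, File)> {
    // Opening a FIFO read-only blocks until some writer opens it.
    let reader = File::open(path)?;
    // A reader now exists, so opening the write end returns immediately.
    let keep_alive = OpenOptions::new().write(true).open(path)?;
    Ok((reader, keep_alive))
}
```

The returned reader can be passed to any code that accepts a Read; the keep-alive handle only needs to stay in scope for as long as messages are expected, and nothing ever has to be written through it.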

Solution 1
1 Accepted 2021-10-20 12:48:06
