What is the most efficient way to read a large file in chunks without loading the entire file in memory at once?
What is the most efficient general-purpose way of reading "large" files (which may be text or binary) without going into unsafe territory? I was surprised by how few relevant results there were when I did a web search for "rust read large file in chunks".
For example, one of my use cases is to calculate an MD5 checksum for a file using rust-crypto (the Md5 module allows you to add &[u8] chunks iteratively).
Here is what I have, which seems to perform slightly better than some other methods like read_to_end:
use std::{
    fs::File,
    io::{self, BufRead, BufReader},
};

fn main() -> io::Result<()> {
    const CAP: usize = 1024 * 128;
    let file = File::open("my.file")?;
    let mut reader = BufReader::with_capacity(CAP, file);

    loop {
        let length = {
            let buffer = reader.fill_buf()?;
            // do stuff with buffer here
            buffer.len()
        };
        if length == 0 {
            break;
        }
        reader.consume(length);
    }

    Ok(())
}
I don't think you can write code more efficient than that. fill_buf on a BufReader over a File is basically just a straight call to read(2).
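One way to check the claim about fill_buf is to count how often the underlying reader is actually asked for data. The sketch below is only an illustration: CountingReader and count_calls are hypothetical names, not part of std, and an in-memory byte slice stands in for the File. It shows that each fill_buf on an empty buffer results in a single read on the wrapped reader; it says nothing about the system call itself.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Hypothetical wrapper that counts how often `read` is called on the inner reader.
struct CountingReader<R> {
    inner: R,
    reads: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.reads += 1;
        self.inner.read(buf)
    }
}

/// Drains `source` through a `BufReader` of the given capacity and returns
/// (number of `fill_buf` calls, number of `read` calls on the inner reader).
fn count_calls<R: Read>(source: R, capacity: usize) -> io::Result<(usize, usize)> {
    let counting = CountingReader { inner: source, reads: 0 };
    let mut reader = BufReader::with_capacity(capacity, counting);
    let mut fills = 0;
    loop {
        // The buffer is always empty here because everything is consumed below.
        let length = reader.fill_buf()?.len();
        fills += 1;
        if length == 0 {
            break; // end of input
        }
        reader.consume(length);
    }
    Ok((fills, reader.get_ref().reads))
}

fn main() -> io::Result<()> {
    // A byte slice implements `Read`; a `File` could be passed the same way.
    let data = [0u8; 10];
    let (fills, reads) = count_calls(&data[..], 4)?;
    println!("fill_buf calls: {fills}, read calls: {reads}");
    Ok(())
}
```

If the two counts match, the BufReader is adding no extra reads in this usage pattern, which is the point being made above.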
That said, BufReader isn't really a useful abstraction when you use it like that; it would probably be less awkward to just call file.read(&mut buf) directly.
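As a rough sketch of what calling read directly could look like, the helper below reads into a reusable buffer and hands each chunk to a closure. The name for_each_chunk and its parameters are made up for this illustration, and the byte sum in main is only a stand-in for whatever consumes the chunks (for example a hasher that accepts &[u8] input).

```rust
use std::io::{self, ErrorKind, Read};

/// Reads `source` in chunks of at most `chunk_size` bytes and passes each
/// chunk to `handle`. Returns the total number of bytes read.
/// (Hypothetical helper, not part of std.)
fn for_each_chunk<R, F>(mut source: R, chunk_size: usize, mut handle: F) -> io::Result<u64>
where
    R: Read,
    F: FnMut(&[u8]),
{
    let mut buf = vec![0u8; chunk_size];
    let mut total = 0u64;
    loop {
        match source.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => {
                // Only the first `n` bytes of the buffer are valid.
                handle(&buf[..n]);
                total += n as u64;
            }
            // A read interrupted by a signal can simply be retried.
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    // In-memory data keeps the example runnable anywhere; a value from
    // `std::fs::File::open(path)?` can be passed in exactly the same way.
    let data = b"hello world";
    let mut sum = 0u32;
    let total = for_each_chunk(&data[..], 4, |chunk| {
        for &b in chunk {
            sum = sum.wrapping_add(b as u32);
        }
    })?;
    println!("read {total} bytes, byte sum {sum}");
    Ok(())
}
```

A read may return fewer bytes than the buffer holds even before the end of the file, which is why the closure only ever sees the first n bytes.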
I did it this way. I don't know whether it is the correct way, but it worked perfectly for me.
use std::fs::File;
use std::io;
use std::io::prelude::*;

fn main() -> io::Result<()> {
    const FNAME: &str = "LargeFile.txt";
    const CHUNK_SIZE: usize = 1024; // bytes requested per loop iteration
    let mut limit: usize = (1024 * 1024) * 15; // maximum number of bytes to read from the file
    let mut f = File::open(FNAME)?;
    let mut buffer = [0; CHUNK_SIZE]; // buffer to contain the bytes

    // Read up to 15 MiB, as the limit suggests.
    while limit > 0 {
        // Never request more than what is left of the limit.
        let want = limit.min(CHUNK_SIZE);
        let n = f.read(&mut buffer[..want])?;
        if n == 0 {
            // End of file reached before the limit.
            break;
        }
        // Only the first n bytes are valid; parse or process them here.
        for &byte in &buffer[..n] {
            print!("{}", byte as char);
        }
        limit -= n;
    }
    Ok(())
}
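An alternative way to enforce such a byte limit, sketched here under the assumption that the limit is the only special requirement, is the standard Read::take adapter, which reports end of input once the given number of bytes has been read. The function name read_limited is made up for this example, and a byte slice stands in for the file.

```rust
use std::io::{self, Read};

/// Reads at most `limit` bytes from `source` in chunks of `chunk_size` bytes
/// and returns how many bytes were processed. (Hypothetical helper.)
fn read_limited<R: Read>(source: R, limit: u64, chunk_size: usize) -> io::Result<u64> {
    // `take` wraps the reader so that it yields at most `limit` bytes.
    let mut limited = source.take(limit);
    let mut buf = vec![0u8; chunk_size];
    let mut total = 0u64;
    loop {
        let n = limited.read(&mut buf)?;
        if n == 0 {
            break; // limit reached or end of input
        }
        // Process `&buf[..n]` here.
        total += n as u64;
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    // In-memory stand-in; `std::fs::File::open("LargeFile.txt")?` could be used instead.
    let data = [7u8; 100];
    let processed = read_limited(&data[..], 30, 8)?;
    println!("processed {processed} bytes");
    Ok(())
}
```

With this approach the loop no longer has to track the remaining limit by hand.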