Iterating over lines in a file and looking for substring from a vec! in Rust

I'm writing a project in which a struct System can be constructed from a data file. In the data file, some lines contain keywords that indicate values to be read either from the line itself or from the N following lines (separated from the keyword line by a blank line).

I would like to have a vec! containing the keywords (statically known at compile time), check whether the line returned by the iterator contains a keyword, and perform the appropriate operations.

Now my code looks like this:

impl System {
    fn read_data<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>> where P: AsRef<Path> {
        let file = File::open(filename)?;
        let f = BufReader::new(file);
        Ok(f.lines())
    }
    ...
    pub fn new_from_data<P>(dataname: P) -> System where P: AsRef<Path> {
        let keywd = vec!["atoms", "atom types".into(),
                         "Atoms".into()];
        let mut sys = System::new();
        if let Ok(mut lines) = System::read_data(dataname) {
            while let Some(line) = lines.next() {
                for k in keywd {
                    let split: Vec<&str> = line.unwrap().split(" ").collect();
                    if split.contains(k) {
                        match k {
                        "atoms" => sys.natoms = split[0].parse().unwrap(),
                        "atom types" => sys.ntypes = split[0].parse().unwrap(),
                        "Atoms" => {
                            lines.next();
                            // assumes fields are: atom-ID molecule-ID atom-type q x y z
                            for _ in 1..=sys.natoms {
                                let atline = lines.next().unwrap().unwrap();
                                let data: Vec<&str> = atline.split(" ").collect();
                                let atid: i32 = data[0].parse().unwrap();
                                let molid: i32 = data[1].parse().unwrap();
                                let atype: i32 = data[2].parse().unwrap();
                                let charge: f32 = data[3].parse().unwrap();
                                let x: f32 = data[4].parse().unwrap();
                                let y: f32 = data[5].parse().unwrap();
                                let z: f32 = data[6].parse().unwrap();
                                let at = Atom::new(atid, molid, atype, charge, x, y, z);
                                sys.atoms.push(at);
                            };
                        },
                        _ => (),
                        }
                    }
                }
            }
        }
        sys
    }
}

I'm very unsure about two points:

  1. I don't know if I handled the line-by-line reading of the file in an idiomatic way, as I adapted some examples from the book and Rust by Example. But returning an iterator makes me wonder when and how to unwrap the results. For example, when calling the iterator inside the while loop, do I have to unwrap twice, as in let atline = lines.next().unwrap().unwrap(); ? I think the compiler does not complain yet only because of the first error it encounters, which is the one in point 2.
  2. I cannot wrap my head around the type to give to the value k, as I get a typical:
error[E0308]: mismatched types
 --> src/system/system.rs:65:39
  |
65 |                     if split.contains(k) {
  |                                       ^ expected `&str`, found `str`
  |
  = note: expected reference `&&str`
             found reference `&str`

error: aborting due to previous error

How are we supposed to declare the substring and compare it to the strings I put in keywd? I tried to dereference k in contains, tell it to look at &keywd, etc., but I feel I'm wasting my time by not properly addressing the problem. Thanks in advance, any help is appreciated.

Let's go through the issues one by one, in the order they appear in the code.

First, you need to borrow keywd in the for loop, i.e. iterate over &keywd. Otherwise keywd is moved into the for loop during the first iteration of the while loop and is no longer available for the next line, which is why the compiler complains.

for k in &keywd {
    let split: Vec<&str> = line.unwrap().split(" ").collect();
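
To make the borrow concrete, here is a minimal sketch. The helper name count_hits and its slice-of-lines input are made up for illustration and are not part of the original code. It only shows that iterating over &keywd leaves the Vec usable for every line.

```rust
/// Counts how many (line, keyword) pairs match as whole tokens.
/// `count_hits` is a hypothetical helper used only for illustration.
fn count_hits(lines: &[&str]) -> usize {
    let keywd = vec!["atoms", "Atoms"];
    let mut hits = 0;
    for line in lines {
        let split: Vec<&str> = line.split(' ').collect();
        // `&keywd` borrows the Vec, so `k` has type `&&str` and `keywd`
        // is still available on the next pass of the outer loop.
        // `for k in keywd` would instead move the Vec on the first pass.
        for k in &keywd {
            if split.contains(k) {
                hits += 1;
            }
        }
    }
    hits
}
```

Calling count_hits(&["10 atoms", "Atoms"]) walks the keyword list once per line without any move error.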

Next, calling .unwrap() on line is the same problem: it consumes the Result and moves the inner Ok value out of it. Instead, you can write line.as_ref().unwrap(), which gives you a reference to the inner Ok value without consuming line.

Alternatively, you can call .filter_map(Result::ok) on your lines to avoid (.as_ref()).unwrap() altogether. Note that this silently skips any line that failed to read.

You can add that directly to read_data and even simplify the return type using impl ... .
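
As a rough sketch of the as_ref() approach: matching_keywords below is a hypothetical function, not taken from the question. It keeps the per-keyword split from the original loop only to show that line can now be inspected repeatedly. It still panics if the line is an Err.

```rust
use std::io;

/// Returns every keyword that appears as a space-separated token in `line`.
/// `matching_keywords` is a hypothetical helper used only for illustration.
fn matching_keywords(line: io::Result<String>, keywd: &[&'static str]) -> Vec<&'static str> {
    let mut found = Vec::new();
    for k in keywd {
        // `as_ref()` converts `&Result<String, io::Error>` into
        // `Result<&String, &io::Error>`, so `unwrap()` yields a `&String`
        // and `line` itself is not consumed by this iteration.
        let split: Vec<&str> = line.as_ref().unwrap().split(' ').collect();
        if split.contains(k) {
            found.push(*k);
        }
    }
    found
}
```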

fn read_data<P>(filename: P) -> io::Result<impl Iterator<Item = String>>
where
    P: AsRef<Path>,
{
    let file = File::open(filename)?;
    let f = BufReader::new(file);
    Ok(f.lines().filter_map(Result::ok))
}
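
The same adapter works on any BufRead, which makes it easy to try out without a file. The sketch below uses two hypothetical helpers, ok_lines and all_lines, which are not part of the original answer. The second one is an alternative that reports the first read error to the caller instead of dropping it.

```rust
use std::io::{self, BufRead};

/// Yields only the lines that were read successfully; read errors are dropped.
/// Hypothetical helper mirroring the `filter_map(Result::ok)` idea above.
fn ok_lines<R: BufRead>(reader: R) -> impl Iterator<Item = String> {
    reader.lines().filter_map(Result::ok)
}

/// Alternative sketch: stop at the first read error and return it.
/// Collecting an iterator of `io::Result<String>` into `io::Result<Vec<String>>`
/// short-circuits on the first `Err`.
fn all_lines<R: BufRead>(reader: R) -> io::Result<Vec<String>> {
    reader.lines().collect()
}
```

A byte slice implements BufRead, so ok_lines("10 atoms\n\nAtoms\n".as_bytes()) can stand in for a real file while experimenting.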

Note that you're splitting line once for every keyword in keywd, which is needless. You can move the split outside of the for loop as well.

All in all, it ends up looking like this:

if let Ok(mut lines) = read_data("test.txt") {
    while let Some(line) = lines.next() {
        let split: Vec<&str> = line.split(" ").collect();
        for k in &keywd {
            if split.contains(k) {
                ...

Since we now iterate over &keywd, there is no need to change k to &k: k is already a &&str, which is what split.contains expects.
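
One thing to watch for with this approach: after splitting on spaces, no single token can equal a keyword that itself contains a space, so "atom types" would never match. If substring matching on the whole line is acceptable for your data format (an assumption on my part), a sketch could look like this. Both find_keyword and has_token are hypothetical helpers.

```rust
/// Token-based check, equivalent to `split.contains(&keyword)`.
fn has_token(line: &str, keyword: &str) -> bool {
    line.split(' ').any(|tok| tok == keyword)
}

/// Returns the first keyword found as a substring of `line`, if any.
fn find_keyword<'k>(line: &str, keywd: &[&'k str]) -> Option<&'k str> {
    // `iter()` yields `&&str`; `copied()` turns that into plain `&str` items.
    // `str::contains` takes a `&str` pattern, hence the `*k` in the closure.
    keywd.iter().copied().find(|k| line.contains(*k))
}
```

With the keywords from the question, find_keyword("2 atom types", &keywd) finds "atom types", whereas has_token("2 atom types", "atom types") is false. The substring check is case-sensitive, so "Atoms" and "atoms" stay distinct.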
