
Iterating over lines in a file and looking for a substring from a vec! in Rust

I'm writing a project in which a struct System can be constructed from a data file. In the data file, some lines contain keywords that indicate values to be read either from the line itself or from the N lines that follow it (separated from the keyword line by a blank line).

I would like to have a vec! containing the keywords (statically known at compile time), check whether the line returned by the iterator contains a keyword, and perform the appropriate operations.

Now my code looks like this:

impl System {
    fn read_data<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>> where P: AsRef<Path> {
        let file = File::open(filename)?;
        let f = BufReader::new(file);
        Ok(f.lines())
    }
    ...
    pub fn new_from_data<P>(dataname: P) -> System where P: AsRef<Path> {
        let keywd = vec!["atoms", "atom types".into(),
                         "Atoms".into()];
        let mut sys = System::new();
        if let Ok(mut lines) = System::read_data(dataname) {
            while let Some(line) = lines.next() {
                for k in keywd {
                    let split: Vec<&str> = line.unwrap().split(" ").collect();
                    if split.contains(k) {
                        match k {
                        "atoms" => sys.natoms = split[0].parse().unwrap(),
                        "atom types" => sys.ntypes = split[0].parse().unwrap(),
                        "Atoms" => {
                            lines.next();
                            // assumes fields are: atom-ID molecule-ID atom-type q x y z
                            for _ in 1..=sys.natoms {
                                let atline = lines.next().unwrap().unwrap();
                                let data: Vec<&str> = atline.split(" ").collect();
                                let atid: i32 = data[0].parse().unwrap();
                                let molid: i32 = data[1].parse().unwrap();
                                let atype: i32 = data[2].parse().unwrap();
                                let charge: f32 = data[3].parse().unwrap();
                                let x: f32 = data[4].parse().unwrap();
                                let y: f32 = data[5].parse().unwrap();
                                let z: f32 = data[6].parse().unwrap();
                                let at = Atom::new(atid, molid, atype, charge, x, y, z);
                                sys.atoms.push(at);
                            };
                        },
                        _ => (),
                        }
                    }
                }
            }
        }
        sys
    }
}

I'm very unsure about two points:

  1. I don't know whether I handle the line-by-line reading of the file idiomatically, since I adapted examples from the book and Rust by Example. Returning an iterator makes me wonder when and how to unwrap the results. For example, when advancing the iterator inside the while loop, do I have to unwrap twice, as in let atline = lines.next().unwrap().unwrap(); ? I suspect the compiler has not complained about this yet only because it stops at the first error it encounters, described in the next point.
  2. I cannot wrap my head around the type to give to the value k, as I get a typical:
error[E0308]: mismatched types
 --> src/system/system.rs:65:39
  |
65 |                     if split.contains(k) {
  |                                       ^ expected `&str`, found `str`
  |
  = note: expected reference `&&str`
             found reference `&str`

error: aborting due to previous error

How are we supposed to declare the substring and compare it to the strings I put in keywd? I tried to dereference k in contains, to iterate over &keywd, etc., but I feel I'm wasting my time by not properly addressing the problem. Thanks in advance, any help is appreciated.

Let's go through the issues one by one, in the order they appear in the code.

First, you need to borrow keywd in the for loop, i.e. &keywd. Otherwise keywd is moved into the for loop during the first iteration of the while loop and is no longer available on the next one, which is what the compiler complains about.

for k in &keywd {
    let split: Vec<&str> = line.unwrap().split(" ").collect();
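
As a small self-contained illustration of the borrow, here is a sketch using a hypothetical helper, count_keyword_hits (not part of the original code), that loops over several lines while reusing the same keyword vector. Iterating over &keywd yields items of type &&str, which is exactly what contains expects on a Vec<&str>.

```rust
/// Hypothetical helper: counts how many (line, keyword) pairs match
/// when each line is split on single spaces.
fn count_keyword_hits(lines: &[&str]) -> usize {
    let keywd = vec!["atoms", "Atoms"];
    let mut hits = 0;
    for line in lines {
        let split: Vec<&str> = line.split(' ').collect();
        // `&keywd` only borrows the vector, so it is still available on the
        // next iteration of the outer loop. Each `k` has type `&&str`.
        for k in &keywd {
            if split.contains(k) {
                hits += 1;
            }
        }
    }
    hits
}

fn main() {
    let lines = ["10 atoms", "Atoms", "nothing here"];
    println!("{}", count_keyword_hits(&lines));
}
```

Writing for k in keywd instead would move the vector on the first pass through the outer loop, and the second pass would fail to compile.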

Next, calling .unwrap() on line has the same problem: it moves the inner Ok value out of the Result, consuming line. Instead you can write line.as_ref().unwrap(), which gives you a reference to the inner Ok value without consuming the Result.
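
A minimal sketch of the as_ref() idea, assuming a hypothetical function count_hits that receives one line as an io::Result<String> (the item type produced by BufRead::lines):

```rust
use std::io;

/// Hypothetical helper: counts how many keywords occur as space-separated
/// tokens in `line`. Panics if `line` is an `Err`, like the original code.
fn count_hits(line: io::Result<String>, keywd: &[&str]) -> usize {
    let mut hits = 0;
    for k in keywd {
        // `as_ref()` produces a `Result<&String, &io::Error>`, so `unwrap()`
        // hands out a borrow and `line` can be used again on the next pass.
        let text: &String = line.as_ref().unwrap();
        if text.split(' ').any(|tok| tok == *k) {
            hits += 1;
        }
    }
    hits
}

fn main() {
    let line: io::Result<String> = Ok("10 atoms".to_string());
    println!("{}", count_hits(line, &["atoms", "Atoms"]));
}
```

With a plain line.unwrap() inside the loop, the second iteration would try to use a value that was already moved.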

Alternatively, you can call .filter_map(Result::ok) on your lines iterator to avoid (.as_ref()).unwrap() altogether.

You can add that directly to read_data and even simplify the return type using impl Trait.

fn read_data<P>(filename: P) -> io::Result<impl Iterator<Item = String>>
where
    P: AsRef<Path>,
{
    let file = File::open(filename)?;
    let f = BufReader::new(file);
    Ok(f.lines().filter_map(Result::ok))
}
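
One caveat with filter_map(Result::ok) is that a line that fails to read is silently skipped. If a read error should abort the parse instead, one option is to collect the lines into an io::Result<Vec<String>>, which stops at the first error. The sketch below uses a hypothetical helper, read_all_lines, and an in-memory byte slice in place of a file so that it runs on its own:

```rust
use std::io::{self, BufRead};

/// Hypothetical helper: reads every line from `reader`, returning the
/// first I/O error instead of skipping the offending line.
fn read_all_lines<R: BufRead>(reader: R) -> io::Result<Vec<String>> {
    // Collecting an iterator of `io::Result<String>` into
    // `io::Result<Vec<String>>` short-circuits on the first `Err`.
    reader.lines().collect()
}

fn main() -> io::Result<()> {
    // `&[u8]` implements `BufRead`, so no file is needed for the demo.
    let data = "3 atoms\n\nAtoms\n";
    let lines = read_all_lines(data.as_bytes())?;
    println!("{:?}", lines);
    Ok(())
}
```

For a real file, the same function could be called with BufReader::new(File::open(filename)?), at the cost of holding all lines in memory at once.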

Note that you're splitting line once for every keyword in keywd, which is needless work. You can move the split outside of the for loop as well.

All in all, it ends up looking like this:

if let Ok(mut lines) = read_data("test.txt") {
    while let Some(line) = lines.next() {
        let split: Vec<&str> = line.split(" ").collect();
        for k in &keywd {
            if split.contains(k) {
                ...

Since we now iterate over &keywd, k is already a &&str, so there is no need to write &k in the call to contains.
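
For completeness, here is a rough end-to-end sketch under several assumptions that are not confirmed by the question: the System and Atom definitions are simplified stand-ins, the input is any iterator of String lines, and fields are split on whitespace. Two details differ from the snippets above. First, because k is a &&str, the match is written on *k. Second, a multi-word keyword such as "atom types" can never equal a single token of the split line, so this sketch swaps in a substring test (str::contains on the whole line) for the keyword check.

```rust
// Simplified stand-ins for the asker's types (assumptions).
#[derive(Debug)]
struct Atom { id: i32, mol: i32, atype: i32, charge: f32, x: f32, y: f32, z: f32 }

#[derive(Debug, Default)]
struct System { natoms: usize, ntypes: usize, atoms: Vec<Atom> }

/// Hypothetical parser; panics on malformed input, like the original code.
fn parse_system<I: Iterator<Item = String>>(mut lines: I) -> System {
    let keywd = vec!["atoms", "atom types", "Atoms"];
    let mut sys = System::default();
    while let Some(line) = lines.next() {
        // Split once per line, not once per keyword.
        let split: Vec<&str> = line.split_whitespace().collect();
        for k in &keywd {
            // Substring test, so that "atom types" can match.
            if !line.contains(*k) {
                continue;
            }
            match *k {
                "atoms" => sys.natoms = split[0].parse().unwrap(),
                "atom types" => sys.ntypes = split[0].parse().unwrap(),
                "Atoms" => {
                    lines.next(); // skip the blank line after the header
                    // assumes fields are: atom-ID molecule-ID atom-type q x y z
                    for _ in 0..sys.natoms {
                        let atline = lines.next().unwrap();
                        let d: Vec<&str> = atline.split_whitespace().collect();
                        sys.atoms.push(Atom {
                            id: d[0].parse().unwrap(),
                            mol: d[1].parse().unwrap(),
                            atype: d[2].parse().unwrap(),
                            charge: d[3].parse().unwrap(),
                            x: d[4].parse().unwrap(),
                            y: d[5].parse().unwrap(),
                            z: d[6].parse().unwrap(),
                        });
                    }
                }
                _ => (),
            }
        }
    }
    sys
}

fn main() {
    let data = "2 atoms\n1 atom types\n\nAtoms\n\n1 1 1 -0.5 0.0 0.0 0.0\n2 1 1 0.5 1.0 2.0 3.0\n";
    let sys = parse_system(data.lines().map(String::from));
    println!("{:?}", sys);
}
```

The substring test happens to work for these three keywords because none of them is contained in another; a keyword set with overlapping names would need a stricter comparison.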
