
How to append to string values in a hash table in Rust?

I have source files that contain text CSV lines for many products for a given day. I want to use Rust to collate these files so that I end up with many new destination CSV files, one per product, each containing portions of the lines only specific to that product.

My current solution is to loop over the lines of the source files and use a HashMap<String, String> to gather the lines for each product. I split each source line and use the element containing the product ID as the key to obtain an Entry (occupied or vacant) in my HashMap. If it is vacant, I initialize the value with a new String allocated up front with a given capacity, so that I can append to it efficiently afterwards.

// so far, so good (the first CSV item is the product ID)
let mystringval = productmap.entry(splitsource[0].to_owned()).or_insert(String::with_capacity(SOME_CAPACITY));

I then want to append formatted elements of the same source line to this Entry. There are many examples online, such as https://doc.rust-lang.org/std/collections/hash_map/struct.HashMap.html#method.entry, of how to make this work if the HashMap value is an integer:

// this works if you obtain an Entry from a HashMap containing int vals
*myval += 1;

I haven't figured out how to append more text to the Entry I obtain from my HashMap<String, String> using this kind of syntax, and I've done my best to research examples online. There are surprisingly few examples anywhere of manipulating non-numeric entries in Rust data structures.

// using the Entry obtained from my first code snippet above
*mystringval.push_str(sourcePortion.as_str());

Attempting to compile this produces the following error:

error: type `()` cannot be dereferenced
   --> coll.rs:102:17
    |
102 |                 *mystringval.push_str(sourcePortion.as_str());
    |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

How can I append to a String inside the Entry value?

If you inspect the type returned by or_insert:

use std::collections::HashMap;

fn update_count(map: &mut HashMap<&str, u32>) {
    let () = map.entry("hello").or_insert(0);
}

You will see it is a mutable reference:

error[E0308]: mismatched types
 --> src/main.rs:4:9
  |
4 |     let () = map.entry("hello").or_insert(0);
  |         ^^ expected &mut u32, found ()
  |
  = note: expected type `&mut u32`
             found type `()`

That means that you can call any method that needs a &mut self receiver with no extra syntax:

fn update_mapping(map: &mut HashMap<&str, String>) {
    map.entry("hello").or_insert_with(String::new).push_str("wow")
}
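
As a minimal runnable sketch of the same idea: the helper name append_to and the sample strings below are made up for illustration and are not from the question.

```rust
use std::collections::HashMap;

// Hypothetical helper: appends `text` to the String stored under `key`,
// inserting an empty String first if the key is not present yet.
fn append_to(map: &mut HashMap<String, String>, key: &str, text: &str) {
    // or_insert_with returns `&mut String`, so push_str can be called on it
    // directly; no explicit dereference is needed.
    map.entry(key.to_owned())
        .or_insert_with(String::new)
        .push_str(text);
}

fn main() {
    let mut map = HashMap::new();
    append_to(&mut map, "hello", "wow");
    append_to(&mut map, "hello", ", again");
    println!("{}", map["hello"]); // prints "wow, again"
}
```

The second call finds the entry occupied, so it only appends to the existing String.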

Turning back to the integer form, what happens if we don't put the dereference?

use std::collections::HashMap;

fn update_count(map: &mut HashMap<&str, i32>) {
    map.entry("hello").or_insert(0) += 1;
}

error[E0368]: binary assignment operation `+=` cannot be applied to type `&mut i32`
 --> src/main.rs:4:5
  |
4 |     map.entry("hello").or_insert(0) += 1;
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cannot use `+=` on type `&mut i32`

error[E0067]: invalid left-hand side expression
 --> src/main.rs:4:5
  |
4 |     map.entry("hello").or_insert(0) += 1;
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ invalid expression for left-hand side

The difference is that the += operator automatically takes a mutable reference to the left-hand side of the expression. Expanded, it might look something like this:

use std::ops::AddAssign;

fn update_count(map: &mut HashMap<&str, i32>) {
    AddAssign::add_assign(&mut map.entry("hello").or_insert(0), 1);
}

Adding the explicit dereference brings the types back to one that has the trait implemented:

use std::ops::AddAssign;

fn update_count(map: &mut HashMap<&str, i32>) {
    AddAssign::add_assign(&mut (*map.entry("hello").or_insert(0)), 1);
}
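
To check that the operator form and the expanded form really behave the same, here is a small sketch that puts them side by side. The two function names are invented for this comparison.

```rust
use std::collections::HashMap;
use std::ops::AddAssign;

// Operator form: dereference the `&mut i32`, then apply `+=` to the i32.
fn update_count_operator(map: &mut HashMap<&str, i32>) {
    *map.entry("hello").or_insert(0) += 1;
}

// Roughly what the operator form expands to: AddAssign is implemented
// for i32, so the receiver passed in must be a `&mut i32`.
fn update_count_expanded(map: &mut HashMap<&str, i32>) {
    AddAssign::add_assign(&mut (*map.entry("hello").or_insert(0)), 1);
}

fn main() {
    let mut map: HashMap<&str, i32> = HashMap::new();
    update_count_operator(&mut map);
    update_count_expanded(&mut map);
    println!("{}", map["hello"]); // prints 2
}
```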

*mystringval.push_str(sourcePortion.as_str()); is parsed as *(mystringval.push_str(sourcePortion.as_str())); because a method call binds more tightly than the unary * operator. Since String::push_str returns (), you get the "type `()` cannot be dereferenced" error.

Using parentheses around the dereference solves the precedence issue:

(*mystringval).push_str(sourcePortion.as_str());

The reason *myval += 1 works is that unary * has higher precedence than +=, so it is parsed as

(*myval) += 1

Since or_insert returns &mut V, you don't need to dereference it before calling its methods, because method calls dereference the receiver automatically. The following also works:

mystringval.push_str(sourcePortion.as_str());
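
Putting it together for the collation problem in the question, a rough sketch might look like the following. It assumes comma-separated lines whose first field is the product ID. The function name collate, the capacity value, the newline appended after each portion, and the choice to skip lines without a comma are all assumptions made for illustration.

```rust
use std::collections::HashMap;

// Assumed initial capacity for each product's buffer; tune for real data.
const SOME_CAPACITY: usize = 1024;

// Hypothetical helper: groups the remainder of each CSV line under its
// product ID (the first field), one newline-terminated portion per line.
fn collate<'a, I>(lines: I) -> HashMap<String, String>
where
    I: IntoIterator<Item = &'a str>,
{
    let mut productmap: HashMap<String, String> = HashMap::new();
    for line in lines {
        // Lines without a comma are skipped here; a real tool might
        // report them instead.
        if let Some((product_id, rest)) = line.split_once(',') {
            let buf = productmap
                .entry(product_id.to_owned())
                // The closure runs only when the entry is vacant.
                .or_insert_with(|| String::with_capacity(SOME_CAPACITY));
            buf.push_str(rest);
            buf.push('\n');
        }
    }
    productmap
}

fn main() {
    let source = ["A,1,2", "B,3,4", "A,5,6"];
    let productmap = collate(source.iter().copied());
    println!("{:?}", productmap.get("A"));
}
```

Note that or_insert(String::with_capacity(SOME_CAPACITY)) evaluates its argument on every call, even when the key already exists, whereas or_insert_with only allocates when the entry is vacant.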
