This program counts the number of occurrences of each word in a string. Every test passes except the ones where words = "Joe can't tell between 'large' and large." or words = "First: don't laugh. Then: don't cry.". If I remove !c.is_alphanumeric() from the split closure, I would have to list every single special character on which the words should be split.
This is a beginner-level exercise on Exercism, so I wanted to avoid the regex crate.
use std::collections::HashMap;

pub fn word_count(words: &str) -> HashMap<String, u32> {
    let mut indexes: HashMap<String, u32> = HashMap::new();
    let to_lowercase = words.to_lowercase();
    for c in to_lowercase
        .split(|c: char| !c.is_alphanumeric())
        .filter(|&x| x != "")
        .collect::<Vec<&str>>()
    {
        let entry = indexes.entry(c.to_string()).or_insert(0);
        *entry += 1;
    }
    indexes
}
Some tests:
fn check_word_count(s: &str, pairs: &[(&str, u32)]) {
    // The reason for the awkward code in here is to ensure that the failure
    // message for assert_eq! is as informative as possible. A simpler
    // solution would simply check the length of the map, and then
    // check for the presence and value of each key in the given pairs vector.
    let mut m: HashMap<String, u32> = word_count(s);
    for &(k, v) in pairs.iter() {
        assert_eq!((k, m.remove(&k.to_string()).unwrap_or(0)), (k, v));
    }
    // may fail with a message that clearly shows all extra pairs in the map
    assert_eq!(m.iter().collect::<Vec<(&String, &u32)>>(), vec![]);
}
#[test]
fn with_apostrophes() {
    check_word_count(
        "First: don't laugh. Then: don't cry.",
        &[
            ("first", 1),
            ("don't", 2),
            ("laugh", 1),
            ("then", 1),
            ("cry", 1),
        ],
    );
}
#[test]
#[ignore]
fn with_quotations() {
    check_word_count(
        "Joe can't tell between 'large' and large.",
        &[
            ("joe", 1),
            ("can't", 1),
            ("tell", 1),
            ("between", 1),
            ("large", 2),
            ("and", 1),
        ],
    );
}
I suppose it depends on how the rules define a "word". If you simply treat the single quote ' as one of the characters that does not cause a word split, then contractions such as don't and can't are kept as whole words.
The following code prevents a split on a single-quote:
let single_quote: char = '\'';
....
split( |c: char| !c.is_alphanumeric() && c != single_quote)
This will see 'large' as a word distinct from large, which might not be what you want, but again, the rules are not clear.
Here is my full program:
use std::collections::HashMap;

pub fn word_count(words: &str) -> HashMap<String, u32> {
    let mut indexes: HashMap<String, u32> = HashMap::new();
    let to_lowercase = words.to_lowercase();
    let single_quote: char = '\'';
    for c in to_lowercase
        .split(|c: char| !c.is_alphanumeric() && c != single_quote)
        .filter(|x| !x.is_empty())
        .collect::<Vec<&str>>()
    {
        let entry = indexes.entry(c.to_string()).or_insert(0);
        *entry += 1;
    }
    indexes
}

fn main() {
    let phrase = "Joe can't tell between 'large' and large.";
    let indices = word_count(phrase);
    println!("Phrase: {}", phrase);
    for (word, index) in indices {
        println!("word: {}, count: {}", word, index);
    }
}
Here is the output from my main() routine. HashMap iteration order is unspecified, so the lines may appear in a different order on another run:
Phrase: Joe can't tell between 'large' and large.
word: joe, count: 1
word: can't, count: 1
word: 'large', count: 1
word: and, count: 1
word: between, count: 1
word: tell, count: 1
word: large, count: 1
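If the tests expect 'large' and large to count as the same word (the with_quotations test above expects ("large", 2)), one possible approach is to keep the apostrophe while splitting and then strip any apostrophes left at the start or end of each token. The following is only a sketch under that assumption, not necessarily the intended Exercism solution. It uses str::trim_matches from the standard library, and the variable names counts and lowercase are mine.

```rust
use std::collections::HashMap;

// Sketch: keep apostrophes inside words, then strip any that remain
// at the start or end of a token (that is, surrounding quotation marks).
pub fn word_count(words: &str) -> HashMap<String, u32> {
    let mut counts: HashMap<String, u32> = HashMap::new();
    let lowercase = words.to_lowercase();
    for word in lowercase
        // Split on anything that is neither alphanumeric nor an apostrophe.
        .split(|c: char| !c.is_alphanumeric() && c != '\'')
        // Turn "'large'" into "large"; "can't" is left untouched.
        .map(|w| w.trim_matches('\''))
        // Drop empty tokens produced by adjacent separators or lone quotes.
        .filter(|w| !w.is_empty())
    {
        *counts.entry(word.to_string()).or_insert(0) += 1;
    }
    counts
}
```

Note that the filter has to come after the trim, because a token consisting only of apostrophes becomes empty once trimmed. One limitation of this sketch: a trailing possessive apostrophe, as in dogs', is also stripped, so whether that is acceptable again depends on how the rules define a word.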