I'm trying to do an exercise where I need to create a class to read the words from a .txt put the words in an HashSet. The thing is, if the text read "I am Daniel, Daniel I am." I'll have a word for "am" , "am." and "Daniel," and "Daniel". How do I fix this?
Here's my code. (I tried to use regex, but I'm getting an exception):
import java.io.File;
import java.io.FileNotFoundException;
import java.util.HashSet;
import java.util.Scanner;
public class WordCount {
public static void main(String[] args) {
try {
File file = new File(args[0]);
HashSet<String> set = readFromFile(file);
set.forEach(word -> System.out.println(word));
}
catch(FileNotFoundException e) {
System.err.println("File Not Found!");
}
}
private static HashSet<String> readFromFile(File file) throws FileNotFoundException {
HashSet<String> set = new HashSet<String>();
Scanner scanner = new Scanner(file);
while(scanner.hasNext()) {
String s = scanner.next("[a-zA-Z]");
set.add(s.toUpperCase());
}
scanner.close();
return set;
}
}
Error is thrown when the Scanner try to read a string not matching with the regex.
String s = scanner.next("[a-zA-Z]");
Instead of passing the regex in the Scanner. Read the word and remove the special characters as shown below.
String s = scanner.next();
s = s.replaceAll("[^a-zA-Z]", "");
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.