The question below is about Java.
Sample data: https://tartarus.org/martin/PorterStemmer/output.txt
I have a String array, tokenizationString, that contains words similar to the list above, with many duplicates.
I have to convert that String array into a HashMap and then use the HashMap to count the number of times each word is used (that is, count the duplicated values in the array, but using HashMap-related methods).
I am thinking of doing it this way:
    Map<Integer, String> hashMap = new HashMap<Integer, String>();
    for (int i = 0; i < tokenizationString.length; i++) {
        hashMap.put(i, tokenizationString[i]);
    }
After that I have to sort the words by the number of times they are used.
In the end I want to be able to print the result like this:
the "was used" 502 "times"
i "was used" 50342 "times"
apple "was used" 50 "times"
Firstly, your map should be a Map<String, Integer> (the string and its frequency). Here is a Java 8 stream solution. Note that Collectors.counting() produces a Long, so the map in the code is a Map<String, Long>.
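    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.Comparator;
    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.function.Function;
    import java.util.stream.Collectors;
    import java.util.stream.Stream;

    public static void main(String[] args) {
        // Each line of out.txt is treated as one word.
        try (Stream<String> lines = Files.lines(Paths.get("out.txt"))) {
            Map<String, Long> frequency = lines
                    .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                    .entrySet()
                    .stream()
                    // Most frequent words first.
                    .sorted(Map.Entry.comparingByValue(Comparator.reverseOrder()))
                    .collect(Collectors.toMap(
                            Map.Entry::getKey,
                            Map.Entry::getValue,
                            (o, n) -> o,
                            LinkedHashMap::new // keeps the sorted order
                    ));
            frequency.forEach((word, count) ->
                    System.out.println(word + " was used " + count + " times"));
        } catch (IOException e) {
            e.printStackTrace();
        }
    }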
The code above reads the file line by line and collects the lines into a frequency map. It then streams the map's entrySet, sorts the entries by value in descending order, and finally collects them into a LinkedHashMap. A LinkedHashMap is used because it maintains insertion order, so the sorted order is preserved. Take a look at the Java 8 Stream API.
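Since the question starts from a String array rather than a file, the same pipeline can be sketched over an in-memory array. This is only an illustration: the class name WordFrequencyStream, the helper sortedFrequencies, and the sample words are made up here and do not come from the original post.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class WordFrequencyStream {

    // Hypothetical helper: counts each word, then returns the counts
    // ordered from most frequent to least frequent.
    static Map<String, Long> sortedFrequencies(String[] words) {
        return Arrays.stream(words)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                .entrySet()
                .stream()
                .sorted(Map.Entry.comparingByValue(Comparator.reverseOrder()))
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        Map.Entry::getValue,
                        (o, n) -> o,
                        LinkedHashMap::new)); // preserves the sorted order
    }

    public static void main(String[] args) {
        // Sample data standing in for the asker's tokenizationString array.
        String[] tokenizationString = {"the", "i", "the", "apple", "i", "the"};
        sortedFrequencies(tokenizationString).forEach((word, count) ->
                System.out.println(word + " was used " + count + " times"));
    }
}
```

Words with equal counts come out in no particular order, because the intermediate map built by groupingBy is unordered.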
Instead of

    hashMap.put(i, tokenizationString[i]);

first check whether the word is already present, and then increment the corresponding entry (this assumes hashMap is declared as Map<String, Integer>):

    int count = hashMap.containsKey(tokenizationString[i]) ? hashMap.get(tokenizationString[i]) : 0;
    hashMap.put(tokenizationString[i], count + 1);
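Putting the counting step together with the sorting and printing the question asks for, a minimal sketch could look like the following. The class name WordCountLoop, the helpers countWords and sortByCount, and the sample array are assumptions made for illustration, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WordCountLoop {

    // Hypothetical helper: builds word -> count using only HashMap methods.
    static Map<String, Integer> countWords(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            // Start from 0 when the word has not been seen yet.
            int count = counts.containsKey(word) ? counts.get(word) : 0;
            counts.put(word, count + 1);
        }
        return counts;
    }

    // Hypothetical helper: copies the entries into a list sorted by count, highest first.
    static List<Map.Entry<String, Integer>> sortByCount(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        return entries;
    }

    public static void main(String[] args) {
        // Sample data standing in for the asker's tokenizationString array.
        String[] tokenizationString = {"the", "i", "the", "apple", "i", "the"};
        for (Map.Entry<String, Integer> e : sortByCount(countWords(tokenizationString))) {
            System.out.println(e.getKey() + " was used " + e.getValue() + " times");
        }
    }
}
```

A HashMap has no defined order, so the entries are copied into a list before sorting. On Java 8 and later, the two lines inside the loop can also be written as counts.merge(word, 1, Integer::sum).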
You can achieve this with the Multimap class from Google's Guava library, as shown below. A working example is also available at this link: https://gist.github.com/dkalawadia/8d06fba1c2c87dd94ab3e803dff619b0
    // Requires Guava: com.google.common.collect.ArrayListMultimap and Multimap.
    // try-with-resources closes the reader (and the underlying stream) automatically.
    try (BufferedReader br = new BufferedReader(
            new InputStreamReader(new FileInputStream("C:\\temp\\output.txt")))) {
        String strLine;
        Multimap<String, String> multimap = ArrayListMultimap.create();
        // Read the file line by line; each line is stored under itself as the key.
        while ((strLine = br.readLine()) != null) {
            multimap.put(strLine, strLine);
        }
        // The number of values stored under a key is that word's count.
        for (String key : multimap.keySet()) {
            System.out.println(key + " was used " + multimap.get(key).size() + " times");
        }
    } catch (IOException e) {
        // FileNotFoundException is a subclass of IOException, so one catch covers both.
        e.printStackTrace();
    }