Sorting words in text-file by frequency
I've created a program which counts the occurrences of the distinct words in a text document, and then prints them out to the console. What I want to add to the program is that the words should be sorted by frequency, highest first. I'm sure this is nothing major to add, but I'm clueless on what to do.
This is the code:
public static void main(String[] args) throws FileNotFoundException, IOException {
Map<String, Integer> fileReaderMap = new HashMap<>();
try (BufferedReader br = new BufferedReader(new FileReader("file.txt"))) {
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while (line != null) {
String[] words = line.split(" ");
for (int i = 0; i < words.length; i++) {
if (!fileReaderMap.containsKey(words[i])) {
fileReaderMap.put(words[i], 1);
} else {
int newValue = fileReaderMap.get(words[i]) + 1;
fileReaderMap.put(words[i], newValue);
}
}
sb.append(System.lineSeparator());
line = br.readLine();
}
}
First use a Map to count the appearances of the words. Then put the entries of your Map in a List and sort it using Collections.sort() and a comparator. Then you can just print the sorted List:
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
public class Snippet {
public static void main(String[] args) throws FileNotFoundException, IOException {
Map<String, Integer> fileReaderMap = new HashMap<>();
try (BufferedReader br = new BufferedReader(new FileReader("file.txt"))) {
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while (line != null) {
String[] words = line.split(" ");
for (int i = 0; i < words.length; i++) {
if (!fileReaderMap.containsKey(words[i])) {
fileReaderMap.put(words[i], 1);
} else {
int newValue = fileReaderMap.get(words[i]) + 1;
fileReaderMap.put(words[i], newValue);
}
}
sb.append(System.lineSeparator());
line = br.readLine();
}
}
List<Entry<String, Integer>> sorted = new ArrayList<>(fileReaderMap.entrySet());
Collections.sort(sorted, new Comparator<Entry<String, Integer>>() {
@Override
public int compare(Entry<String, Integer> o1, Entry<String, Integer> o2) {
// Compare o2 against o1 so that the highest frequency comes first.
int comp = Integer.compare(o2.getValue(), o1.getValue());
if (comp != 0) {
return comp;
}
return o1.getKey().compareTo(o2.getKey());
}
});
for (Entry<String, Integer> entry : sorted) {
System.out.println("Word: " + entry.getKey() + "\t Count: " + entry.getValue());
}
}
}
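For comparison, here is a more compact sketch of the same count-then-sort idea. It is not part of the original answer: the class name WordFrequencySketch and the methods countWords and sortByFrequency are made up for illustration, it reads from an in-memory list instead of file.txt, and it assumes Java 8 or later for Map.merge and List.sort. It splits on any run of whitespace, skips empty tokens, and orders the entries highest count first with ties broken alphabetically.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; the names are illustrative, not from the original answer.
class WordFrequencySketch {

    // Count every whitespace-separated token in the given lines.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // a blank line yields a single empty token
                }
                // Insert 1 for a new word, otherwise add 1 to the stored count.
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Return the entries ordered by count (highest first), ties broken alphabetically.
    static List<Map.Entry<String, Integer>> sortByFrequency(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(counts.entrySet());
        sorted.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return sorted;
    }

    public static void main(String[] args) {
        // Sample input chosen for the example; replace with lines read from a file.
        List<String> lines = Arrays.asList("the cat and the dog", "the dog barks");
        for (Map.Entry<String, Integer> entry : sortByFrequency(countWords(lines))) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```

Keeping the counting and the sorting in separate methods makes each step easy to check on its own; the lines could just as well come from a BufferedReader as in the code above.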
TreeMap sorts on the basis of key. What you want to do is sort on the basis of value. You can create a class, say Frequency, having the fields String word and int freq. Then you can iterate over the HashMap and store its keys and values in the respective fields of this class. By traversing the map you can build an ArrayList of Frequency and, using Comparable/Comparator, sort it on the basis of freq.
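A minimal sketch of that Frequency approach might look like the following. Only the class name and the fields word and freq come from the answer; the fromMap helper, the alphabetical tie-break, and the sample counts are assumptions added for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Value holder pairing a word with its count, as described in the answer.
class Frequency implements Comparable<Frequency> {
    final String word;
    final int freq;

    Frequency(String word, int freq) {
        this.word = word;
        this.freq = freq;
    }

    // Higher freq sorts first; ties fall back to alphabetical order (an assumption).
    @Override
    public int compareTo(Frequency other) {
        int byFreq = Integer.compare(other.freq, this.freq);
        return byFreq != 0 ? byFreq : this.word.compareTo(other.word);
    }

    // Hypothetical helper: copy the map's entries into a sorted list of Frequency.
    static List<Frequency> fromMap(Map<String, Integer> counts) {
        List<Frequency> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            result.add(new Frequency(entry.getKey(), entry.getValue()));
        }
        Collections.sort(result); // uses compareTo defined above
        return result;
    }

    public static void main(String[] args) {
        // Sample counts made up for the example.
        Map<String, Integer> counts = new HashMap<>();
        counts.put("apple", 2);
        counts.put("pear", 5);
        counts.put("fig", 2);
        for (Frequency f : fromMap(counts)) {
            System.out.println(f.word + "\t" + f.freq);
        }
    }
}
```

Implementing Comparable fixes one ordering for the class; passing a Comparator to Collections.sort instead would let the caller choose ascending or descending order at the call site.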