Sorting words in text-file by frequency

I've created a program that counts the distinct words in a text document and then prints them to the console. What I want to add is sorting by frequency, with the most frequent words first. I'm sure this is nothing major to add, but I'm clueless about how to do it.

This is the code:

public static void main(String[] args) throws FileNotFoundException, IOException {

    Map<String, Integer> fileReaderMap = new HashMap<>();

    try (BufferedReader br = new BufferedReader(new FileReader("file.txt"))) {
        StringBuilder sb = new StringBuilder();
        String line = br.readLine();

        while (line != null) {
            String[] words = line.split(" ");
            for (int i = 0; i < words.length; i++) {
                if (!fileReaderMap.containsKey(words[i])) {
                    fileReaderMap.put(words[i], 1);
                } else {
                    int newValue = fileReaderMap.get(words[i]) + 1;
                    fileReaderMap.put(words[i], newValue);
                }
            }
            sb.append(System.lineSeparator());
            line = br.readLine();
        }
    }
}

First use a Map to count the occurrences of each word. Then put the entries of your Map in a List and sort it using Collections.sort() and a comparator. Then you can just print the sorted List:

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

public class Snippet {

    public static void main(String[] args) throws FileNotFoundException, IOException {

        // Maps each word to the number of times it has been seen so far.
        Map<String, Integer> fileReaderMap = new HashMap<>();

        try (BufferedReader br = new BufferedReader(new FileReader("file.txt"))) {
            StringBuilder sb = new StringBuilder();
            String line = br.readLine();

            while (line != null) {
                String[] words = line.split(" ");
                for (int i = 0; i < words.length; i++) {
                    if (!fileReaderMap.containsKey(words[i])) {
                        fileReaderMap.put(words[i], 1);
                    } else {
                        int newValue = fileReaderMap.get(words[i]) + 1;
                        fileReaderMap.put(words[i], newValue);
                    }
                }
                sb.append(System.lineSeparator());
                line = br.readLine();
            }
        }

        // Copy the entries into a List so they can be sorted by value.
        List<Entry<String, Integer>> sorted = new ArrayList<>(fileReaderMap.entrySet());
        Collections.sort(sorted, new Comparator<Entry<String, Integer>>() {
            @Override
            public int compare(Entry<String, Integer> o1, Entry<String, Integer> o2) {
                // Arguments are swapped (o2 before o1) so the highest count comes first.
                int comp = Integer.compare(o2.getValue(), o1.getValue());
                if (comp != 0) {
                    return comp;
                }
                // Equal counts: fall back to alphabetical order of the words.
                return o1.getKey().compareTo(o2.getKey());
            }
        });

        // The labels are Swedish: "Ord" = word, "Antal Gånger" = number of times.
        for (Entry<String, Integer> entry : sorted) {
            System.out.println("Ord: " + entry.getKey() + "\t Antal Gånger: " + entry.getValue());
        }
    }
}

TreeMap sorts on the basis of key. What you want to do is sort on the basis of value. You can create a class, say Frequency, with the fields String word and int freq. Then you can iterate over the HashMap and store each key and value in the respective fields of this class. While traversing, you can build an ArrayList of Frequency objects, and using Comparable/Comparator you can sort it on the basis of freq.
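
A minimal sketch of this approach might look like the following. The names FrequencySort and sortByFrequency, the alphabetical tie-break, and the sample counts are assumptions made for illustration; only the Frequency class with its word and freq fields comes from the description above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FrequencySort {

    // Value class pairing a word with its count, as described above.
    static class Frequency implements Comparable<Frequency> {
        final String word;
        final int freq;

        Frequency(String word, int freq) {
            this.word = word;
            this.freq = freq;
        }

        @Override
        public int compareTo(Frequency other) {
            // Higher freq sorts first; equal counts fall back to alphabetical order (an assumption).
            int byFreq = Integer.compare(other.freq, this.freq);
            return byFreq != 0 ? byFreq : this.word.compareTo(other.word);
        }

        @Override
        public String toString() {
            return word + "=" + freq;
        }
    }

    // Hypothetical helper: copies the map entries into Frequency objects and sorts them.
    static List<Frequency> sortByFrequency(Map<String, Integer> counts) {
        List<Frequency> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            result.add(new Frequency(e.getKey(), e.getValue()));
        }
        Collections.sort(result); // uses Frequency.compareTo
        return result;
    }

    public static void main(String[] args) {
        // Sample counts standing in for the map built from the file.
        Map<String, Integer> counts = new HashMap<>();
        counts.put("apple", 2);
        counts.put("pear", 5);
        counts.put("fig", 2);

        for (Frequency f : sortByFrequency(counts)) {
            System.out.println(f);
        }
        // prints pear=5, then apple=2, then fig=2
    }
}
```

In the original program, the map passed to sortByFrequency would be fileReaderMap after the file has been read. Implementing Comparable keeps the ordering with the class; a separate Comparator would work just as well if several orderings are needed.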
