Taking 10 Strings with highest values from hashMap

I want to save all the words from the titles on a site to a file. Then I want to take the 10 most frequent words and save them to another file. Saving to the file already works, but I'm stuck on finding those 10 words. My code only finds the single most frequent word and that's it. There are surely better ways to do this than the one I've used, and I'd be really grateful for some tips. I've gone through the most popular topics here, but all of them are about finding the one most frequent word.

List<String> mostRepeatedWords = new ArrayList<>();
Set<Map.Entry<String, Integer>> entrySet = wordsMap.entrySet();
int max = 0;
for (int i = 0; i < entrySet.size(); i++) {
    for (Map.Entry<String, Integer> entry : entrySet) {   // here I'm looking for the word with the highest value in the map
        if (entry.getValue() > max) {
            max = entry.getValue();
        }
    }
    for (Object o : wordsMap.keySet()) {     // here I write this word to a list
        if (wordsMap.get(o).equals(max)) {
            mostRepeatedWords.add(o.toString());
        }
    }
}

Edit: Here's how I counted the words:

while (currentLine != null) {
    String[] words = currentLine.toLowerCase().split(" ");

    for (String word : words) {
        if (!wordsMap.containsKey(word) && word.length() > 3) {
            wordsMap.put(word, 1);
        } else if (word.length() > 3) {
            int value = wordsMap.get(word);
            value++;
            wordsMap.replace(word, value);
        }
    }
    currentLine = reader.readLine();
}

Does this do it for you?

First, sort the words (i.e. the keys) of the map by frequency of occurrence, in descending order.

List<String> words = mapOfWords.entrySet().stream()
        .sorted(Entry.comparingByValue(Comparator.reverseOrder()))
        .limit(10)
        .map(Entry::getKey)
        .collect(Collectors.toList());

Then use those keys to print the first 10 words in order of decreasing frequency.

for (String word : words) {
    System.out.println(word + " " + mapOfWords.get(word));
}
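
For reference, the two snippets above can be combined into one self-contained sketch. The class name TopWordsDemo, the method topWords, and the sample data are assumptions made for illustration. Ties are broken alphabetically here, which the original snippet does not specify.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.stream.Collectors;

public class TopWordsDemo {

    // Returns up to n keys, highest count first.
    // Assumption: words with equal counts are ordered alphabetically.
    public static List<String> topWords(Map<String, Integer> mapOfWords, int n) {
        Comparator<Entry<String, Integer>> byCountDesc =
                Entry.comparingByValue(Comparator.reverseOrder());
        Comparator<Entry<String, Integer>> order =
                byCountDesc.thenComparing(Entry.comparingByKey());

        return mapOfWords.entrySet().stream()
                .sorted(order)
                .limit(n)
                .map(Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the real frequency map.
        Map<String, Integer> mapOfWords =
                Map.of("alpha", 4, "beta", 9, "gamma", 4, "delta", 1);

        for (String word : topWords(mapOfWords, 3)) {
            System.out.println(word + " " + mapOfWords.get(word));
        }
    }
}
```

Because limit(n) simply stops early, this also works when the map holds fewer than n words.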

Another, more traditional approach that does not use streams is the following:

Test data

Map<String, Integer> mapOfWords =
        Map.of("A", 10, "B", 3, "C", 8, "D", 9);

Create a list of map entries

List<Entry<String, Integer>> mapEntries =
        new ArrayList<>(mapOfWords.entrySet());

Define a Comparator to sort the entries based on the frequency

Comparator<Entry<String, Integer>> comp = new Comparator<>() {
    @Override
    public int compare(Entry<String, Integer> e1,
            Entry<String, Integer> e2) {
        Objects.requireNonNull(e1);
        Objects.requireNonNull(e2);
        // notice e2 and e1 order is reversed to sort in descending order.
        return Integer.compare(e2.getValue(), e1.getValue());
    }
};

The above is equivalent to the following, which is defined in the Map.Entry interface

Comparator<Entry<String,Integer>> comp =
   Entry.comparingByValue(Comparator.reverseOrder());

Now sort the list with either comparator.

mapEntries.sort(comp);

Now just print the list of entries. If there are more than 10, you will need to add a limiting counter or use mapEntries.subList(0, 10) as the target of the for loop.

for (Entry<?,?> e : mapEntries) {
     System.out.println(e);
}
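
As a hedged sketch of the limiting step, the sublist bound can be clamped with Math.min so that it does not throw when the map holds fewer than 10 entries (a plain subList(0, 10) would fail on the four-entry test data). The class name TopEntriesDemo and the method topEntries are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

public class TopEntriesDemo {

    // Sorts a copy of the entries by descending value and returns at most 'limit' of them.
    public static List<Entry<String, Integer>> topEntries(Map<String, Integer> mapOfWords, int limit) {
        List<Entry<String, Integer>> mapEntries = new ArrayList<>(mapOfWords.entrySet());

        Comparator<Entry<String, Integer>> comp =
                Entry.comparingByValue(Comparator.reverseOrder());
        mapEntries.sort(comp);

        // Clamp the upper bound so short lists do not cause an IndexOutOfBoundsException.
        return mapEntries.subList(0, Math.min(limit, mapEntries.size()));
    }

    public static void main(String[] args) {
        Map<String, Integer> mapOfWords =
                Map.of("A", 10, "B", 3, "C", 8, "D", 9);

        for (Entry<String, Integer> e : topEntries(mapOfWords, 10)) {
            System.out.println(e);
        }
    }
}
```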

You could save the most frequent word to an array and check whether the next word you find already exists in that array. Then you search for the next most frequent word that does not exist in that array.
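
A minimal sketch of that idea, assuming a List is acceptable in place of the array and using hypothetical names (RepeatedMaxDemo, topWords): each pass scans the whole map for the highest-count word that has not been picked yet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class RepeatedMaxDemo {

    // Repeatedly picks the most frequent word not already chosen, up to n words.
    // Words with equal counts come out in the map's iteration order.
    public static List<String> topWords(Map<String, Integer> wordsMap, int n) {
        List<String> chosen = new ArrayList<>();

        while (chosen.size() < n && chosen.size() < wordsMap.size()) {
            String best = null;
            int bestCount = Integer.MIN_VALUE;

            for (Map.Entry<String, Integer> entry : wordsMap.entrySet()) {
                boolean alreadyChosen = chosen.contains(entry.getKey());
                if (!alreadyChosen && (best == null || entry.getValue() > bestCount)) {
                    best = entry.getKey();
                    bestCount = entry.getValue();
                }
            }
            chosen.add(best);
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Hypothetical sample data.
        Map<String, Integer> wordsMap =
                Map.of("foo", 2, "bar", 7, "baz", 5, "doo", 9, "gee", 12);

        System.out.println(topWords(wordsMap, 3));
    }
}
```

This rescans the map once per selected word, so it does more work than sorting once, but it stays close to the loop structure in the question.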

Assuming you already have your frequency map, which might look something like:

Map<String,Integer> wordsMap = Map.of( "foo", 2,
                                       "bar", 7,
                                       "baz", 5,
                                       "doo", 9,
                                       "tot", 2,
                                       "gee", 12);

You could create another map, i.e. a top-ten map (top three in my demo below), by sorting your map by value in reverse order and limiting it to the first ten entries:

Map<String,Integer> topThree = wordsMap.entrySet()
                                       .stream()
                                       .sorted(Collections.reverseOrder(Map.Entry.comparingByValue()))
                                       .limit(3)
                                       .collect(Collectors.toMap(
                                          Map.Entry::getKey, Map.Entry::getValue, (e1, e2) -> e2,LinkedHashMap::new));

System.out.println(topThree);

//{gee=12, doo=9, bar=7}
