
Efficient way to get the most used keys in a HashMap - Java

I have a HashMap where the key is a word and the value is the number of occurrences of that word in a text. Now I'd like to reduce this HashMap to only the 15 most used words (those with the greatest number of occurrences). Do you have any idea how to do this efficiently?

Using an array instead of an ArrayList, as suggested by Pindatjuh, could be better; this version uses an ArrayList:

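import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class HashTest {
    public static void main(String[] args) {
        // Orders entries by descending value (most frequent first)
        class HmComp implements Comparator<Map.Entry<String, Integer>> {
            public int compare(Map.Entry<String, Integer> o1,
                    Map.Entry<String, Integer> o2) {
                // Integer.compare avoids the overflow risk of subtracting the values
                return Integer.compare(o2.getValue(), o1.getValue());
            }
        }
        Map<String, Integer> hm = new HashMap<String, Integer>();
        Random rand = new Random();
        // Fill the map with 26 sample words and random counts
        for (int i = 0; i < 26; i++) {
            hm.put("Word" + i, rand.nextInt(100));
        }
        List<Map.Entry<String, Integer>> list =
                new ArrayList<Map.Entry<String, Integer>>(hm.entrySet());
        Collections.sort(list, new HmComp());
        // Print the 15 most frequent entries
        for (int i = 0; i < 15 && i < list.size(); i++) {
            System.out.println(list.get(i));
        }
    }
}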

EDIT: reversed the sorting order.

One way I can think of to tackle this, though it's probably not the most efficient, is:

  • Create an array with hashMap.entrySet().toArray(new Entry[]{}).
  • Sort it using Arrays.sort, supplying your own Comparator that compares only on Entry.getValue() (cast to an Integer). Make the order descending, i.e. highest first, lowest last.
  • Iterate over the sorted array and break once you've reached the 15th value.
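
The three steps above can be sketched roughly as follows. This is a minimal illustration rather than the answerer's own code: the class name TopWordsByArraySort, the helper topWords, and the sample words are all hypothetical, and a lambda stands in for the hand-written Comparator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TopWordsByArraySort {

    // Hypothetical helper: returns up to `limit` words, most frequent first.
    @SuppressWarnings("unchecked")
    static List<String> topWords(Map<String, Integer> counts, int limit) {
        // Step 1: copy the entries into an array. A raw Map.Entry[] is used
        // because Java cannot create generic arrays directly.
        Map.Entry<String, Integer>[] entries =
                counts.entrySet().toArray(new Map.Entry[0]);

        // Step 2: sort descending, comparing on the value only.
        Arrays.sort(entries, (a, b) -> Integer.compare(b.getValue(), a.getValue()));

        // Step 3: walk the sorted array and stop once the limit is reached.
        List<String> top = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries) {
            if (top.size() == limit) {
                break;
            }
            top.add(entry.getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("the", 12);
        counts.put("map", 7);
        counts.put("of", 9);
        counts.put("java", 3);
        System.out.println(topWords(counts, 2)); // prints [the, of]
    }
}
```

For the question's case you would call topWords(counts, 15). Words with equal counts come out in whatever order the HashMap happened to iterate them.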
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SortMapByValue {
    public static void main(String[] args) {
        Map<String, Integer> map = new LinkedHashMap<String, Integer>();

        // --- Put entries into map here ---

        // Get a list of the entries in the map
        List<Map.Entry<String, Integer>> list =
                new ArrayList<Map.Entry<String, Integer>>(map.entrySet());

        // Sort the list, highest count first, using an anonymous inner class implementing Comparator
        Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {
            public int compare(Map.Entry<String, Integer> entry, Map.Entry<String, Integer> entry1) {
                // Negative when entry has the larger value, so larger counts sort first
                return entry1.getValue().compareTo(entry.getValue());
            }
        });

        // Rebuild the map in sorted order. A LinkedHashMap keeps insertion order,
        // which a plain HashMap would not.
        Map<String, Integer> sorted = new LinkedHashMap<String, Integer>();
        for (Map.Entry<String, Integer> entry : list) {
            sorted.put(entry.getKey(), entry.getValue());
        }
        map = sorted;
    }
}

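Use the first 15 entries of the map. This only works if the map preserves insertion order (a LinkedHashMap does; a plain HashMap does not). Alternatively, modify the final loop so that it puts only 15 entries into the map.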
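
The "only 15 entries" variant might look something like the sketch below. The class name KeepTopEntries and the helper keepTop are made up for illustration, and the sketch assumes a non-negative limit and Java 8 or later (for List.sort and lambdas).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class KeepTopEntries {

    // Hypothetical helper: returns a new map holding at most `limit` entries,
    // highest count first. Assumes limit >= 0.
    static Map<String, Integer> keepTop(Map<String, Integer> map, int limit) {
        List<Map.Entry<String, Integer>> list = new ArrayList<>(map.entrySet());
        list.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));

        // LinkedHashMap remembers insertion order, so the sorted order survives.
        Map<String, Integer> top = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : list.subList(0, Math.min(limit, list.size()))) {
            top.put(entry.getKey(), entry.getValue());
        }
        return top;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("the", 12);
        counts.put("map", 7);
        counts.put("of", 9);
        System.out.println(keepTop(counts, 2)); // prints {the=12, of=9}
    }
}
```

Returning a fresh map instead of clearing the original avoids reading entries after their backing map has been modified.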

You can use a LinkedHashMap and remove the least recently used items.
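
One possible reading of this suggestion is an access-ordered LinkedHashMap that evicts its eldest entry once it grows past a fixed size. The sketch below assumes that reading; the names LruWordCounts, newLruMap, and count are hypothetical. Note that this tracks recency, not frequency: a word that was frequent early on but has not been seen recently can be evicted, so the result only approximates the 15 most used words.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruWordCounts {

    // Hypothetical factory: a map that holds at most `capacity` words and
    // drops the least recently accessed one when a new word would exceed that.
    static Map<String, Integer> newLruMap(final int capacity) {
        // accessOrder = true makes iteration run from least to most recently accessed.
        return new LinkedHashMap<String, Integer>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Integer> eldest) {
                return size() > capacity;
            }
        };
    }

    // Increments the count for a word; get and put both count as an access.
    static void count(Map<String, Integer> map, String word) {
        Integer old = map.get(word);
        map.put(word, old == null ? 1 : old + 1);
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = newLruMap(2);
        for (String word : new String[] {"a", "b", "a", "c"}) {
            count(counts, word);
        }
        // "b" was the least recently used word when "c" arrived, so it was evicted.
        System.out.println(counts); // prints {a=2, c=1}
    }
}
```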


 