
HashMap iteration/Deletion

I'm trying to process a large amount of data and I'm a bit stuck on the best way to handle the final calculation.

I have a HashMap. Each Book object has a data value called COUNT that holds how many times that book appears in my particular context. I want to iterate through the entire HashMap and record the ten most frequently appearing books in an array. At the same time, I also want to remove those top ten books from the HashMap. What is the best way to do this?

I would copy the map into a SortedMap, such as TreeMap, using a comparator that compares the count.

The rest should be obvious.
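For readers who want it spelled out, here is a minimal sketch of the sorted-copy idea. The Book class, its id field, and the Integer key type are assumptions made for illustration; the question only mentions COUNT. Note that a comparator on the count alone would treat two books with equal counts as the same TreeMap key, so this sketch breaks ties on the assumed id.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SortedCopyTopTen {
    // Hypothetical Book type: only COUNT comes from the question; id is assumed unique.
    static class Book {
        final int id;
        final int count;
        Book(int id, int count) { this.id = id; this.count = count; }
    }

    // Removes the k highest-count books from 'books' and returns them, highest first.
    static List<Book> removeTop(Map<Integer, Book> books, int k) {
        // Highest count first; ties are broken by id so that two books with the
        // same count do not collapse into a single TreeMap key.
        Comparator<Book> byCountDesc =
            Comparator.comparingInt((Book b) -> b.count).reversed()
                      .thenComparingInt(b -> b.id);

        // Sorted copy: each book maps back to its key in the original HashMap.
        TreeMap<Book, Integer> sorted = new TreeMap<>(byCountDesc);
        for (Map.Entry<Integer, Book> e : books.entrySet()) {
            sorted.put(e.getValue(), e.getKey());
        }

        List<Book> top = new ArrayList<>();
        for (Map.Entry<Book, Integer> e : sorted.entrySet()) {
            if (top.size() == k) break;
            top.add(e.getKey());
            books.remove(e.getValue()); // safe: we iterate the copy, not 'books'
        }
        return top;
    }

    public static void main(String[] args) {
        Map<Integer, Book> books = new HashMap<>();
        for (int id = 1; id <= 15; id++) {
            books.put(id, new Book(id, id * 3));
        }
        List<Book> top = removeTop(books, 10);
        System.out.println(top.size() + " removed, " + books.size() + " left");
    }
}
```

Because the removal loop walks the sorted copy rather than the HashMap itself, calling books.remove inside it does not trigger a ConcurrentModificationException.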

There is a tournament algorithm that runs in O(n) time and can be useful for large data:

Optimal algorithm for returning top k values from an array of length N

If the data is not very large, I would recommend using Collections.sort on a list of your Map's values and taking a subList of the result.
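A rough sketch of the sort-and-subList route follows. The Book class with title and count fields is hypothetical, and the method is generic in the key type because the question does not say what the map is keyed by.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SortAndSubList {
    // Hypothetical Book type; only COUNT comes from the question.
    static class Book {
        final String title;
        final int count;
        Book(String title, int count) { this.title = title; this.count = count; }
    }

    // Removes the k highest-count books from 'books' and returns them, highest first.
    static <K> List<Book> removeTop(Map<K, Book> books, int k) {
        // Copy the values so that sorting does not touch the map.
        List<Book> all = new ArrayList<>(books.values());
        Collections.sort(all, Comparator.comparingInt((Book b) -> b.count).reversed());

        // subList is only a view of 'all', so copy it into an independent list.
        List<Book> top = new ArrayList<>(all.subList(0, Math.min(k, all.size())));

        // values() is backed by the map, so removing from it removes the mappings.
        books.values().removeAll(top);
        return top;
    }

    public static void main(String[] args) {
        Map<String, Book> books = new HashMap<>();
        for (int i = 1; i <= 12; i++) {
            books.put("b" + i, new Book("b" + i, i * i));
        }
        List<Book> top = removeTop(books, 10);
        System.out.println(top.size() + " removed, " + books.size() + " left");
    }
}
```

Sorting costs O(n log n), which is why this fits the "not very large" case; the removal goes through the values() view instead of a for-each loop over the map.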

Another option is to keep them in a TreeMap and implement Comparable in your Book object; that way your Map is always sorted. This is particularly useful if you are adding to your Map, as you don't want to sort it again every time you change an object.

Yes, you can't remove elements from inside a for-each loop like this:

for(Book curBook: yourMap.values())

You will get a ConcurrentModificationException. To remove elements while iterating, you have to use an iterator, for example:

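import java.util.Collection;
import java.util.HashMap;
import java.util.Iterator;

// HashMap takes two type parameters; the Integer key type here is an assumption.
HashMap<Integer, Book> yourMap;

Collection<Book> entries = yourMap.values();
Iterator<Book> iterator = entries.iterator();
while (iterator.hasNext()) {
    Book curBook = iterator.next();
    if (yourConditionToRemove) {
        // Removes the current mapping from yourMap through the iterator,
        // which is the only safe way to remove during iteration.
        iterator.remove();
    }
}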

If this is a frequent operation, consider using a TreeMap as suggested by Bohemian, or at least keep a separate Map with the most-read Books.

I am not that proficient at Java, but I can think of the following algorithm, assuming that the HashMap stores books according to their unique identifier (i.e., it gives you no ordering hints about COUNT). You can:

  1. Define a sequence with capacity for ten books in which they will be stored ordered by COUNT. For clarity, I will call this sequence O10S (Ordered 10-element Sequence).
  2. Traverse your HashMap. For each element e in the HashMap:
    • If O10S is not full yet, insert e in O10S.
    • Otherwise, if e has a COUNT higher than the element o in O10S with the minimum COUNT (which should be easily identifiable since O10S is ordered): remove o from O10S and insert e in O10S.
  3. For every o in O10S, remove o from the HashMap.

The algorithm is linear with respect to the number of elements in the HashMap (you only need to traverse the HashMap once).
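The steps above can be sketched in Java roughly as follows. This sketch swaps the ordered sequence O10S for a java.util.PriorityQueue used as a min-heap, so the lowest COUNT is always at the head. The Book class and the removeTop helper are hypothetical names introduced for illustration. Each heap operation costs O(log k), so the pass is O(n log k), which is linear in n for a fixed k of ten.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

class BoundedTopK {
    // Hypothetical Book type; only COUNT comes from the question.
    static class Book {
        final String title;
        final int count;
        Book(String title, int count) { this.title = title; this.count = count; }
    }

    // Removes the k highest-count books from 'books' and returns them, highest first.
    static <K> List<Book> removeTop(Map<K, Book> books, int k) {
        List<Book> top = new ArrayList<>();
        if (k <= 0) return top;

        // Step 1: min-heap on count, playing the role of O10S. Its head is
        // the current candidate with the lowest COUNT.
        PriorityQueue<Map.Entry<K, Book>> best = new PriorityQueue<>(
            Comparator.comparingInt((Map.Entry<K, Book> e) -> e.getValue().count));

        // Step 2: a single pass over the map. Entries are copied so the heap
        // does not hold on to the map's own entry objects.
        for (Map.Entry<K, Book> e : books.entrySet()) {
            if (best.size() < k) {
                best.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
            } else if (e.getValue().count > best.peek().getValue().count) {
                best.poll(); // drop the weakest candidate
                best.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
            }
        }

        // Step 3: remove the winners only after the traversal has finished.
        while (!best.isEmpty()) {
            Map.Entry<K, Book> e = best.poll();
            books.remove(e.getKey());
            top.add(e.getValue());
        }
        Collections.reverse(top); // the heap yields ascending counts; report highest first
        return top;
    }

    public static void main(String[] args) {
        Map<Integer, Book> books = new HashMap<>();
        for (int i = 1; i <= 16; i++) {
            books.put(i, new Book("b" + i, (i * 7) % 17));
        }
        List<Book> top = removeTop(books, 10);
        System.out.println(top.size() + " removed, " + books.size() + " left");
    }
}
```

Deferring the removals to step 3 keeps the HashMap unmodified while it is being traversed, so no ConcurrentModificationException can occur.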


 