
HashMap iteration/Deletion

I'm trying to process a large amount of data and I'm a bit stuck on the best way to process the final calculation.

I have a HashMap whose values are Book objects. Each Book object has a data value called COUNT that holds how many times that book appears in my particular context. I want to iterate through the entire HashMap and record the ten most frequently appearing books in an array. At the same time, I also want to remove those ten books from the HashMap. What is the best way to do this?

I would copy the map into a SortedMap, such as TreeMap, using a comparator that compares the count.

The rest should be obvious.
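In case it is not obvious, here is a rough sketch of one way to read this suggestion. A TreeMap orders its entries by key, so in the sorted copy the Book becomes the key and the original map key becomes the value. The Book class, its title and count fields, the String key type, and the removeTop helper are all assumptions made for illustration; the comparator also breaks ties on title (assumed unique), because a TreeMap treats keys that compare as equal as the same key.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TopTenWithTreeMap {
    // Hypothetical Book type: the field names are assumptions, not from the question.
    static class Book {
        final String title;
        final int count;
        Book(String title, int count) { this.title = title; this.count = count; }
    }

    // Removes the k books with the highest count from 'books' and returns them, highest first.
    static List<Book> removeTop(Map<String, Book> books, int k) {
        // Highest count first; ties broken by title (assumed unique) so that
        // two books with the same count are not collapsed into one TreeMap key.
        Comparator<Book> byCountDesc = Comparator
                .comparingInt((Book b) -> b.count).reversed()
                .thenComparing(b -> b.title);

        // Sorted copy: Book -> the key it has in the original map.
        TreeMap<Book, String> sorted = new TreeMap<>(byCountDesc);
        for (Map.Entry<String, Book> e : books.entrySet()) {
            sorted.put(e.getValue(), e.getKey());
        }

        // Walk the sorted copy from the top and remove each hit from the original map.
        List<Book> top = new ArrayList<>();
        Iterator<Map.Entry<Book, String>> it = sorted.entrySet().iterator();
        while (top.size() < k && it.hasNext()) {
            Map.Entry<Book, String> e = it.next();
            top.add(e.getKey());
            books.remove(e.getValue());
        }
        return top;
    }

    public static void main(String[] args) {
        Map<String, Book> books = new HashMap<>();
        books.put("id-1", new Book("Dune", 7));
        books.put("id-2", new Book("Emma", 3));
        books.put("id-3", new Book("Ulysses", 9));
        for (Book b : removeTop(books, 2)) {
            System.out.println(b.title + " " + b.count);
        }
        System.out.println(books.size() + " book(s) left in the map");
    }
}
```

Because the removal happens on the original map while iterating over the sorted copy, no ConcurrentModificationException is involved here.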

There is a tournament algorithm that runs in O(n) time and can be useful for large data sets:

Optimal algorithm for returning top k values from an array of length N

If the data is not very large, I would recommend copying the Map's values into a List, sorting that List with Collections.sort, and taking a subList of the first ten elements.
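A minimal sketch of the sort-and-subList approach, assuming a hypothetical Book class with an int count field and a map keyed by String (neither is given in the question). The removeTop helper name is also made up for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TopTenBySorting {
    // Hypothetical Book type: the field names are assumptions, not from the question.
    static class Book {
        final String title;
        final int count;
        Book(String title, int count) { this.title = title; this.count = count; }
    }

    // Removes the k books with the highest count from 'books' and returns them, highest first.
    static List<Book> removeTop(Map<String, Book> books, int k) {
        // Copy the values so that sorting does not touch the map itself.
        List<Book> all = new ArrayList<>(books.values());
        // Sort by count, descending.
        Collections.sort(all, (a, b) -> Integer.compare(b.count, a.count));
        // subList is only a view of 'all', so copy it into its own list.
        List<Book> top = new ArrayList<>(all.subList(0, Math.min(k, all.size())));
        // The values() view writes through: removing from it removes the map entries.
        books.values().removeAll(top);
        return top;
    }

    public static void main(String[] args) {
        Map<String, Book> books = new HashMap<>();
        books.put("id-1", new Book("Dune", 7));
        books.put("id-2", new Book("Emma", 3));
        books.put("id-3", new Book("Ulysses", 9));
        for (Book b : removeTop(books, 2)) {
            System.out.println(b.title + " " + b.count);
        }
        System.out.println(books.size() + " book(s) left in the map");
    }
}
```

This sorts all n books, so it costs O(n log n), which is usually fine unless the map is very large.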

Another option is to keep them in a TreeMap and implement Comparable in your Book object, so that your Map is always sorted. Note that a TreeMap orders its entries by key, so this only works if the Book is the key. This is particularly useful if you are adding to your Map, as you don't want to re-sort every time you change an object.

Yes, you can't remove elements from inside a for-each loop like this:

for(Book curBook: yourMap.values())

If you do, you will get a ConcurrentModificationException. To remove elements while iterating, you have to use an iterator, for example:

HashMap<String, Book> yourMap; // HashMap needs a key and a value type; the String key is an assumption

Collection<Book> entries = yourMap.values();
Iterator<Book> iterator = entries.iterator();
while(iterator.hasNext()) {
    Book curBook = iterator.next();
    if (yourConditionToRemove) {
        iterator.remove();
    }
}
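On Java 8 or later, the explicit iterator loop can also be written with Collection.removeIf, which has the same effect on the map. The sketch below is only an illustration: the Book class, the String key type, the removeBelow helper, and the "count below a threshold" condition are assumptions standing in for yourConditionToRemove.

```java
import java.util.HashMap;
import java.util.Map;

class RemoveWhileIterating {
    // Hypothetical Book type: the field names are assumptions, not from the question.
    static class Book {
        final String title;
        final int count;
        Book(String title, int count) { this.title = title; this.count = count; }
    }

    // Removes every book whose count is below 'threshold' (a made-up example condition).
    static void removeBelow(Map<String, Book> books, int threshold) {
        // values() is a live view of the map, so removing from it removes the entries.
        books.values().removeIf(b -> b.count < threshold);
    }

    public static void main(String[] args) {
        Map<String, Book> books = new HashMap<>();
        books.put("id-1", new Book("Dune", 7));
        books.put("id-2", new Book("Emma", 3));
        removeBelow(books, 5);
        System.out.println(books.keySet());
    }
}
```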

If this is a frequent operation, consider using a TreeMap as suggested by Bohemian, or at least keep a separate Map with the most-read Books.

I am not that proficient at Java, but I can think of the following algorithm. It assumes that the HashMap stores books according to their unique identifier (i.e. it gives you no ordering hints about COUNT). You can:

  1. Define a sequence with capacity for ten books, in which they will be stored ordered by COUNT. For clarity, I will call this sequence O10S (ordered 10-element sequence).
  2. Traverse your HashMap. For each element e in the HashMap:
    • If O10S is not full yet, insert e into O10S.
    • Otherwise, if e has a COUNT higher than the element o in O10S with the minimum COUNT (which should be easily identifiable since O10S is ordered), remove o from O10S and insert e into O10S.
  3. For every o in O10S, remove o from the HashMap.

The algorithm is linear in the number of elements in the HashMap (you only need to traverse the HashMap once).
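The steps above could be sketched in Java roughly as follows. Instead of a hand-maintained ordered sequence, this sketch swaps in a java.util.PriorityQueue used as a min-heap for O10S, so the element with the minimum COUNT is always at the head. The Book class, its count field, the String key type, and the removeTop helper are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

class TopTenWithBoundedHeap {
    // Hypothetical Book type: the field names are assumptions, not from the question.
    static class Book {
        final String title;
        final int count;
        Book(String title, int count) { this.title = title; this.count = count; }
    }

    // Removes the k books with the highest count from 'books' and returns them, highest first.
    static List<Book> removeTop(Map<String, Book> books, int k) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        // Step 1: O10S as a min-heap on count; its head is the weakest current candidate.
        PriorityQueue<Book> o10s =
                new PriorityQueue<>(Comparator.comparingInt((Book b) -> b.count));

        // Step 2: a single pass over the map.
        for (Book e : books.values()) {
            if (o10s.size() < k) {
                o10s.add(e);                      // not full yet
            } else if (e.count > o10s.peek().count) {
                o10s.poll();                      // drop the current minimum
                o10s.add(e);
            }
        }

        // Step 3: remove the winners from the map via the live values() view.
        List<Book> top = new ArrayList<>(o10s);
        top.sort(Comparator.comparingInt((Book b) -> b.count).reversed());
        books.values().removeAll(top);
        return top;
    }

    public static void main(String[] args) {
        Map<String, Book> books = new HashMap<>();
        books.put("id-1", new Book("Dune", 7));
        books.put("id-2", new Book("Emma", 3));
        books.put("id-3", new Book("Ulysses", 9));
        for (Book b : removeTop(books, 2)) {
            System.out.println(b.title + " " + b.count);
        }
        System.out.println(books.size() + " book(s) left in the map");
    }
}
```

Each heap operation costs O(log k), so for a fixed k of ten the single pass stays linear in the size of the map.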
