HashMap iteration/Deletion

Question

I'm trying to process a large amount of data and I'm a bit stuck on the best way to process the final calculation.

I have a HashMap. Each Book object has a data value called COUNT that holds how many times that book appears in my particular context. I want to iterate through the entire HashMap and record the ten most frequently appearing books in an array. At the same time, I also want to remove those ten books from the HashMap. What is the best way to do this?

Answer 1

I would copy the map into a SortedMap, such as TreeMap, using a comparator that compares the count.

The rest should be obvious.
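For readers who want the rest spelled out, here is a minimal sketch of that idea. It makes several assumptions that are not in the question: the map is keyed by a String id, Book exposes a title and a count field, and titles are unique (they are used as a tie-breaker so that two books with the same count do not collapse into one TreeMap entry). The class and method names are hypothetical. Note that a TreeMap orders its keys, so the sketch uses the Book as the key and keeps the original id as the value.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TreeMapTopTen {
    // Hypothetical stand-in for the asker's Book class.
    static class Book {
        final String title;
        final int count;
        Book(String title, int count) { this.title = title; this.count = count; }
    }

    // Returns up to k books with the highest count and removes them from 'books'.
    static List<Book> removeTop(Map<String, Book> books, int k) {
        // Highest count first; title breaks ties (assumes titles are unique).
        Comparator<Book> byCountDesc = Comparator.comparingInt((Book b) -> b.count)
                .reversed()
                .thenComparing(b -> b.title);

        // Copy into a TreeMap keyed by Book, remembering each book's original key.
        TreeMap<Book, String> sorted = new TreeMap<>(byCountDesc);
        for (Map.Entry<String, Book> e : books.entrySet()) {
            sorted.put(e.getValue(), e.getKey());
        }

        // Walk the first k entries and remove them from the original map.
        List<Book> top = new ArrayList<>();
        for (Map.Entry<Book, String> e : sorted.entrySet()) {
            if (top.size() == k) {
                break;
            }
            top.add(e.getKey());
            books.remove(e.getValue());
        }
        return top;
    }
}
```

Calling TreeMapTopTen.removeTop(yourMap, 10) would return the ten books in descending order of count. Copying and sorting costs O(n log n), which is usually acceptable unless the map is very large.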

Answer 2

There is a tournament algorithm that runs in O(n) time and can be useful for large data sets; see:

Optimal algorithm for returning top k values from an array of length N

If the data set is not very large, I would recommend copying the map's values into a List, sorting it with Collections.sort, and taking a subList of the first ten elements.
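A minimal sketch of the sort-and-subList approach follows. The String key type, the Book fields, and the class and method names are assumptions made for illustration, not part of the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class SortTopTen {
    // Hypothetical stand-in for the asker's Book class.
    static class Book {
        final String title;
        final int count;
        Book(String title, int count) { this.title = title; this.count = count; }
    }

    // Returns up to k books with the highest count and removes them from 'books'.
    static List<Book> removeTop(Map<String, Book> books, int k) {
        // Copy the values so that sorting does not touch the map itself.
        List<Book> all = new ArrayList<>(books.values());

        // Sort in descending order of count.
        Collections.sort(all, (a, b) -> Integer.compare(b.count, a.count));

        // subList is only a view of 'all', so copy it into its own list.
        int limit = Math.max(0, Math.min(k, all.size()));
        List<Book> top = new ArrayList<>(all.subList(0, limit));

        // values() is a live view of the map: removing from it removes the entries.
        books.values().removeAll(top);
        return top;
    }
}
```

Because values() is backed by the map, removeAll deletes the matching entries without an explicit loop, so no ConcurrentModificationException can occur here.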

Another option is to keep them in a TreeMap and implement Comparable in your Book class; that way your map is always sorted (a TreeMap orders its keys, so Book would have to be the key). This is particularly useful if you keep adding to the map, because you don't have to re-sort every time it changes.

Answer 3

You can't remove entries from the map while iterating over it with a for-each loop like this:

for(Book curBook: yourMap.values())

If you do, you will get a ConcurrentModificationException. To remove elements while iterating, you have to use an iterator, for example:

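import java.util.HashMap;
import java.util.Iterator;

// The String key type is an assumption; HashMap needs both a key and a value type.
HashMap<String, Book> yourMap = new HashMap<>();

// values() is a live view of the map, so removing through its iterator
// also removes the corresponding entry from yourMap.
Iterator<Book> iterator = yourMap.values().iterator();
while (iterator.hasNext()) {
    Book curBook = iterator.next();
    if (yourConditionToRemove) { // placeholder: replace with your own test on curBook
        iterator.remove();
    }
}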

If this is a frequent operation, consider using a TreeMap as suggested by Bohemian, or at least keep a separate Map with the most-read books.

Answer 4

I am not that proficient at Java, but I can think of the following algorithm. It assumes that the HashMap stores books by their unique identifier (i.e. it gives you no ordering hints about COUNT). You can:

Define a sequence with capacity for ten books, in which they will be stored ordered by COUNT. For clarity, I will call this sequence O10S (ordered 10-element sequence).
Traverse your HashMap. For each element e in the HashMap:
- If O10S is not full yet, insert e in O10S.
- Otherwise, if e has a COUNT higher than the element o in O10S with the minimum COUNT (which should be easily identifiable since O10S is ordered): remove o from O10S and insert e in O10S.
For every o in O10S, remove o from the HashMap.

The algorithm is linear in the number of elements in the HashMap: you only need to traverse it once, and each step touches at most ten elements of O10S.
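The steps above could be sketched in Java roughly as follows. One substitution to be aware of: instead of a fully ordered sequence, this sketch uses a PriorityQueue as a min-heap, which still gives quick access to the element with the minimum COUNT. The String key type, the Book fields, and the class and method names are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

class BoundedTopTen {
    // Hypothetical stand-in for the asker's Book class.
    static class Book {
        final String title;
        final int count;
        Book(String title, int count) { this.title = title; this.count = count; }
    }

    // Returns up to k books with the highest count and removes them from 'books'.
    static List<Book> removeTop(Map<String, Book> books, int k) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        Comparator<Book> byCount = Comparator.comparingInt((Book b) -> b.count);

        // Min-heap playing the role of O10S: the smallest count sits at the head.
        PriorityQueue<Book> heap = new PriorityQueue<>(byCount);
        for (Book b : books.values()) {
            if (heap.size() < k) {
                heap.add(b);                       // O10S is not full yet
            } else if (b.count > heap.peek().count) {
                heap.poll();                       // drop the current minimum
                heap.add(b);
            }
        }

        // The heap's iteration order is unspecified, so sort the result explicitly.
        List<Book> top = new ArrayList<>(heap);
        top.sort(byCount.reversed());

        // Remove the selected books only after the traversal has finished.
        books.values().removeAll(top);
        return top;
    }
}
```

Because the removal happens after the loop over books.values() has ended, the traversal itself never modifies the map. With k fixed at ten, the heap operations cost a constant amount per element, so the whole pass stays linear in the size of the map.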

4 answers

solution1
0 2013-03-03 00:56:55

solution2
0 2013-03-03 00:57:05

solution3
0 ACCEPTED 2013-03-03 00:57:26

solution4
0 2013-03-03 01:01:53
