
Doesn't iterating through entrySet() create too many Map.Entry instances?

I'm not sure whether HashMap or TreeMap stores Map.Entry objects internally. If they don't, a Map.Entry instance is probably created on the fly every time entrySet().iterator().next() is called.

Personally, I think it may be better in this form:

class Entry {
    Object key;
    Object value;
}

interface InplaceIterator {
    boolean next();
}

Entry entryBuf = new Entry();
InplaceIterator it = map.entrySet().inplaceIterator(entryBuf);
while (it.next()) {
    // do with entryBuf...
}

Thus, the creation of Entry is avoided.

I don't know how the Java compiler works. Will it optimize away the creation of Map.Entry by analyzing the data flow and recognizing that the Map.Entry can be safely reused?

Or has someone already written another collection framework that supports in-place iteration?

What you describe (having an iterator-local Map.Entry object and reusing it for all next() return values) is one possible Map implementation, and I think some special-purpose maps are using this.
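A minimal sketch of that idea, assuming a hypothetical iterator over two parallel arrays (the class name ReusingEntryIterator and the array layout are made up for illustration and are not taken from any JDK class). The iterator implements Map.Entry itself and returns itself from next(), so no entry object is allocated per step:

```java
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical iterator that doubles as the entry it returns. */
class ReusingEntryIterator<K, V> implements Iterator<Map.Entry<K, V>>, Map.Entry<K, V> {
    private final K[] keys;
    private final V[] values;
    private int index = 0;   // position of the next element
    private int last = -1;   // position of the element last returned

    ReusingEntryIterator(K[] keys, V[] values) {
        if (keys.length != values.length)
            throw new IllegalArgumentException("keys and values differ in length");
        this.keys = keys;
        this.values = values;
    }

    public boolean hasNext() { return index < keys.length; }

    public Map.Entry<K, V> next() {
        if (!hasNext())
            throw new NoSuchElementException();
        last = index++;
        return this;         // the same object on every call
    }

    public void remove() { throw new UnsupportedOperationException(); }

    public K getKey()   { check(); return keys[last]; }
    public V getValue() { check(); return values[last]; }

    public V setValue(V value) {
        check();
        V old = values[last];
        values[last] = value; // writes through to the backing array
        return old;
    }

    private void check() {
        if (last < 0)
            throw new IllegalStateException("next() has not been called");
    }

    public static void main(String[] args) {
        String[] k = {"a", "b"};
        Integer[] v = {1, 2};
        Iterator<Map.Entry<String, Integer>> it = new ReusingEntryIterator<String, Integer>(k, v);
        while (it.hasNext()) {
            Map.Entry<String, Integer> e = it.next();
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

The price of this design is that a reference kept from an earlier next() call silently starts showing the current position.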

For example, the implementation of EnumMap.entrySet().iterator() (here the version from OpenJDK, 1.6.0_20) simply uses the iterator object itself as the Entry object returned by the next() method:

/**
 * Since we don't use Entry objects, we use the Iterator itself as entry.
 */
private class EntryIterator extends EnumMapIterator<Map.Entry<K,V>>
    implements Map.Entry<K,V>
{
    public Map.Entry<K,V> next() {
        if (!hasNext())
            throw new NoSuchElementException();
        lastReturnedIndex = index++;
        return this;
    }

    public K getKey() {
        checkLastReturnedIndexForEntryUse();
        return keyUniverse[lastReturnedIndex];
    }

    public V getValue() {
        checkLastReturnedIndexForEntryUse();
        return unmaskNull(vals[lastReturnedIndex]);
    }

    public V setValue(V value) {
        checkLastReturnedIndexForEntryUse();
        V oldValue = unmaskNull(vals[lastReturnedIndex]);
        vals[lastReturnedIndex] = maskNull(value);
        return oldValue;
    }

    // equals, hashCode, toString

    private void checkLastReturnedIndexForEntryUse() {
        if (lastReturnedIndex < 0)
            throw new IllegalStateException("Entry was removed");
    }
}

This is possible because the Map.Entry specification states (note in particular the sentence about entries being valid only for the duration of the iteration):

A map entry (key-value pair). The Map.entrySet method returns a collection-view of the map, whose elements are of this class. The only way to obtain a reference to a map entry is from the iterator of this collection-view. These Map.Entry objects are valid only for the duration of the iteration; more formally, the behavior of a map entry is undefined if the backing map has been modified after the entry was returned by the iterator, except through the setValue operation on the map entry.

If you want all entries at once, you'll have to use map.entrySet().toArray(), which may create immutable copies of the entries.
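Whether a given iterator reuses its entry object depends on the map class and the JDK version, so code that keeps entries beyond the current iteration step can copy them explicitly. A small sketch, assuming a helper named snapshot (hypothetical, not a JDK method) that copies each entry into an AbstractMap.SimpleImmutableEntry:

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

class EntrySnapshotDemo {
    enum Color { RED, GREEN, BLUE }

    /** Copies every entry so the result stays valid after the iteration ends. */
    static <K, V> List<Map.Entry<K, V>> snapshot(Map<K, V> map) {
        List<Map.Entry<K, V>> copies = new ArrayList<Map.Entry<K, V>>(map.size());
        for (Map.Entry<K, V> e : map.entrySet()) {
            // The copy captures key and value now; it no longer depends on the iterator.
            copies.add(new AbstractMap.SimpleImmutableEntry<K, V>(e));
        }
        return copies;
    }

    public static void main(String[] args) {
        Map<Color, Integer> map = new EnumMap<Color, Integer>(Color.class);
        map.put(Color.RED, 1);
        map.put(Color.BLUE, 3);
        for (Map.Entry<Color, Integer> e : snapshot(map)) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

The copies are snapshots: later changes to the map do not show up in them, and their setValue() is unsupported.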


Here are some more observations about the standard maps (all in OpenJDK 1.6.0_20, as found in Ubuntu's openjdk6-source package):

  • The general-purpose maps HashMap and TreeMap (as well as the legacy Hashtable) already use some kind of Entry objects as part of their internal structure (the table or tree), so they simply let these objects implement Map.Entry and return them. They are not created on the fly by the iterator.

    The same holds for WeakHashMap. There, holding a strong reference to an Entry object does not prevent its key from being garbage-collected, if I understand correctly; but until you call next() again, the iterator itself holds the key of the current entry.

  • IdentityHashMap internally uses a plain Object[] with alternating keys and values, so there are no entry objects here either, and the iterator is again reused as the entry.

  • ConcurrentSkipListMap uses Node objects which do not implement any interface, so its iterators return new AbstractMap.SimpleImmutableEntry<K,V>(n.key, v). This implies you can't use their setValue() method, as explained in the class documentation:

    All Map.Entry pairs returned by methods in this class and its views represent snapshots of mappings at the time they were produced. They do not support the Entry.setValue method. (Note however that it is possible to change mappings in the associated map using put, putIfAbsent, or replace, depending on exactly which effect you need.)

  • ConcurrentHashMap internally uses a HashEntry class, analogous to the one in HashMap, but it does not implement Map.Entry. Additionally, there is an internal class WriteThroughEntry (extending AbstractMap.SimpleEntry), whose setValue() method delegates to the put method of the map. The iterator returns new objects of this WriteThroughEntry class.
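The practical difference between these implementations shows up when calling setValue() on an entry taken from the iterator. A small probe, assuming a helper named supportsSetValue (hypothetical, written only for this illustration):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

class EntrySetValueDemo {
    /** Returns true if setValue() on an iterated entry is supported and reaches the map. */
    static boolean supportsSetValue(Map<String, Integer> map) {
        map.put("k", 1);
        Map.Entry<String, Integer> e = map.entrySet().iterator().next();
        try {
            e.setValue(2);
        } catch (UnsupportedOperationException ex) {
            return false;            // snapshot entry, as documented for ConcurrentSkipListMap
        }
        return map.get("k") == 2;    // did the change write through?
    }

    public static void main(String[] args) {
        System.out.println("HashMap: " + supportsSetValue(new HashMap<String, Integer>()));
        System.out.println("TreeMap: " + supportsSetValue(new TreeMap<String, Integer>()));
        System.out.println("ConcurrentHashMap: "
                + supportsSetValue(new ConcurrentHashMap<String, Integer>()));
        System.out.println("ConcurrentSkipListMap: "
                + supportsSetValue(new ConcurrentSkipListMap<String, Integer>()));
    }
}
```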

Usually, small, short-lived objects are almost free. Consider f1 and f2:

static Entry f1(int i){ return new Entry(i); }

static Entry entry = new Entry(0);
static Entry f2(int i){ entry.i=i; return entry; }

static class Entry
{
    Entry(int i){ this.i=i; }
    int i;
    int get(){ return i; }
}

This is a realistic test case of the problem you described - reusing the same object per iteration, vs. creating a new object per iteration. In both cases, some data is saved in the object, carried over to the call site to be read.

Let's profile it: retrieve a billion entries and read the data stored in each, in three different ways:

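    int r = 0;
    for (int i = 0; i < 1000000000; i++)
    {
        // enable exactly one of the following three lines per run
        r += i;               // test0: baseline, no Entry at all
        // r += f1(i).get();  // test1: new Entry per iteration
        // r += f2(i).get();  // test2: one shared Entry, reused
    }
    System.out.println(r);    // use the result so the loop is not optimized away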

The result I got: test2 is as fast as test0, and test1 is slower than test2 by only about one CPU cycle per iteration. (I guess the difference is a few machine instructions, which the CPU pipelines into one cycle.)

If you still don't believe it, fully implement your proposed "efficient" solution, compare it to the presumably "wasteful" implementation, and see the difference for yourself. You will be amazed.
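To try this yourself, here is a rough, self-contained harness for the f1/f2 comparison. The class name AllocationBench, the warm-up count, and the use of System.nanoTime() are my own choices; a naive timing loop like this only gives a ballpark figure, and a dedicated harness such as JMH would be more trustworthy. HotSpot can often eliminate such short-lived allocations through escape analysis, but whether it does so here depends on the JVM and its flags.

```java
class AllocationBench {
    static final class Entry {
        int i;
        Entry(int i) { this.i = i; }
        int get() { return i; }
    }

    static final Entry shared = new Entry(0);

    static Entry f1(int i) { return new Entry(i); }         // fresh object per call
    static Entry f2(int i) { shared.i = i; return shared; } // one object, reused

    static int sumFresh(int n) {
        int r = 0;
        for (int i = 0; i < n; i++) r += f1(i).get();
        return r;
    }

    static int sumReused(int n) {
        int r = 0;
        for (int i = 0; i < n; i++) r += f2(i).get();
        return r;
    }

    public static void main(String[] args) {
        int n = 1000000000;
        // Warm up so the JIT has compiled both loops before timing starts.
        for (int w = 0; w < 5; w++) { sumFresh(1000000); sumReused(1000000); }

        long t0 = System.nanoTime();
        int a = sumFresh(n);
        long t1 = System.nanoTime();
        int b = sumReused(n);
        long t2 = System.nanoTime();

        // Printing the sums keeps the loops from being discarded as dead code.
        System.out.println("fresh:  " + (t1 - t0) / 1000000 + " ms, r=" + a);
        System.out.println("reused: " + (t2 - t1) / 1000000 + " ms, r=" + b);
    }
}
```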

Google Collections' ArrayListMultimap is fairly efficient and isn't resource-intensive: http://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/collect/ArrayListMultimap.html

Creating a Multimap

private Multimap<Integer, String> store = ArrayListMultimap.create();

Iterating the Multimap

for (Map.Entry<Integer, String> entry: store.entries()) {}

And if you'd rather avoid Map.Entry, then extract the keyset and go from there:

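List<Integer> keys = new ArrayList<Integer>(store.keySet());
for (Integer key : keys) {  // the keys are Integer, not Long
    // store.get(key) returns a view of the multimap; copy it for an independent list
    List<String> storedStrings = new ArrayList<String>(store.get(key));
}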
