Doesn't iterating through entrySet() create too many Map.Entry instances?

I'm not sure whether HashMap or TreeMap stores Map.Entry objects internally. That is, the map may well return a Map.Entry instance created on the fly each time entrySet().iterator().next() is called.

Personally, I think it may be better in this form:

class Entry {
    Object key;
    Object value;
}

interface InplaceIterator {
    boolean next();
}

Entry entryBuf = new Entry();
InplaceIterator it = map.entrySet().inplaceIterator(entryBuf);
while (it.next()) {
    // do with entryBuf...
}

Thus, the creation of Entry is avoided.

I don't know how the Java compiler works. Will it optimize away the creation of Map.Entry by analyzing the data flow and working out that the Map.Entry can be safely reused?

Or has someone already written another collection framework that enables in-place iteration?

What you describe (having an iterator-local Map.Entry object and reusing it for all next() return values) is one possible Map implementation, and I think some special-purpose maps use it.

For example, the implementation of EnumMap.entrySet().iterator() (here the version from OpenJDK 1.6.0_20) simply uses the iterator object itself as the Entry object returned by the next() method:

/**
 * Since we don't use Entry objects, we use the Iterator itself as entry.
 */
private class EntryIterator extends EnumMapIterator<Map.Entry<K,V>>
    implements Map.Entry<K,V>
{
    public Map.Entry<K,V> next() {
        if (!hasNext())
            throw new NoSuchElementException();
        lastReturnedIndex = index++;
        return this;
    }

    public K getKey() {
        checkLastReturnedIndexForEntryUse();
        return keyUniverse[lastReturnedIndex];
    }

    public V getValue() {
        checkLastReturnedIndexForEntryUse();
        return unmaskNull(vals[lastReturnedIndex]);
    }

    public V setValue(V value) {
        checkLastReturnedIndexForEntryUse();
        V oldValue = unmaskNull(vals[lastReturnedIndex]);
        vals[lastReturnedIndex] = maskNull(value);
        return oldValue;
    }

    // equals, hashCode, toString

    private void checkLastReturnedIndexForEntryUse() {
        if (lastReturnedIndex < 0)
            throw new IllegalStateException("Entry was removed");
    }
}

This is possible because the Map.Entry specification states (note in particular the sentence about how long the entries are valid):

A map entry (key-value pair). The Map.entrySet method returns a collection-view of the map, whose elements are of this class. The only way to obtain a reference to a map entry is from the iterator of this collection-view. These Map.Entry objects are valid only for the duration of the iteration; more formally, the behavior of a map entry is undefined if the backing map has been modified after the entry was returned by the iterator, except through the setValue operation on the map entry.

If you want all entries at once, you'll have to use map.entrySet().toArray(), which may create immutable copies of the entries.
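To make the consequence concrete, here is a minimal sketch of what can go wrong when entries from a self-returning iterator are kept beyond the iteration, and how copying each one avoids it. The class ReusedEntryDemo, its SelfEntryIterator, and the two collect methods are hypothetical names written for this illustration (Java 8+ assumed); they imitate the EnumMap code above and are not JDK classes. Only AbstractMap.SimpleImmutableEntry is a real JDK type.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical demo: an iterator over parallel arrays that doubles as the entry. */
class ReusedEntryDemo {

    /** Returns itself from next(), in the style of the EnumMap iterator quoted above. */
    static final class SelfEntryIterator
            implements Iterator<Map.Entry<String, Integer>>, Map.Entry<String, Integer> {
        private final String[] keys;
        private final Integer[] vals;
        private int index = 0;
        private int last = -1;

        SelfEntryIterator(String[] keys, Integer[] vals) {
            this.keys = keys;
            this.vals = vals;
        }

        @Override public boolean hasNext() { return index < keys.length; }

        @Override public Map.Entry<String, Integer> next() {
            if (!hasNext()) throw new NoSuchElementException();
            last = index++;
            return this; // no allocation: the same object on every call
        }

        @Override public String getKey() { return keys[last]; }
        @Override public Integer getValue() { return vals[last]; }
        @Override public Integer setValue(Integer value) {
            Integer old = vals[last];
            vals[last] = value;
            return old;
        }
    }

    /** Keeps the references returned by next(): every element is the same object. */
    static List<Map.Entry<String, Integer>> collectRaw(String[] keys, Integer[] vals) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        Iterator<Map.Entry<String, Integer>> it = new SelfEntryIterator(keys, vals);
        while (it.hasNext()) out.add(it.next());
        return out;
    }

    /** Copies each entry into an immutable snapshot before storing it. */
    static List<Map.Entry<String, Integer>> collectSnapshots(String[] keys, Integer[] vals) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        Iterator<Map.Entry<String, Integer>> it = new SelfEntryIterator(keys, vals);
        while (it.hasNext()) {
            out.add(new AbstractMap.SimpleImmutableEntry<String, Integer>(it.next()));
        }
        return out;
    }

    public static void main(String[] args) {
        String[] keys = {"a", "b", "c"};
        Integer[] vals = {1, 2, 3};
        // Every raw element reports the last position the iterator reached.
        for (Map.Entry<String, Integer> e : collectRaw(keys, vals)) {
            System.out.println("raw: " + e.getKey() + "=" + e.getValue());
        }
        // The snapshots keep the pair that was current when they were copied.
        for (Map.Entry<String, Integer> e : collectSnapshots(keys, vals)) {
            System.out.println("snapshot: " + e.getKey() + "=" + e.getValue());
        }
    }
}
```

With the three pairs in main, the raw list prints c=3 three times, while the snapshot list prints a=1, b=2, c=3. The copies cost one allocation per entry, which is exactly what the reusing iterator was trying to avoid, so they are only worth making when the entries must outlive the loop.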


Here are some more observations about the default maps (all in OpenJDK 1.6.0_20, as found in Ubuntu's openjdk6-source package):

  • The general-purpose maps HashMap and TreeMap (as well as the legacy Hashtable) already use some kind of Entry objects as part of their internal structure (the table or tree), so they simply let these objects implement Map.Entry and return them. They are not created on the fly by the Iterator.

    The same holds for WeakHashMap (where holding an Entry object through a strong reference does not prevent its key from being garbage-collected, if I understand correctly; however, until you call next() on the iterator again, the iterator itself holds the key of the current entry).

  • IdentityHashMap internally uses a simple Object[] with alternating keys and values, so there are no entry objects here either, and the iterator is again reused as the entry.

  • ConcurrentSkipListMap uses Node objects which do not implement any interface, so its iterators return new AbstractMap.SimpleImmutableEntry<K,V>(n.key, v). This implies you can't use their setValue() method, as explained in the class documentation:

    All Map.Entry pairs returned by methods in this class and its views represent snapshots of mappings at the time they were produced. They do not support the Entry.setValue method. (Note however that it is possible to change mappings in the associated map using put, putIfAbsent, or replace, depending on exactly which effect you need.)

  • ConcurrentHashMap internally uses a HashEntry class, analogous to the one in HashMap, but this class does not implement any interface. Additionally, there is an internal class WriteThroughEntry (extending AbstractMap.SimpleEntry), whose setValue() method delegates to the put method of the map. The iterator returns new objects of this WriteThroughEntry class.
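The visible difference between these implementations is whether setValue() on an iterated entry reaches the map. The following sketch probes that from the outside; SetValueDemo and setValueWritesThrough are names invented for this example, and it relies only on the documented behaviour of the entry types, not on any particular JDK's internals.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

/** Hypothetical probe: does Entry.setValue() on an iterated entry change the map? */
class SetValueDemo {

    /**
     * Puts one mapping into the given (empty) map, calls setValue(2) on the
     * entry obtained from the entry-set iterator, and reports whether the map
     * now holds the new value. Snapshot entries throw instead.
     */
    static boolean setValueWritesThrough(Map<String, Integer> map) {
        map.put("k", 1);
        Map.Entry<String, Integer> entry = map.entrySet().iterator().next();
        try {
            entry.setValue(2);
        } catch (UnsupportedOperationException e) {
            return false; // immutable snapshot, e.g. SimpleImmutableEntry
        }
        return Integer.valueOf(2).equals(map.get("k"));
    }

    public static void main(String[] args) {
        System.out.println("HashMap:               "
                + setValueWritesThrough(new HashMap<String, Integer>()));
        System.out.println("TreeMap:               "
                + setValueWritesThrough(new TreeMap<String, Integer>()));
        System.out.println("ConcurrentHashMap:     "
                + setValueWritesThrough(new ConcurrentHashMap<String, Integer>()));
        System.out.println("ConcurrentSkipListMap: "
                + setValueWritesThrough(new ConcurrentSkipListMap<String, Integer>()));
    }
}
```

The first three report true and ConcurrentSkipListMap reports false, matching the class documentation quoted above. Whether an implementation allocates a fresh entry per next() call is a separate question that this probe does not answer, and it may differ between JDK versions.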

Usually, small, short-lived objects are almost free. Consider f1 and f2:

static Entry f1(int i){ return new Entry(i); }

static Entry entry = new Entry(0);
static Entry f2(int i){ entry.i=i; return entry; }

static class Entry
{
    Entry(int i){ this.i=i; }
    int i;
    int get(){ return i; }
}

This is a realistic test case of the problem you described: reusing the same object per iteration vs. creating a new object per iteration. In both cases, some data is saved in the object and carried over to the call site to be read.

Let's profile it: retrieve a billion entries and read the data stored in each, in three different ways:

    int r = 0;
    for(int i=0; i<1000000000; i++)
    {
    test0:  r += i;
    test1:  r += f1(i).get();
    test2:  r += f2(i).get();
    } 
    print(r);

The numbers I got: test2 is as fast as test0; test1 is slower than test2 by only one CPU cycle per iteration. (I guess the difference is a few machine instructions, and the CPU pipelines them in one cycle.)

If you still don't believe it, implement your proposed "efficient" solution in full, compare it to the presumably "wasteful" implementation, and see the difference for yourself. You will be amazed.
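One rough way to run that comparison is sketched below. The class AllocVsReuse and its sumPlain, sumFresh, and sumReused methods are names made up for this example, and the timing is a naive System.nanoTime() loop, so treat the numbers as indicative only; a harness such as JMH would be more trustworthy. HotSpot's JIT can also remove some short-lived allocations entirely through escape analysis, which may be part of why the allocating version costs so little.

```java
/** Hypothetical micro-benchmark: fresh object per call vs. one shared object. */
class AllocVsReuse {

    static final class Entry {
        int i;
        Entry(int i) { this.i = i; }
        int get() { return i; }
    }

    /** Allocates a new Entry for every call. */
    static Entry f1(int i) { return new Entry(i); }

    /** Reuses a single shared Entry (not thread-safe). */
    static final Entry SHARED = new Entry(0);
    static Entry f2(int i) { SHARED.i = i; return SHARED; }

    /** Baseline: no object involved. */
    static int sumPlain(int n) {
        int r = 0;
        for (int i = 0; i < n; i++) r += i;
        return r;
    }

    static int sumFresh(int n) {
        int r = 0;
        for (int i = 0; i < n; i++) r += f1(i).get();
        return r;
    }

    static int sumReused(int n) {
        int r = 0;
        for (int i = 0; i < n; i++) r += f2(i).get();
        return r;
    }

    public static void main(String[] args) {
        int n = 100_000_000;
        // Warm up so the JIT has compiled all three loops before timing.
        sumPlain(n / 10);
        sumFresh(n / 10);
        sumReused(n / 10);

        long t0 = System.nanoTime();
        int a = sumPlain(n);
        long t1 = System.nanoTime();
        int b = sumFresh(n);
        long t2 = System.nanoTime();
        int c = sumReused(n);
        long t3 = System.nanoTime();

        // Printing the sums keeps the JIT from discarding the loops as dead code.
        System.out.println("plain:  " + (t1 - t0) / 1_000_000 + " ms, sum=" + a);
        System.out.println("fresh:  " + (t2 - t1) / 1_000_000 + " ms, sum=" + b);
        System.out.println("reused: " + (t3 - t2) / 1_000_000 + " ms, sum=" + c);
    }
}
```

The int sums overflow for large n, but they overflow identically in all three variants, so they still serve as a consistency check.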

Google Collections' ArrayListMultimap is fairly efficient and isn't resource intensive: http://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/collect/ArrayListMultimap.html

Creating a Multimap

private Multimap<Integer, String> store = ArrayListMultimap.create();

Iterating the Multimap

for (Map.Entry<Integer, String> entry: store.entries()) {}

And if you'd rather avoid Map.Entry, then extract the key set and go from there:

List<Integer> keys = new ArrayList<Integer>(store.keySet());
for (Integer key : keys) {
    // store.get(key) returns the values for this key; copy them into a standalone list
    List<String> storedStrings = new ArrayList<String>(store.get(key));
}
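For a plain java.util.Map rather than a Guava Multimap, the same idea can be sketched with the standard library alone. The class ForEachDemo and its two methods are illustrative names; Map.forEach requires Java 8 or later. Neither variant mentions Map.Entry in user code, although what the map does internally is up to the implementation.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical demo: visiting key/value pairs without touching Map.Entry in user code. */
class ForEachDemo {

    /** Uses Map.forEach, which hands key and value to the callback directly. */
    static int sumViaForEach(Map<String, Integer> map) {
        int[] total = {0}; // one-element array so the lambda can update it
        map.forEach((key, value) -> total[0] += value);
        return total[0];
    }

    /** Uses keySet() plus get(), at the cost of one extra lookup per key. */
    static int sumViaKeySet(Map<String, Integer> map) {
        int total = 0;
        for (String key : map.keySet()) {
            total += map.get(key);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        System.out.println(sumViaForEach(map)); // 6
        System.out.println(sumViaKeySet(map));  // 6
    }
}
```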
