Unexplained Java HashMap behavior

Question

In the code below, I create a HashMap to store objects called Datum, each of which contains a String (location) and a count. Unfortunately, the code behaves very strangely.

            FileSystem fs = FileSystem.get(new Configuration());
            Random r = new Random();
            FSDataOutputStream fsdos = fs.create(new Path("error/" + r.nextInt(1000000)));

            HashMap<String, Datum> datums = new HashMap<String, Datum>();
            while (itrtr.hasNext()) {
                Datum next = itrtr.next();
                synchronized (datums) {
                    if (!datums.containsKey(next.location)) {
                        fsdos.writeUTF("INSERTING: " + next + "\n");
                        datums.put(next.location, next);
                    } else {
                    } // skip those that are already indexed
                }
            }
            for (Datum d : datums.values()) {
                fsdos.writeUTF("PRINT DATUM VALUES: " + d.toString() + "\n");
            }

The HashMap uses Strings as keys.

This is the output I get in the error file (example):

INSERTING: (test.txt,3)

INSERTING: (test2.txt,1)

PRINT DATUM VALUES: (test.txt,3)

PRINT DATUM VALUES: (test.txt,3)

The correct output for the print should be:
INSERTING: (test.txt,3)

INSERTING: (test2.txt,1)

PRINT DATUM VALUES: (test.txt,3)

PRINT DATUM VALUES: (test2.txt,1)

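What happened to the Datum with test2.txt as its location? Why was it replaced by test.txt?

Basically, I should never see the same location twice (that is what !datums.containsKey is checking for). Unfortunately, I am getting very strange behavior.

By the way, this runs in a reducer on Hadoop.

I tried adding the synchronized block in case this was running in multiple threads, which, as far as I know, it is not. Still, the same thing happened.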

Answer 1

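The problem is not the map but the line datums.put(next.location, next); it inserts a reference to an object whose contents are changed later :) That is why, in the end, all values in the map are equal to the last Datum processed.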
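The aliasing described above can be reproduced without Hadoop. The following is only a minimal sketch: the Datum class is a hypothetical stand-in for the one in the question (its real definition is not shown), and the in-place mutation imitates what a value-reusing iterator would do between two calls to put.

```java
import java.util.HashMap;
import java.util.Map;

public class AliasingDemo {
    // Hypothetical stand-in for the question's Datum class (assumed mutable fields).
    static class Datum {
        String location;
        int count;

        Datum(String location, int count) {
            this.location = location;
            this.count = count;
        }

        @Override
        public String toString() {
            return "(" + location + "," + count + ")";
        }
    }

    // Stores the SAME Datum instance under two keys, mutating it in between.
    static Map<String, Datum> buildWithSharedInstance() {
        Map<String, Datum> datums = new HashMap<>();
        Datum shared = new Datum("test.txt", 3);
        datums.put(shared.location, shared);

        // Overwrite the fields in place, as a reusing iterator would.
        shared.location = "test2.txt";
        shared.count = 1;
        datums.put(shared.location, shared);
        return datums;
    }

    public static void main(String[] args) {
        // Both entries point at one object, so both lines show its latest state.
        for (Datum d : buildWithSharedInstance().values()) {
            System.out.println("PRINT DATUM VALUES: " + d);
        }
    }
}
```

The map ends up with two distinct keys, but both keys refer to one object, so iterating over values() shows the same contents twice.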

Answer 2

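According to this answer, Hadoop's iterator always returns the same object rather than creating a new one on each pass through the loop.

Holding on to a reference to the object returned by the iterator is therefore invalid and will produce surprising results. You need to copy the data into a new object: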

        while (itrtr.hasNext()) {
            Datum next = itrtr.next();
            // copy any values from the Datum to a fresh instance
            Datum insert = new Datum(next.location, next.value);
            if (!datums.containsKey(insert.location)) {
                datums.put(insert.location, insert);
            }
        }

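Here is a quote from the Hadoop Reducer documentation that confirms this:

The framework will reuse the key and value objects that are passed into the reduce, therefore the application should clone the objects they want to keep a copy of.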
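To see the reuse and the copy-based fix together outside of a cluster, here is a self-contained sketch. Everything in it is an assumption made for illustration: Datum and its copy constructor are hypothetical, and reusingIterator only imitates the documented reuse behavior; it is not Hadoop's actual iterator.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;

public class ReusedValueDemo {
    // Hypothetical stand-in for the question's Datum class.
    static class Datum {
        String location;
        int count;

        Datum() {}

        // Copy constructor: takes a snapshot of another Datum's current state.
        Datum(Datum other) {
            this.location = other.location;
            this.count = other.count;
        }

        @Override
        public String toString() {
            return "(" + location + "," + count + ")";
        }
    }

    // Imitates a framework iterator that hands back one instance and overwrites it on each call.
    static Iterator<Datum> reusingIterator(String[] locations, int[] counts) {
        final Datum reused = new Datum();
        return new Iterator<Datum>() {
            private int i = 0;

            @Override
            public boolean hasNext() {
                return i < locations.length;
            }

            @Override
            public Datum next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                reused.location = locations[i];
                reused.count = counts[i];
                i++;
                return reused; // same object every time
            }
        };
    }

    // Keeps the first Datum seen per location, storing a copy rather than the reused instance.
    static Map<String, Datum> index(Iterator<Datum> it) {
        Map<String, Datum> datums = new HashMap<>();
        while (it.hasNext()) {
            Datum next = it.next();
            if (!datums.containsKey(next.location)) {
                datums.put(next.location, new Datum(next));
            }
        }
        return datums;
    }

    public static void main(String[] args) {
        Iterator<Datum> it = reusingIterator(
                new String[] {"test.txt", "test2.txt"}, new int[] {3, 1});
        for (Datum d : index(it).values()) {
            System.out.println("PRINT DATUM VALUES: " + d);
        }
    }
}
```

Because each stored value is a fresh copy, later overwrites of the reused instance no longer affect entries already in the map.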

Answer 1
2 2013-12-16 19:10:43

Answer 2
2 Accepted 2013-12-16 19:32:47