Iterating over a large HashMap in parallel

Question

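I have a linked hash map that can hold up to 300,000 records. I want to iterate over this map in parallel to improve performance. The function walks a map of vectors and computes the dot product of a given vector with every other vector in the map. It also performs one additional check based on a date value. The function returns a nested HashMap.

Here is the code using plain iteration: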

public HashMap<String,HashMap<String,Double>> function1(String key, int days) {
    LocalDate date = LocalDate.now().minusDays(days);
    HashMap<String,Double> ret = new HashMap<>();
    HashMap<String,Double> ret2 = new HashMap<>();
    OpenMapRealVector v0 = map.get(key).value;
    for(Map.Entry<String, FixedTimeHashMap<OpenMapRealVector>> e: map.entrySet()) {
        if(!e.getKey().equals(key)) {
            Double d = v0.dotProduct(e.getValue().value);
            d = Double.parseDouble(new DecimalFormat("###.##").format(d));
            ret.put(e.getKey(),d);
            if(e.getValue().date.isAfter(date)){
                ret2.put(e.getKey(),d);
            }
        }
    }
    HashMap<String,HashMap<String,Double>> result = new HashMap<>();
    result.put("dot",ret);
    result.put("anomaly",ret2);
    return result;
}

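Update: I looked into Java 8 streams, but when I use a parallel stream I get a ClassCastException and a NullPointerException, because this map is being modified.

Code: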

public HashMap<String,HashMap<String,Double>> function1(String key, int days) {
    LocalDate date = LocalDate.now().minusDays(days);
    HashMap<String,Double> ret = new HashMap<>();
    HashMap<String,Double> ret2 = new HashMap<>();
    OpenMapRealVector v0 = map.get(key).value;
    synchronized (map) {
        map.entrySet().parallelStream().forEach(e -> {
            if(!e.getKey().equals(key)) {
                Double d = v0.dotProduct(e.getValue().value);
                d = Double.parseDouble(new DecimalFormat("###.##").format(d));
                ret.put(e.getKey(),d);
                if(e.getValue().date.isAfter(date)) {
                    ret2.put(e.getKey(),d);
                }
            }
        });
    }
}

I have synchronized access to the map, but it still gives me the following error:

java.util.concurrent.ExecutionException: java.lang.ClassCastException
Caused by: java.lang.ClassCastException
Caused by: java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to java.util.HashMap$TreeNode

Also, I was wondering whether I should split the map into several parts and process each part in parallel on a separate thread?

Answer 1

You need to retrieve the Set<Map.Entry<K, V>> from the map.

This is how you iterate over a Map with a parallel stream in Java 8:

Map<String, String> myMap = new HashMap<> ();
myMap.entrySet ()
    .parallelStream ()
    .forEach (entry -> {
        String key = entry.getKey ();
        String value = entry.getValue ();
        // here add whatever processing you wanna do using the key / value retrieved
        // ret.put (....);
        // ret2.put (....)
    });

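Clarification:

The maps ret and ret2 should be declared as ConcurrentHashMap to allow concurrent inserts/updates from multiple threads.

So the declarations of the two maps become: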

Map<String,Double> ret = new ConcurrentHashMap<> ();
Map<String,Double> ret2 = new ConcurrentHashMap<> ();
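To tie this back to the question: the "HashMap$Node cannot be cast to HashMap$TreeNode" error is most likely caused by several worker threads calling put on the plain HashMaps ret and ret2 at the same time. The synchronized (map) block is only entered by the calling thread and does nothing to protect ret and ret2. Below is a minimal, self-contained sketch of the whole function with that change applied. It makes several assumptions that are not in the original code: Item is a hypothetical stand-in for FixedTimeHashMap<OpenMapRealVector>, vectors are plain double[] arrays, the cutoff date is passed in directly, rounding is left out, and the source map is not modified by another thread while the stream runs.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ParallelDotProducts {

    // Hypothetical stand-in for FixedTimeHashMap<OpenMapRealVector>: a vector plus a date.
    static final class Item {
        final double[] value;
        final LocalDate date;

        Item(double[] value, LocalDate date) {
            this.value = value;
            this.date = date;
        }
    }

    // Plain dot product; assumes both arrays have the same length.
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    static Map<String, Map<String, Double>> function1(
            Map<String, Item> map, String key, LocalDate cutoff) {
        // ConcurrentHashMap tolerates put() from several threads at once.
        Map<String, Double> ret = new ConcurrentHashMap<>();
        Map<String, Double> ret2 = new ConcurrentHashMap<>();
        double[] v0 = map.get(key).value;

        // Assumption: nobody modifies 'map' while this stream is running.
        map.entrySet().parallelStream().forEach(e -> {
            if (!e.getKey().equals(key)) {
                double d = dot(v0, e.getValue().value);
                ret.put(e.getKey(), d);
                if (e.getValue().date.isAfter(cutoff)) {
                    ret2.put(e.getKey(), d);
                }
            }
        });

        Map<String, Map<String, Double>> result = new HashMap<>();
        result.put("dot", ret);
        result.put("anomaly", ret2);
        return result;
    }
}
```

As for splitting the map by hand: a parallel stream already divides the entries among the threads of the common ForkJoinPool, so manual partitioning is usually not needed for a loop like this one.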

Answer 2

One possible solution using Java 8 is:

Map<String, Double> dotMap = map.entrySet().stream().filter(e -> !e.getKey().equals(key))
        .collect(Collectors.toMap(Map.Entry::getKey, e -> Double
                .parseDouble(new DecimalFormat("###.##").format(v0.dotProduct(e.getValue().value)))));
Map<String, Double> anomalyMap = map.entrySet().stream().filter(e -> !e.getKey().equals(key))
        .filter(e -> e.getValue().date.isAfter(date))
        .collect(Collectors.toMap(Map.Entry::getKey, e -> Double
                .parseDouble(new DecimalFormat("###.##").format(v0.dotProduct(e.getValue().value)))));
result.put("dot", dotMap);
result.put("anomaly", anomalyMap);
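A side note on the rounding step used above. Double.parseDouble(new DecimalFormat("###.##").format(...)) creates a new formatter for every entry and depends on the default locale: in a locale that uses a comma as the decimal separator, the formatted string would not parse. The helper below is a hypothetical alternative, not part of the original code, that rounds to two decimals with BigDecimal. It uses HALF_EVEN, which is also DecimalFormat's default rounding mode, though results on exact ties may still differ slightly from the formatter. It assumes the input is finite.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class Rounding {

    // Rounds to two decimal places without going through a locale-dependent string.
    // Assumption: d is finite; BigDecimal.valueOf rejects NaN and infinities.
    static double round2(double d) {
        return BigDecimal.valueOf(d)
                .setScale(2, RoundingMode.HALF_EVEN)
                .doubleValue();
    }
}
```

With such a helper, the value mapper in the collectors could be written as e -> Rounding.round2(v0.dotProduct(e.getValue().value)).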

Update

Here is a more elegant solution:

Map<String, Map<String, Double>> resultMap = map.entrySet().stream().filter(e -> !e.getKey().equals(key))
        .collect(Collectors.groupingBy(e -> e.getValue().date.isAfter(date) ? "anomaly" : "dot",
                Collectors.toMap(Map.Entry::getKey, e -> Double.parseDouble(
                        new DecimalFormat("###.##").format(v0.dotProduct(e.getValue().value))))));

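Here we first group the entries under "anomaly" or "dot", then use a downstream Collector to build a Map for each group. I have also updated the .filter() criteria based on the suggestion below.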
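Two caveats about the collector-based versions. They use stream(), so they run sequentially. And groupingBy puts each entry into exactly one group, so "dot" would no longer contain the entries that also qualify as "anomaly", unlike the loop in the question. The following is a rough sketch of a parallel variant that keeps the original semantics and has no shared mutable state. As assumptions for illustration only, Item stands in for FixedTimeHashMap<OpenMapRealVector>, vectors are plain double[] arrays, rounding is omitted, and the source map is not modified during the call.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

class CollectorDotProducts {

    // Hypothetical stand-in for FixedTimeHashMap<OpenMapRealVector>.
    static final class Item {
        final double[] value;
        final LocalDate date;

        Item(double[] value, LocalDate date) {
            this.value = value;
            this.date = date;
        }
    }

    // Plain dot product; assumes both arrays have the same length.
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    static Map<String, Map<String, Double>> compute(
            Map<String, Item> map, String key, LocalDate cutoff) {
        double[] v0 = map.get(key).value;

        // "dot": every entry except the query key, computed in parallel.
        // toConcurrentMap lets all worker threads fill one ConcurrentMap.
        Map<String, Double> dotMap = map.entrySet().parallelStream()
                .filter(e -> !e.getKey().equals(key))
                .collect(Collectors.toConcurrentMap(
                        Map.Entry::getKey,
                        e -> dot(v0, e.getValue().value)));

        // "anomaly": the subset of "dot" whose date is after the cutoff.
        // The dot products are reused rather than recomputed.
        Map<String, Double> anomalyMap = dotMap.entrySet().stream()
                .filter(e -> map.get(e.getKey()).date.isAfter(cutoff))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));

        Map<String, Map<String, Double>> result = new HashMap<>();
        result.put("dot", dotMap);
        result.put("anomaly", anomalyMap);
        return result;
    }
}
```

The second pass only reads values that were already computed, so the expensive dot products are still evaluated once per entry.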

2 answers

Answer 1
3 accepted 2018-06-27 16:54:55

Answer 2
2 2018-06-27 17:18:33