
Idiomatically enumerating a Stream of objects in Java 8

How can one idiomatically enumerate a Stream<T>, mapping each T instance to a unique integer, using Java 8 stream methods (e.g. for an array T[] values, creating a Map<T,Integer> where Map.get(values[i]) == i evaluates to true)?

Currently, I'm defining an anonymous class which increments an int field for use with the Collectors.toMap(..) method:

private static <T> Map<T, Integer> createIdMap(final Stream<T> values) {
    return values.collect(Collectors.toMap(Function.identity(), new Function<T, Integer>() {

        private int nextId = 0;

        @Override
        public Integer apply(final T t) {
            return nextId++;
        }

    }));
}

However, is there not a more concise/elegant way of doing this using the Java 8 stream API? Bonus points if it can be safely parallelized.

Your approach will fail if there is a duplicate element, because Collectors.toMap(..) without a merge function throws an IllegalStateException on duplicate keys.
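A minimal sketch of that failure mode, for illustration only. The class name DuplicateKeyDemo, the helper failsOnDuplicate and the sample values are hypothetical and not part of the question:

```java
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DuplicateKeyDemo {

    // Returns true if collecting the values with the two-argument
    // Collectors.toMap fails because the same key occurs twice.
    static boolean failsOnDuplicate(String... values) {
        int[] nextId = {0}; // mutable counter, same idea as the anonymous class
        try {
            Stream.of(values)
                    .collect(Collectors.toMap(Function.identity(), v -> nextId[0]++));
            return false;
        } catch (IllegalStateException e) {
            return true; // toMap rejects duplicate keys
        }
    }

    public static void main(String[] args) {
        System.out.println(failsOnDuplicate("a", "b", "c")); // prints false
        System.out.println(failsOnDuplicate("a", "b", "a")); // prints true
    }
}
```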

Besides that, your task requires mutable state and hence can be solved with mutable reduction. When we populate a map, we can simply use the map's size to get an unused id.

The trickier part is the merge operation. The following operation simply repeats the assignments for the right map, which will handle potential duplicates.

private static <T> Map<T, Integer> createIdMap(Stream<T> values) {
    return values.collect(HashMap::new, (m,t) -> m.putIfAbsent(t,m.size()),
        (m1,m2) -> {
            if(m1.isEmpty()) m1.putAll(m2);
            else m2.keySet().forEach(t -> m1.putIfAbsent(t, m1.size()));
        });
}
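As a usage sketch, the method above can be wrapped in a small class and fed a stream that contains a duplicate. The class name IdMapWithDuplicates and the sample values are only illustrative. For a sequential stream each element gets the index of its first occurrence among the distinct elements; for a parallel stream this version only promises unique ids, since the merge step reassigns ids for the right map in HashMap iteration order.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

class IdMapWithDuplicates {

    // Mutable reduction: the next free id is the current size of the map.
    static <T> Map<T, Integer> createIdMap(Stream<T> values) {
        return values.collect(HashMap::new, (m, t) -> m.putIfAbsent(t, m.size()),
            (m1, m2) -> {
                // Merge step, only invoked for parallel streams.
                if (m1.isEmpty()) m1.putAll(m2);
                else m2.keySet().forEach(t -> m1.putIfAbsent(t, m1.size()));
            });
    }

    public static void main(String[] args) {
        Map<String, Integer> ids = createIdMap(Stream.of("a", "b", "a", "c"));
        System.out.println(ids.get("a")); // prints 0
        System.out.println(ids.get("b")); // prints 1
        System.out.println(ids.get("c")); // prints 2
    }
}
```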

If we rely on unique elements, or insert an explicit distinct(), we can use

private static <T> Map<T, Integer> createIdMap(Stream<T> values) {
    return values.distinct().collect(HashMap::new, (m,t) -> m.put(t,m.size()),
        (m1,m2) -> {
            int leftSize=m1.size();
            if(leftSize==0) m1.putAll(m2);
            else m2.forEach((t,id) -> m1.put(t, leftSize+id));
        });
}
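A short sketch of how this variant could be exercised on a parallel stream. The class name DistinctIdMap and the sample array are hypothetical. Because the merge step shifts every id of the right map by the size of the left map, an ordered stream should yield ids that follow encounter order, which is the Map.get(values[i]) == i property asked for when the values are unique.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

class DistinctIdMap {

    static <T> Map<T, Integer> createIdMap(Stream<T> values) {
        return values.distinct().collect(HashMap::new, (m, t) -> m.put(t, m.size()),
            (m1, m2) -> {
                int leftSize = m1.size();
                if (leftSize == 0) m1.putAll(m2);
                // Shift the right map's ids so they continue after the left map's.
                else m2.forEach((t, id) -> m1.put(t, leftSize + id));
            });
    }

    public static void main(String[] args) {
        String[] values = {"x", "y", "z"};
        Map<String, Integer> ids = createIdMap(Arrays.stream(values).parallel());
        for (int i = 0; i < values.length; i++) {
            // Each element is mapped to its index in the array.
            System.out.println(values[i] + " -> " + ids.get(values[i]));
        }
    }
}
```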

I would do it this way:

private static <T> Map<T, Integer> createIdMap2(final Stream<T> values) {
    List<T> list = values.collect(Collectors.toList());
    return IntStream.range(0, list.size()).boxed()
            .collect(Collectors.toMap(list::get, Function.identity()));
}

For the sake of parallelism, it can be changed to

   return IntStream.range(0, list.size()).parallel().boxed().
                (...)

Compared with the solution provided by Andremoniy, which first converts the input stream to a List, I would prefer to do it differently, because we don't know the cost of "toList()" and "list.get(i)", and it's unnecessary to create an extra List, which could be small or large.

private static <T> Map<T, Integer> createIdMap2(final Stream<T> values) {
    final MutableInt idx = MutableInt.of(0); // Or: final AtomicInteger idx = new AtomicInteger(0);        
    return values.collect(Collectors.toMap(Function.identity(), e -> idx.getAndIncrement()));
}
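MutableInt is not a JDK class, so here is a self-contained sketch of the AtomicInteger alternative mentioned in the comment above. The class name CounterIdMap is hypothetical. It assumes unique elements (duplicates make Collectors.toMap throw an IllegalStateException), and on a parallel stream the counter still hands out unique ids, but they need not match the element order.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CounterIdMap {

    static <T> Map<T, Integer> createIdMap(Stream<T> values) {
        // External counter captured by the value-mapping lambda.
        AtomicInteger idx = new AtomicInteger(0);
        return values.collect(
                Collectors.toMap(Function.identity(), e -> idx.getAndIncrement()));
    }

    public static void main(String[] args) {
        Map<String, Integer> ids = createIdMap(Stream.of("a", "b", "c"));
        System.out.println(ids.get("a")); // prints 0
        System.out.println(ids.get("c")); // prints 2
    }
}
```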

Regardless of the question, I think it's bad design to pass streams as parameters to a method.
