Collect results from a parallel stream

I have a piece of code like this:

List<Egg> eggs = hens.parallelStream().map(hen -> {
    // Named henEggs: if the outer "eggs" is a local variable, a lambda body may not redeclare it.
    ArrayList<Egg> henEggs = new ArrayList<>();
    while (hen.hasEgg()) {
        henEggs.add(hen.getEgg());
    }
    return henEggs;
}).flatMap(Collection::stream).collect(Collectors.toList());

But this way I have to create an ArrayList for every hen, and eggs are not collected until a hen is 100% processed. I would like something like this:

List<Egg> eggs = hens.parallelStream().map(hen -> {
    while (hen.hasEgg()) {
        yield return hen.getEgg();
    }
}).collect(Collectors.toList());

But Java does not have yield return. Is there a way to implement it?

Your Hen class is poorly adapted to the Stream API. Provided that you cannot change it and it has no other useful methods (like Collection<Egg> getAllEggs() or Iterator<Egg> eggIterator()), you can create an egg stream like this:

public static Stream<Egg> eggs(Hen hen) {
    Iterator<Egg> it = new Iterator<Egg>() {
        @Override
        public boolean hasNext() {
            return hen.hasEgg();
        }

        @Override
        public Egg next() {
            return hen.getEgg();
        }
    };
    return StreamSupport.stream(Spliterators.spliteratorUnknownSize(it, 0), false);
}

Now you can use it in the following manner:

List<Egg> eggs = hens.parallelStream()
                     .flatMap(hen -> eggs(hen))
                     .collect(Collectors.toList());

Of course, a better Stream implementation might be possible if you can change the Hen class.
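To try the iterator-based approach end to end, here is a minimal, self-contained sketch. The Egg and Hen classes below are hypothetical stand-ins (the question does not show the real ones); only hasEgg() and getEgg() are taken from the question. The iterator also throws NoSuchElementException when exhausted, which the Iterator contract asks for but the snippet above omits.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class EggStreamDemo {
    // Hypothetical stand-in for the question's Egg type.
    static final class Egg {
        final String label;
        Egg(String label) { this.label = label; }
    }

    // Hypothetical stand-in for the question's Hen type: a stateful source of eggs.
    static final class Hen {
        private final Deque<Egg> pending = new ArrayDeque<>();
        Hen(String name, int eggCount) {
            for (int i = 0; i < eggCount; i++) {
                pending.add(new Egg(name + "-" + i));
            }
        }
        boolean hasEgg() { return !pending.isEmpty(); }
        Egg getEgg() { return pending.poll(); }
    }

    // Adapts the hasEgg()/getEgg() protocol to a sequential Stream<Egg>.
    static Stream<Egg> eggs(Hen hen) {
        Iterator<Egg> it = new Iterator<Egg>() {
            @Override public boolean hasNext() { return hen.hasEgg(); }
            @Override public Egg next() {
                if (!hen.hasEgg()) {
                    throw new NoSuchElementException();
                }
                return hen.getEgg();
            }
        };
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false);
    }

    // Hens are processed in parallel; each hen's eggs are drained by a single thread.
    static List<String> collectLabels(List<Hen> hens) {
        return hens.parallelStream()
                .flatMap(EggStreamDemo::eggs)
                .map(egg -> egg.label)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Hen> hens = Arrays.asList(new Hen("a", 2), new Hen("b", 0), new Hen("c", 3));
        System.out.println(collectLabels(hens)); // prints [a-0, a-1, c-0, c-1, c-2]
    }
}
```

Because the source list is ordered and Collectors.toList() respects encounter order, the eggs come out grouped by hen even though the hens are processed in parallel.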

The iteration logic using hasEgg() and getEgg() is stateful, as these methods' results depend on previous invocations. Therefore, processing a single Hen can't be parallelized unless you manage to change the interface completely.

That said, your worry about the ArrayList is unnecessary. When the stream implementation executes the collect operation in parallel, it has to buffer the values for each thread anyway and combine these buffers afterwards. It might even be the case that the operation doesn't benefit from parallel execution at all.
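The buffering described above can be observed with the three-argument collect, which makes the per-task container explicit. The following is only an illustrative sketch: the class name ParallelCollectBuffers and the counter are made up for this example, and the number of containers created depends on the JVM, the common pool size, and how the source is split.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelCollectBuffers {
    // Counts how many intermediate containers the stream asked for.
    static final AtomicInteger containersCreated = new AtomicInteger();

    static List<Integer> collectCounting(List<Integer> source) {
        containersCreated.set(0);
        Supplier<ArrayList<Integer>> supplier = () -> {
            containersCreated.incrementAndGet();
            return new ArrayList<>();
        };
        // supplier: creates a fresh buffer for each chunk of the source
        // ArrayList::add: accumulates elements into that buffer
        // ArrayList::addAll: merges two buffers when chunks are combined
        return source.parallelStream().collect(supplier, ArrayList::add, ArrayList::addAll);
    }

    public static void main(String[] args) {
        List<Integer> source = IntStream.range(0, 10_000).boxed().collect(Collectors.toList());
        List<Integer> result = collectCounting(source);
        System.out.println("elements collected: " + result.size());
        System.out.println("intermediate containers: " + containersCreated.get());
    }
}
```

On a multi-core machine the second number is typically well above one, which is the point made above: the parallel collect already allocates and merges several temporary lists, so one extra ArrayList per hen is unlikely to be the dominant cost.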

What you can do is replace the ArrayList with a Stream.Builder, as it's optimized for the use case of only adding elements until the Stream is constructed:

List<Egg> eggs = hens.parallelStream().flatMap(hen -> {
    Stream.Builder<Egg> eggStream = Stream.builder();
    while(hen.hasEgg()) {
        eggStream.add(hen.getEgg());
    }
    return eggStream.build();
}).collect(Collectors.toList());

Assuming the existence of a getEggs() method, you can use the following to collect all of the eggs.

List<Egg> eggs = hens.parallelStream()
    .filter(Hen::hasEggs)
    .map(Hen::getEggs)
    .collect(ArrayList::new, ArrayList::addAll, ArrayList::addAll);

The code assumes that getEggs() returns a Collection. You could eliminate the filter(Hen::hasEggs) if getEggs() returns an empty Collection when the Hen has no Eggs.
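As a rough sketch of that variant, the example below uses a hypothetical Hen whose getEggs() returns an empty collection when there are no eggs, so the filter is dropped. The Hen and Egg classes and the method names withAddAll and withFlatMap are assumptions made for illustration. A flatMap version is shown alongside for comparison; both should yield the same list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class GetEggsDemo {
    // Hypothetical stand-in for the question's Egg type.
    static final class Egg {
        final String label;
        Egg(String label) { this.label = label; }
    }

    // Hypothetical Hen that exposes all of its eggs at once.
    static final class Hen {
        private final List<Egg> eggs = new ArrayList<>();
        Hen(String name, int eggCount) {
            for (int i = 0; i < eggCount; i++) {
                eggs.add(new Egg(name + "-" + i));
            }
        }
        // Returns an empty collection (never null) when the hen has no eggs.
        Collection<Egg> getEggs() { return Collections.unmodifiableList(eggs); }
    }

    // Three-argument collect: each hen's collection is appended to a buffer,
    // and buffers are merged with addAll when parallel chunks are combined.
    static List<Egg> withAddAll(List<Hen> hens) {
        return hens.parallelStream()
                .map(Hen::getEggs)
                .collect(ArrayList::new, ArrayList::addAll, ArrayList::addAll);
    }

    // Equivalent formulation: flatten each hen's collection into the stream.
    static List<Egg> withFlatMap(List<Hen> hens) {
        return hens.parallelStream()
                .flatMap(hen -> hen.getEggs().stream())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Hen> hens = Arrays.asList(new Hen("a", 2), new Hen("b", 0), new Hen("c", 1));
        System.out.println(withAddAll(hens).size());  // prints 3
        System.out.println(withFlatMap(hens).size()); // prints 3
    }
}
```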
