Add objects from stream to two different lists simultaneously

How can I add objects from one stream to two different lists simultaneously?

Currently I am doing

body.getSurroundings().parallelStream()
                .filter(o -> o.getClass().equals(ResourcePoint.class))
                .map(o -> (ResourcePoint)o)
                .filter(o -> !resourceMemory.contains(o))
                .forEach(resourceMemory::add);

to add objects from my stream into a LinkedList "resourceMemory". I also want to add the same objects to another list at the same time, but I can't find the syntax for it. Is it possible, or do I need two copies of this code, one for each list?

There are several fundamental errors you should understand first, before trying to expand your code.

First of all, forEach does not guarantee a particular order of element processing, so it's likely the wrong tool for adding to a List, even for sequential streams. It is completely wrong, however, to use it with a parallel stream to add to a collection like LinkedList, which is not thread safe, as the action will be performed concurrently.

But even if resourceMemory were a thread-safe collection, your code would still be broken, as there is interference between your filter condition and the terminal action. .filter(o -> !resourceMemory.contains(o)) queries the same list that you are modifying in the terminal action, and it shouldn't be hard to see how this can break even with thread-safe collections:

Two or more threads may process the filter and find that the element is not contained in the list; then all of them will add the element, contradicting your obvious intention of not having duplicates.

You could resort to forEachOrdered, which will perform the action in order and non-concurrently:

body.getSurroundings().parallelStream()
    .filter(o -> o instanceof ResourcePoint)
    .map(o -> (ResourcePoint)o)
    .forEachOrdered(o -> {// not recommended, just for explanation
        if(!resourceMemory.contains(o))
            resourceMemory.add(o);
    });

This will work, and it's obvious how you could add to another list within that action, but it's far from the recommended coding style. Also, the fact that this terminal action synchronizes with all processing threads will destroy any potential benefit of parallel processing, especially as the most expensive operation of this stream pipeline is invoking contains on a LinkedList, which will (must) happen single-threaded.

The correct way to collect stream elements into a list is, as the name suggests, via collect:

List<ResourcePoint> resourceMemory
    =body.getSurroundings().parallelStream()
        .filter(o -> o instanceof ResourcePoint)
        .map(o -> (ResourcePoint)o)
        .distinct()                    // no duplicates
        .collect(Collectors.toList()); // collect into a list

This doesn't return a LinkedList, but you should rethink carefully whether you really need a LinkedList. In 99% of all cases, you don't. If you really need a LinkedList, you can replace Collectors.toList() with Collectors.toCollection(LinkedList::new).
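As a minimal sketch of that replacement: the class and method names below are hypothetical, String stands in for ResourcePoint, and a plain List<Object> stands in for body.getSurroundings(), none of which come from the original code.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.stream.Collectors;

public class CollectToLinkedList {
    // Assumption: String stands in for ResourcePoint, and the parameter
    // stands in for body.getSurroundings().
    static LinkedList<String> collectDistinct(List<Object> surroundings) {
        return surroundings.parallelStream()
            .filter(o -> o instanceof String)
            .map(o -> (String) o)
            .distinct()                                          // no duplicates
            .collect(Collectors.toCollection(LinkedList::new));  // collect into a LinkedList
    }

    public static void main(String[] args) {
        List<Object> surroundings = Arrays.asList("a", 1, "b", "a", 2.5, "c", "b");
        System.out.println(collectDistinct(surroundings)); // prints [a, b, c]
    }
}
```

Because the source list is ordered, distinct() keeps the first occurrence of each element and the collected list follows the encounter order, even though the stream is parallel.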

Now, if you really must add to an existing list created outside of your control, which might already contain elements, you should consider the fact mentioned above: you have to ensure single-threaded access to a non-thread-safe list anyway, so there's no benefit in doing it from within the parallel stream at all. In most cases, it's more efficient to let the stream work independently of that list and add the result in a single-threaded step afterwards:

Set<ResourcePoint> newElements=
    body.getSurroundings().parallelStream()
        .filter(o -> o instanceof ResourcePoint)
        .map(o -> (ResourcePoint)o)
        .collect(Collectors.toCollection(LinkedHashSet::new));
newElements.removeAll(resourceMemory);
resourceMemory.addAll(newElements);

Here, we collect into a LinkedHashSet, which implies maintaining the encounter order and sorting out duplicates within the new elements. We then use removeAll on the new elements to remove those already present in the target list (here we benefit from the hash set nature of the temporary collection). Finally, the new elements are added to the target list, which, as explained, must happen single-threaded anyway for a target collection that isn't thread safe.

It's easy to add the newElements to another target collection with this solution, much easier than writing a custom collector for producing two lists during the stream processing. But note that the stream operations as written above are way too cheap to assume any benefit from parallel processing. You would need a very large number of elements to compensate for the initial multi-threading overhead. It's even possible that there is no number for which it ever pays off.
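Adding the new elements to a second list could look roughly like the following sketch. The names AddToTwoLists, addNew and otherList are hypothetical, String again stands in for ResourcePoint, and a sequential stream is used because the per-element work is so cheap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class AddToTwoLists {
    // Assumption: String stands in for ResourcePoint, surroundings stands in
    // for body.getSurroundings(), and otherList is the second target list.
    static void addNew(List<Object> surroundings,
                       List<String> resourceMemory,
                       List<String> otherList) {
        Set<String> newElements = surroundings.stream()
            .filter(o -> o instanceof String)
            .map(o -> (String) o)
            .collect(Collectors.toCollection(LinkedHashSet::new));
        newElements.removeAll(resourceMemory); // keep only elements not yet remembered
        resourceMemory.addAll(newElements);    // single-threaded add to the first list
        otherList.addAll(newElements);         // the same elements go to the second list
    }

    public static void main(String[] args) {
        List<String> resourceMemory = new LinkedList<>(Arrays.asList("a"));
        List<String> otherList = new ArrayList<>();
        addNew(Arrays.asList("a", "b", 7, "c", "b"), resourceMemory, otherList);
        System.out.println(resourceMemory); // prints [a, b, c]
        System.out.println(otherList);      // prints [b, c]
    }
}
```

Both target lists receive exactly the elements that were not yet in resourceMemory, in encounter order, and neither list is touched while the stream is running.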

Instead of

.forEach(resourceMemory::add)

you could invoke

.forEach(o -> {
   resourceMemory.add(o);
   otherResource.add(o);
 })

or put the add operations in a separate method so you can pass a method reference:

.forEach(this::add)

void add(ResourcePoint p) {
   resourceMemory.add(p);
   otherResource.add(p);
}

But bear in mind that the order of insertion may be different with each run, as you use a parallel stream.
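If the insertion order matters, one option is to combine the helper method with forEachOrdered, which performs the action in encounter order and one element at a time. The sketch below is an illustration under assumptions rather than part of the original answer: the class AddToBoth and the method remember are hypothetical, and String stands in for ResourcePoint.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AddToBoth {
    // Assumption: String stands in for ResourcePoint.
    final List<String> resourceMemory = new ArrayList<>();
    final List<String> otherResource = new ArrayList<>();

    // Adds one element to both lists.
    void add(String p) {
        resourceMemory.add(p);
        otherResource.add(p);
    }

    // surroundings stands in for body.getSurroundings().
    void remember(List<Object> surroundings) {
        surroundings.stream()
            .filter(o -> o instanceof String)
            .map(o -> (String) o)
            .forEachOrdered(this::add); // in encounter order, never concurrently
    }

    public static void main(String[] args) {
        AddToBoth demo = new AddToBoth();
        demo.remember(Arrays.asList("x", 1, "y"));
        System.out.println(demo.resourceMemory); // prints [x, y]
        System.out.println(demo.otherResource);  // prints [x, y]
    }
}
```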
