
Weird situation with an ArrayList and parallelStream

I am using a parallel stream because the task is really slow; the code is pasted below. The situation is this.

I have an ArrayList. I need to do something with each object in that list (this is slow) and add the object to a temporary list. The processing in the stream seems to finish fine, because I can see each object being processed in the logs.

When the stream ends, the temporary list sometimes has n-1 objects, or one of its elements is null.

Any idea?

With this sample code the errors do not happen, but the logic is the same as the real code, minus the business logic.

public class SampleCode {
    public List<SomeObject> example(List<SomeObject> someObjectList) {
        List<SomeObject> someObjectListTemp = new ArrayList<>();
        someObjectList.parallelStream().forEach(someObject -> {
            List<ExtraData> extraDataList = getExtraData(someObject.getId());
            if (extraDataList.isEmpty()) {
                someObjectListTemp.add(someObject);
            } else {
                for (ExtraData extraData : extraDataList) {
                    SomeObject someObjectTemp = null;
                    someObjectTemp = (SomeObject) cloneObject(someObject);
                    if (extraData != null) {
                        someObjectTemp.setDate(extraData.getDate());
                        someObjectTemp.setData2(extraData.getData2());
                    }
                    if (someObjectTemp == null) {
                        System.out.println("Warning null object"); //I NEVER see this
                    }
                    someObjectListTemp.add(someObjectTemp);
                    System.out.println("Added object to list"); // I always see this as many times as there are elements in the original list
                }
            }
        });

        if (someObjectListTemp.size() < 3) {
            System.out.println("Error: There should be at least 3 elements"); // Sometimes one object is missing from the list
        }

        for (SomeObject someObject : someObjectListTemp) {
            if (someObject == null) {
                System.out.println("Error: null element in list"); // Sometimes one object in the list is null
            }
        }

        return someObjectListTemp;
    }
}

Could you try using flatMap instead of forEach? flatMap maps each element to a stream and concatenates those streams into one, so here it turns a stream of lists into a single stream of their elements.

This way you do not need another ArrayList to store your temporary objects. I suspect that shared list is the issue: parallelStream runs the lambda on multiple threads, and ArrayList is not synchronized, so concurrent add calls can lose elements or leave null slots.

List<SomeObject> someObjectListTemp = someObjectList.parallelStream()
    .map(so -> processSomeObject(so)) // makes a stream of lists (Stream<List<SomeObject>>)
    .flatMap(Collection::stream) // groups all the elements of all the lists in one stream (Stream<SomeObject>)
    .collect(Collectors.toList()); // transforms the stream into a list (List<SomeObject>)

Then move your code into a separate method, processSomeObject, which returns a list of SomeObject:

static List<SomeObject> processSomeObject(SomeObject someObject) {
    List<ExtraData> extraDataList = getExtraData(someObject.getId());
    List<SomeObject> someObjectListTemp = new ArrayList<>();
    if (extraDataList.isEmpty()) {
        someObjectListTemp.add(someObject);
    } else {
        for (ExtraData extraData : extraDataList) {
            SomeObject someObjectTemp = (SomeObject) cloneObject(someObject);
            if (extraData != null) {
                someObjectTemp.setDate(extraData.getDate());
                someObjectTemp.setData2(extraData.getData2());
            }
            someObjectListTemp.add(someObjectTemp);
            System.out.println("Added object to list");
        }
    }

    return someObjectListTemp;
}
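
Since SomeObject, getExtraData and cloneObject are not shown in the question, here is a minimal runnable sketch of the same flatMap pattern with stand-ins. Item, lookupTags, expand and expandAll are hypothetical names invented for illustration, not part of the original code. The point it tries to show is that each call builds and returns its own local list, so no list is shared between threads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class FlatMapSketch {

    // Hypothetical stand-in for SomeObject.
    static final class Item {
        final int id;
        final String tag;

        Item(int id, String tag) {
            this.id = id;
            this.tag = tag;
        }
    }

    // Hypothetical stand-in for getExtraData:
    // even ids have two extra rows, odd ids have none.
    static List<String> lookupTags(int id) {
        return id % 2 == 0
                ? Arrays.asList("a", "b")
                : Collections.<String>emptyList();
    }

    // Per-element step: only touches a list that is local to this call.
    static List<Item> expand(Item item) {
        List<String> tags = lookupTags(item.id);
        if (tags.isEmpty()) {
            return Collections.singletonList(item);
        }
        List<Item> copies = new ArrayList<>();
        for (String tag : tags) {
            copies.add(new Item(item.id, tag));
        }
        return copies;
    }

    // The stream framework merges the per-thread results;
    // no user code writes to a shared collection.
    static List<Item> expandAll(List<Item> items) {
        return items.parallelStream()
                .flatMap(item -> expand(item).stream())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Item> input = Arrays.asList(
                new Item(1, null), new Item(2, null), new Item(3, null));
        System.out.println(expandAll(input).size()); // prints 4
    }
}
```

Because collect preserves encounter order for an ordered source such as a List, the result here comes back in input order even though the work runs in parallel.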

A small example that reproduces the problem:

public static void main(String[] args) {
    List<Object> test = new ArrayList<>();
    IntStream.range(0, 100000).parallel().forEach(i -> test.add(new Object()));
    for(Object o : test) {
        System.out.println(o.getClass());
    }
}

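It's because ArrayList is not thread-safe, so concurrent add calls corrupt its internal array. Two threads can write to the same slot or race with a resize, which shows up as missing elements, null entries, or an ArrayIndexOutOfBoundsException.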
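
For contrast, here is a sketch of two ways to fill a list from a parallel stream without that race. The class and method names (SafeParallelCollect, withCollect, withSynchronizedList) are made up for this illustration. The first lets the stream build the list through a collector; the second keeps the forEach shape but wraps the list with Collections.synchronizedList.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SafeParallelCollect {

    // The collector gives each worker its own partial list and merges
    // them afterwards, so no list is written by two threads at once.
    // Encounter order is preserved.
    static List<Integer> withCollect(int n) {
        return IntStream.range(0, n)
                .parallel()
                .boxed()
                .collect(Collectors.toList());
    }

    // Same forEach shape as the failing example, but every add goes
    // through a lock. All elements arrive; their order is unspecified.
    static List<Integer> withSynchronizedList(int n) {
        List<Integer> result = Collections.synchronizedList(new ArrayList<>());
        IntStream.range(0, n).parallel().forEach(i -> result.add(i));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(withCollect(100_000).size());          // prints 100000
        System.out.println(withSynchronizedList(100_000).size()); // prints 100000
    }
}
```

The collector version is usually the better fit when the result order matters or when lock contention on a single list would eat into the benefit of running in parallel.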
