
How large should my list of objects be to warrant the use of Java 8's parallelStream?

I have a list of objects from the database and I want to filter this list using the filter() method of the Stream class. New objects will be added to the database continuously, so the list could become very large, possibly thousands of objects. I want to use a parallelStream to speed up the filtering, but I was wondering roughly how large the list should be for parallelStream to be beneficial.

I've read this thread about it: Should I always use a parallel stream when possible? In that thread they agree that the dataset should be really large if you want any benefit from a parallel stream. But how large is large? Say I have 200 records stored in my database and I retrieve them all for filtering: is using a parallel stream justified in this case? If not, how large should the dataset be? 1000? 2000 perhaps? I'd love to know. Thank you.

According to this, and depending on the operation, it would require at least 10_000, but that number is not a count of elements; it is N * Q, where N = number of elements and Q = cost per element.
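As a rough illustration of the N * Q rule of thumb, here is a minimal sketch. The class name, the method name, and the idea of expressing Q as a small integer "cost" are assumptions made for this example; the 10_000 threshold is the figure quoted above, and it is only a starting point for measurement.

```java
public final class ParallelHeuristic {

    // Threshold from the rule of thumb above; a guideline, not a guarantee.
    static final long THRESHOLD = 10_000;

    /**
     * n = number of elements, q = estimated cost per element.
     * The unit of q is hypothetical here (roughly "simple operations per element").
     */
    static boolean worthTryingParallel(long n, long q) {
        return n * q >= THRESHOLD;
    }

    public static void main(String[] args) {
        // 200 records with a trivial predicate: 200 * 1 = 200
        System.out.println(worthTryingParallel(200, 1));   // false
        // 200 records with an expensive predicate: 200 * 100 = 20_000
        System.out.println(worthTryingParallel(200, 100)); // true
    }
}
```

So for 200 records, the answer depends on how expensive the filter predicate is, not on the record count alone.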

But this is only a general formula to check against. Without measuring, it is close to impossible to say (read: anything else is a guess); proper tests will prove you right or wrong.
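One way to measure is to time both variants on your own data. The sketch below is a crude timing loop, assuming a list of integers and an "is even" predicate as stand-ins for the real objects and filter; the class and method names are made up for this example. For trustworthy numbers a benchmark harness such as JMH is the better tool, since simple loops like this are easily distorted by JIT warm-up and garbage collection.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public final class FilterTiming {

    static <T> List<T> filterSequential(List<T> src, Predicate<T> p) {
        return src.stream().filter(p).collect(Collectors.toList());
    }

    static <T> List<T> filterParallel(List<T> src, Predicate<T> p) {
        return src.parallelStream().filter(p).collect(Collectors.toList());
    }

    /** Runs the task 'rounds' times as warm-up, then returns the average nanoseconds over 'rounds' more runs. */
    static long averageNanos(Runnable task, int rounds) {
        for (int i = 0; i < rounds; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / rounds;
    }

    public static void main(String[] args) {
        // Placeholder data: 200 integers, matching the size asked about in the question.
        List<Integer> data = IntStream.range(0, 200).boxed().collect(Collectors.toList());
        Predicate<Integer> isEven = x -> x % 2 == 0;

        long seq = averageNanos(() -> filterSequential(data, isEven), 1_000);
        long par = averageNanos(() -> filterParallel(data, isEven), 1_000);

        System.out.println("sequential avg ns: " + seq);
        System.out.println("parallel   avg ns: " + par);
    }
}
```

Swap in the real list and predicate, then vary the list size to see where (or whether) the parallel version starts to win on your hardware.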

For some simple operations, you will almost never actually need parallel processing for the sake of a speed-up.

Another thing to mention here is that this heavily depends on the source, specifically on how easy it is to split. Anything array-based or index-based is easy (and fast) to split, but a Queue or the lines of a File are not, so you will probably lose more time splitting than computing, unless, of course, there are enough elements to make up for it. And "enough" is something you actually have to measure.
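A quick way to see the difference between sources is to inspect the characteristics their spliterators report. The sketch below is only an illustration, with a made-up class name and helper method: an ArrayList reports SIZED and SUBSIZED, meaning it can be cut into halves of known size, while a ConcurrentLinkedQueue and the stream returned by BufferedReader.lines() do not report a known size.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.Spliterator;
import java.util.concurrent.ConcurrentLinkedQueue;

public final class SplitCheck {

    /** True if the source knows its size and the size of every piece it splits into. */
    static boolean knowsExactSplitSizes(Spliterator<?> s) {
        return s.hasCharacteristics(Spliterator.SIZED | Spliterator.SUBSIZED);
    }

    public static void main(String[] args) {
        List<Integer> arrayBased = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        Queue<Integer> queue = new ConcurrentLinkedQueue<>(arrayBased);
        BufferedReader reader = new BufferedReader(new StringReader("a\nb\nc"));

        System.out.println("ArrayList:             " + knowsExactSplitSizes(arrayBased.spliterator()));
        System.out.println("ConcurrentLinkedQueue: " + knowsExactSplitSizes(queue.spliterator()));
        System.out.println("BufferedReader.lines:  " + knowsExactSplitSizes(reader.lines().spliterator()));
    }
}
```

These flags say nothing about actual speed; they only hint at how cheaply the stream framework can divide the work, which is why measuring remains the deciding step.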

From "Modern Java in Action": "Although it may seem odd at first, often the fastest way to filter a collection...is to convert it to a stream, process it in parallel, and then convert it back to a list"

