Is it faster to use Predicates to filter a ConcurrentMap or a List, using parallelStream?
I have multiple FileMap objects stored in a List<FileMap>, currently about 500,000 objects.
I am using Predicates to filter the List with parallelStream. While reading the documentation I noticed a collector called Collectors.toConcurrentMap(). I am familiar with ConcurrentHashMap and know it is faster because multiple threads divide the map.
Will changing the simple ArrayList to a map built with toConcurrentMap, and then using Predicates with parallelStream, work faster? Currently, filtering that List with parallelStream and with a serial stream runs at the same speed.
A Map is a collection of key-value pairs, where keys are unique. The data you have is not a map, but a list. Converting it raises several problems:
ConcurrentMap has extra complexity to ensure thread safety. Although this is done in smarter ways than just making all methods synchronized, it still affects performance. Filtering the elements of a list can be heavily (and easily) parallelized. With n cores, where n is the length of the list, the running time can in theory be as low as O(log n). This of course requires specialized parallel algorithms and graphics cards instead of a CPU, since GPU cores, although individually less powerful, number in the thousands.
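The difference between the two approaches can be sketched as follows. This is only an illustration under assumptions: the question does not show the fields of FileMap, so the class below, its name and size fields, and the sample data are hypothetical. It shows that the map route needs a unique key per element and does the same filtering work after an extra build step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class ListVsMapFilter {

    // Hypothetical stand-in for the FileMap class from the question; its real fields are unknown.
    static final class FileMap {
        final String name;
        final long size;

        FileMap(String name, long size) {
            this.name = name;
            this.size = size;
        }
    }

    // Filter the list directly: no intermediate structure is built.
    static List<FileMap> filterList(List<FileMap> files, Predicate<FileMap> predicate) {
        return files.parallelStream().filter(predicate).collect(Collectors.toList());
    }

    // Build a ConcurrentMap first, then filter its values.
    // A key must be invented for every element (here: the name), and
    // toConcurrentMap throws IllegalStateException if two elements share a key.
    static List<FileMap> filterViaMap(List<FileMap> files, Predicate<FileMap> predicate) {
        ConcurrentMap<String, FileMap> byName = files.parallelStream()
                .collect(Collectors.toConcurrentMap(f -> f.name, f -> f));
        return byName.values().parallelStream().filter(predicate).collect(Collectors.toList());
    }

    // Sample data with unique names: "file-0" .. "file-(n-1)", size equal to the index.
    static List<FileMap> sample(int n) {
        List<FileMap> files = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            files.add(new FileMap("file-" + i, i));
        }
        return files;
    }

    public static void main(String[] args) {
        List<FileMap> files = sample(1000);

        // Predicates can be composed before being handed to filter().
        Predicate<FileMap> large = f -> f.size >= 500;
        Predicate<FileMap> evenSize = f -> f.size % 2 == 0;
        Predicate<FileMap> both = large.and(evenSize);

        System.out.println("list: " + filterList(files, both).size());
        System.out.println("map:  " + filterViaMap(files, both).size());
    }
}
```

Both methods return the same elements (the map version in no particular order); the map version simply pays for building and storing the map before it can start filtering.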
I ran a few tests on a list of 100 million integers. Processing it sequentially took about 700 ms and using a parallel stream about 350 ms (I guess Java used only 2 threads), while trying to convert the list into a ConcurrentMap threw an out-of-memory error after a few minutes.
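A comparison of this kind can be sketched roughly as below. This is not the code used for the numbers above; the list size, the predicate and the crude warm-up loop are assumptions chosen to keep the example small, and a proper benchmark harness such as JMH would give more trustworthy timings.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class FilterTiming {

    // Builds a list of the boxed integers 0 .. n-1.
    static List<Integer> numbers(int n) {
        return IntStream.range(0, n).boxed().collect(Collectors.toList());
    }

    static long countSequential(List<Integer> data, Predicate<Integer> predicate) {
        return data.stream().filter(predicate).count();
    }

    static long countParallel(List<Integer> data, Predicate<Integer> predicate) {
        return data.parallelStream().filter(predicate).count();
    }

    public static void main(String[] args) {
        // Assumed size: much smaller than the 100 million above, to fit a default heap.
        List<Integer> data = numbers(2_000_000);
        Predicate<Integer> divisibleBy3 = x -> x % 3 == 0;

        // Crude warm-up so the JIT has compiled both paths before timing.
        for (int i = 0; i < 5; i++) {
            countSequential(data, divisibleBy3);
            countParallel(data, divisibleBy3);
        }

        long t0 = System.nanoTime();
        long sequential = countSequential(data, divisibleBy3);
        long t1 = System.nanoTime();
        long parallel = countParallel(data, divisibleBy3);
        long t2 = System.nanoTime();

        System.out.println("sequential: " + sequential + " matches, " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("parallel:   " + parallel + " matches, " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

With a predicate this cheap, the cost of splitting the work and merging the results can be comparable to the filtering itself, which is one possible reason for seeing little difference between the two variants.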
You mentioned that switching between stream() and parallelStream() didn't change the performance. I would recommend investigating how Java chooses the number of threads for a parallel stream (and how to change it). This is also affected by your resources: running more CPU-bound threads than your CPU has cores will decrease performance due to context switching. I would advise using only as many threads as you have cores, or one fewer, so that one core remains available for all other OS work.
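As a starting point for that investigation, here is a minimal sketch. Parallel streams normally run on the common ForkJoinPool, whose size can be set with the documented system property java.util.concurrent.ForkJoinPool.common.parallelism (for example as a -D option at JVM start-up). Running the stream from inside a dedicated ForkJoinPool, as countEvenIn does below, is a widely used trick that relies on implementation behavior rather than a documented guarantee, so treat it as an assumption. The class and method names are illustrative.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class StreamParallelism {

    // Target parallelism of the common pool, which parallel streams use by default.
    static int commonPoolParallelism() {
        return ForkJoinPool.commonPool().getParallelism();
    }

    // Runs a parallel stream as a task of the given pool, so that (in current
    // JDK implementations, not by specification) the pool's workers do the work.
    static long countEvenIn(ForkJoinPool pool, List<Integer> data) {
        Callable<Long> task = () -> data.parallelStream().filter(x -> x % 2 == 0).count();
        return pool.submit(task).join();
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("available processors:    " + cores);
        System.out.println("common pool parallelism: " + commonPoolParallelism());

        List<Integer> data = IntStream.range(0, 1_000_000).boxed().collect(Collectors.toList());

        // One thread fewer than the number of cores, as suggested above (at least one).
        ForkJoinPool pool = new ForkJoinPool(Math.max(1, cores - 1));
        try {
            System.out.println("even numbers: " + countEvenIn(pool, data));
        } finally {
            pool.shutdown();
        }
    }
}
```

If the reported common pool parallelism is 1, a parallel stream has very little room to outperform a sequential one on that machine.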