

Is it faster to use Predicates to filter a Concurrent Map or a List, using parallelStream?

I have multiple FileMap objects stored in a List<FileMap>, currently about 500,000 of them.

I am using Predicates to filter the List with parallelStream. I am now reading the documentation and see there is a function called Collectors.toConcurrentMap(). I am familiar with ConcurrentHashMap and know it is faster because multiple threads divide the map.

Would converting the simple ArrayList with toConcurrentMap and then using Predicates with parallelStream work faster? Currently, whether I use parallelStream or a sequential stream on that List, it runs at the same speed.

A Map is a collection of key-value pairs, where keys are unique. The data you have is not a map, but a list. There are a lot of problems:

  1. Transforming a list into a map requires you to provide key and value mapping functions.
  2. You will end up with a bigger structure than you had originally.
  3. You will have to ensure that the key mapping function returns unique values (without a merge function, Collectors.toConcurrentMap throws an IllegalStateException on duplicate keys), which gets in the way of parallelization (you can use synchronization, but it will greatly decrease performance).
  4. A map is a more complex structure than a list (which is effectively an array), and constructing it takes much more time.
  5. ConcurrentMap has extra complexity to ensure thread safety. Although this is done in smarter ways than just making all methods synchronized, it still affects performance.
  6. Iterating over the map has little to do with how the data is stored: you will need to get the collection of values anyway.
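
To make the points above concrete, here is a minimal sketch of both routes. It assumes a list of Integer values rather than the asker's FileMap type (whose fields are not shown), and it uses the list index as a hypothetical unique key, since the data has no natural one. The class and method names are made up for illustration.

```java
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ListVersusMap {

    // Route 1: filter the list directly. No key function, no extra structure.
    static List<Integer> filterList(List<Integer> items, Predicate<Integer> predicate) {
        return items.parallelStream()
                .filter(predicate)
                .collect(Collectors.toList());
    }

    // Route 2: build a ConcurrentMap first, then filter its values.
    // The list index serves as a (hypothetical) unique key.
    static List<Integer> filterViaMap(List<Integer> items, Predicate<Integer> predicate) {
        ConcurrentMap<Integer, Integer> map = IntStream.range(0, items.size())
                .parallel()
                .boxed()
                .collect(Collectors.toConcurrentMap(i -> i, i -> items.get(i)));
        // The map still has to hand back its values before they can be filtered.
        return map.values().parallelStream()
                .filter(predicate)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> items = IntStream.range(0, 1_000).boxed().collect(Collectors.toList());
        Predicate<Integer> isEven = n -> n % 2 == 0;
        System.out.println(filterList(items, isEven).size());
        System.out.println(filterViaMap(items, isEven).size());
    }
}
```

Both routes keep the same elements, but the second one allocates a whole map only to read its values back out, and the order of its result is not guaranteed to match the original list.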

Filtering the elements of a list can be heavily (and easily) parallelized. With n cores, where n is the length of the list, you can achieve a running time as good as O(log n). This of course requires specialized parallel algorithms and graphics cards instead of a CPU, since GPU cores, although individually less powerful, number in the thousands.
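
As a rough illustration of why filtering parallelizes so easily, the sketch below splits the list in half recursively, filters each half independently, and concatenates the results. It is written by hand with the fork/join framework for clarity; it is not a claim about how parallelStream is implemented internally, and the THRESHOLD value is an arbitrary assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ForkJoinFilter {

    // Hypothetical cutoff: chunks at or below this size are filtered sequentially.
    static final int THRESHOLD = 10_000;

    static class FilterTask<T> extends RecursiveTask<List<T>> {
        private final List<T> items;
        private final Predicate<T> predicate;

        FilterTask(List<T> items, Predicate<T> predicate) {
            this.items = items;
            this.predicate = predicate;
        }

        @Override
        protected List<T> compute() {
            if (items.size() <= THRESHOLD) {
                // Small chunk: plain sequential filter.
                List<T> kept = new ArrayList<>();
                for (T item : items) {
                    if (predicate.test(item)) {
                        kept.add(item);
                    }
                }
                return kept;
            }
            // Large chunk: split in two; the halves do not depend on each other.
            int mid = items.size() / 2;
            FilterTask<T> left = new FilterTask<>(items.subList(0, mid), predicate);
            FilterTask<T> right = new FilterTask<>(items.subList(mid, items.size()), predicate);
            left.fork();                              // run the left half asynchronously
            List<T> rightResult = right.compute();    // run the right half in this thread
            List<T> result = new ArrayList<>(left.join());
            result.addAll(rightResult);               // left before right keeps list order
            return result;
        }
    }

    static <T> List<T> parallelFilter(List<T> items, Predicate<T> predicate) {
        return ForkJoinPool.commonPool().invoke(new FilterTask<>(items, predicate));
    }

    public static void main(String[] args) {
        List<Integer> items = IntStream.range(0, 100_000).boxed().collect(Collectors.toList());
        System.out.println(parallelFilter(items, n -> n % 2 == 0).size());
    }
}
```

Each element is tested independently of every other element, so the only coordination needed is joining the partial results.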

I ran a few tests on a list of 100 million integers. Processing it sequentially took about 700ms and using a parallel stream took about 350ms (I guess Java used only 2 threads), while trying to convert the list into a ConcurrentMap threw an out of memory error after a few minutes.
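
A comparison of this kind can be sketched as follows. This is not the code used for the numbers above; the list size (5 million, chosen to fit in a default heap), the predicate, and all names are assumptions. Timings from System.nanoTime are only indicative, and a harness such as JMH would give more trustworthy results.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class FilterTiming {

    // Counts the elements matching the predicate, sequentially or in parallel.
    static long countMatching(List<Integer> items, Predicate<Integer> predicate, boolean parallel) {
        Stream<Integer> stream = parallel ? items.parallelStream() : items.stream();
        return stream.filter(predicate).count();
    }

    // Wall-clock time of one run, in milliseconds.
    static long elapsedMillis(Runnable work) {
        long start = System.nanoTime();
        work.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        List<Integer> items = IntStream.range(0, 5_000_000).boxed().collect(Collectors.toList());
        Predicate<Integer> divisibleBySeven = n -> n % 7 == 0;
        // Repeat a few times so that JIT warm-up does not dominate the first reading.
        for (int round = 0; round < 5; round++) {
            long sequential = elapsedMillis(() -> countMatching(items, divisibleBySeven, false));
            long parallel = elapsedMillis(() -> countMatching(items, divisibleBySeven, true));
            System.out.println("sequential " + sequential + " ms, parallel " + parallel + " ms");
        }
    }
}
```

With a very cheap predicate, the cost of splitting and merging can cancel out the gain from extra threads, which is one possible reason for seeing no difference between the two modes.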

You mentioned that switching between stream() and parallelStream() didn't change the performance. I would recommend investigating how Java chooses how many threads to use in a parallel stream (and how to change that). This is also affected by your resources: running more CPU-bound threads than your CPU has cores will decrease performance due to context switching. I would advise using only as many threads as you have cores, or one fewer, so that one core remains available for all other OS work.
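
As a starting point for that investigation, the sketch below prints the number of cores the JVM sees and the parallelism of the common ForkJoinPool, which parallel streams use by default. It then runs a parallel stream from inside a dedicated pool sized to one thread fewer than the core count. Note the assumption: a parallel stream started from a ForkJoinPool task runs in that pool in current JDKs, but this is observed behavior rather than something the stream API documents. The class and method names are illustrative only.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class StreamParallelism {

    static int cores() {
        return Runtime.getRuntime().availableProcessors();
    }

    static int commonPoolParallelism() {
        return ForkJoinPool.commonPool().getParallelism();
    }

    // Runs the parallel filter inside a dedicated pool with the given thread count.
    // Assumption: the stream executes in the pool it was submitted to (undocumented).
    static long countEvensWithThreads(List<Integer> items, int threads) {
        ForkJoinPool pool = new ForkJoinPool(threads);
        try {
            Callable<Long> task = () -> items.parallelStream().filter(n -> n % 2 == 0).count();
            return pool.submit(task).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("parallel filter failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("cores: " + cores());
        System.out.println("common pool parallelism: " + commonPoolParallelism());

        List<Integer> items = IntStream.range(0, 1_000_000).boxed().collect(Collectors.toList());
        int threads = Math.max(1, cores() - 1); // leave one core for other work
        System.out.println("evens: " + countEvensWithThreads(items, threads));
    }
}
```

The common pool's size can also be set for the whole JVM with the system property java.util.concurrent.ForkJoinPool.common.parallelism, for example by passing -Djava.util.concurrent.ForkJoinPool.common.parallelism=3 on the command line.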

