Java parallelStream does not use expected number of threads

Java 8 parallelStream seems to use more threads than the number specified by the system property java.util.concurrent.ForkJoinPool.common.parallelism. These unit tests show that tasks are processed with the desired number of threads when I use my own ForkJoinPool, but with parallelStream the number of threads is higher than expected.

import org.junit.Test;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

import static org.junit.Assert.assertTrue;

public class ParallelStreamTest {

    private static final int TOTAL_TASKS = 1000;

    @Test
    public void testParallelStreamWithParallelism1() throws InterruptedException {
        final Integer maxThreads = 1;
        System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism", maxThreads.toString());
        List<Integer> objects = new ArrayList<>();
        for (int i = 0; i < TOTAL_TASKS; i++) {
            objects.add(i);
        }

        final AtomicInteger concurrentThreads = new AtomicInteger(0);
        final AtomicInteger taskCount = new AtomicInteger(0);

        objects.parallelStream().forEach(i -> {
            processTask(concurrentThreads, maxThreads); // expected to be called one at a time
            taskCount.addAndGet(1);
        });

        assertTrue(taskCount.get() == TOTAL_TASKS);
    }

    @Test
    public void testMyOwnForkJoinPoolWithParallelism1() throws InterruptedException {
        final Integer threads = 1;
        List<Integer> objects = new ArrayList<>();
        for (int i = 0; i < TOTAL_TASKS; i++) {
            objects.add(i);
        }

        ForkJoinPool forkJoinPool = new ForkJoinPool(1);
        final AtomicInteger concurrentThreads = new AtomicInteger(0);
        final AtomicInteger taskCount = new AtomicInteger(0);

        forkJoinPool.submit(() -> objects.parallelStream().forEach(i -> {
            processTask(concurrentThreads, threads); // expected to be called one at a time
            taskCount.addAndGet(1);
        }));
        forkJoinPool.shutdown();
        forkJoinPool.awaitTermination(1, TimeUnit.MINUTES);

        assertTrue(taskCount.get() == TOTAL_TASKS);
    }

    /**
     * It simply processes a task increasing first the concurrentThreads count
     *
     * @param concurrentThreads Counter for threads processing tasks
     * @param maxThreads Maximum number of threads that are expected to be used for processing tasks
     */
    private void processTask(AtomicInteger concurrentThreads, int maxThreads) {
        int currentConcurrentThreads = concurrentThreads.addAndGet(1);
        if (currentConcurrentThreads > maxThreads) {
            throw new IllegalStateException("There should be no more than " + maxThreads + " concurrent thread(s) but found " + currentConcurrentThreads);
        }

        // actual processing would go here

        concurrentThreads.decrementAndGet();
    }
}

Only one thread should be used for processing tasks, since the ForkJoinPool has parallelism=1 and java.util.concurrent.ForkJoinPool.common.parallelism=1. Both tests should therefore pass, but testParallelStreamWithParallelism1 fails with:

java.lang.IllegalStateException: There should be no more than 1 concurrent thread(s) but found 2

It seems that setting java.util.concurrent.ForkJoinPool.common.parallelism=1 does not work as expected: more than one task is processed at the same time.

Any ideas?

The parallelism setting of the Fork/Join pool determines the number of pool worker threads, but since the caller thread, e.g. the main thread, works on the jobs too, there is always one more thread when using the common pool. That's why the default setting of the common pool is “number of cores minus one”: the actual number of working threads then equals the number of cores.

With your custom Fork/Join pool, the caller thread of the stream operation is already a worker thread of the pool, so using it to process jobs doesn't increase the actual number of working threads.
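To make the difference visible, here is a minimal sketch that runs a parallel stream from inside a dedicated pool and records which threads handled the elements. The class and method names (CustomPoolThreads, threadNamesInCustomPool) are made up for illustration. On the JDKs where streams behave as described above, the submitting thread should not show up in the result, but this relies on unspecified behavior.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

class CustomPoolThreads {

    // Runs a parallel stream from inside a dedicated pool and returns the
    // names of all threads that processed at least one element.
    static Set<String> threadNamesInCustomPool(int parallelism, int tasks)
            throws InterruptedException, ExecutionException {
        Set<String> names = ConcurrentHashMap.newKeySet();
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            // get() blocks the submitting thread until the stream has finished
            pool.submit(() -> IntStream.range(0, tasks).parallel()
                    .forEach(i -> names.add(Thread.currentThread().getName())))
                .get();
        } finally {
            pool.shutdown();
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Prints the set of worker thread names that took part
        System.out.println(threadNamesInCustomPool(1, 1000));
    }
}
```

Waiting on the returned task with get() also avoids the fixed timeout of awaitTermination and surfaces any exception thrown inside the stream.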

It must be emphasized that the interaction between the Stream implementation and the Fork/Join pool is entirely unspecified, as the fact that streams use the Fork/Join framework under the hood is an implementation detail. There is no guarantee that changing the default pool's properties has any effect on streams, nor that calling stream operations from within a custom Fork/Join pool's task will use that custom pool.

Set this parameter as well:

    System.setProperty("java.util.concurrent.ForkJoinPool.common.maximumSpares", "0");

This worked for me. Apparently (although it is not very well documented), 'spare' threads are allowed to pick up work from the default ForkJoinPool.
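One thing to watch: the common pool is constructed once, statically, and reads its system properties at that point. Setting them inside a test method can therefore be too late if anything has already touched the pool. The sketch below sets both properties first thing in main; the class and method names are hypothetical, and passing the same keys as -D options on the java command line is the safer alternative. The maximumSpares property is documented from Java 9 on, so whether a given Java 8 build honors it is an assumption to verify.

```java
import java.util.concurrent.ForkJoinPool;

class CommonPoolConfig {

    // Sets the common-pool properties. This only has an effect if it runs
    // before the common pool is initialized in this JVM.
    static void configureCommonPool(int parallelism, int maximumSpares) {
        System.setProperty(
                "java.util.concurrent.ForkJoinPool.common.parallelism",
                Integer.toString(parallelism));
        System.setProperty(
                "java.util.concurrent.ForkJoinPool.common.maximumSpares",
                Integer.toString(maximumSpares));
    }

    public static void main(String[] args) {
        // Configure first, before any stream or ForkJoinPool call
        configureCommonPool(1, 0);

        // Reports the parallelism the common pool actually ended up with
        System.out.println("common pool parallelism: "
                + ForkJoinPool.getCommonPoolParallelism());
    }
}
```

If the printed parallelism differs from the configured value, the pool was already initialized when the properties were set.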

Run this example:

    IntStream.rangeClosed(0, 9).parallel().forEach(i -> {
        // Print the name of the thread that handles this element
        System.out.println("id - " + Thread.currentThread().getName());
    });

With the parameter java.util.concurrent.ForkJoinPool.common.parallelism=1 you will see something like:

id - main
id - main
id - ForkJoinPool.commonPool-worker-1
id - main
id - ForkJoinPool.commonPool-worker-1
id - main
id - ForkJoinPool.commonPool-worker-1
id - main
id - ForkJoinPool.commonPool-worker-1
id - ForkJoinPool.commonPool-worker-1

As you now know, streams use the common ForkJoinPool (with parallelism=1) and, in addition, the current thread.
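For larger inputs, a per-thread summary is easier to read than one line per element. A minimal sketch, with hypothetical names (ThreadUsage, elementsPerThread), that groups the elements by the thread that handled them:

```java
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ThreadUsage {

    // Groups the processed elements by the name of the thread that handled
    // them and counts how many each thread took.
    static Map<String, Long> elementsPerThread(int tasks) {
        return IntStream.range(0, tasks).parallel().boxed()
                .collect(Collectors.groupingBy(
                        i -> Thread.currentThread().getName(),
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        elementsPerThread(1000).forEach((name, count) ->
                System.out.println(name + " handled " + count + " element(s)"));
    }
}
```

Which thread names appear, and how the work is split between them, depends on the pool configuration and the JDK; only the total count is fixed.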

You deleted the correct answer from your first posting of this question, so I'll expound and expand on it. Your problem is here: int currentConcurrentThreads = concurrentThreads.addAndGet(1); and here:

objects.parallelStream().forEach(i -> {
  processTask(concurrentThreads, maxThreads); // expected to be called one at a time
  taskCount.addAndGet(1);
});

Each thread in the parallel stream invokes processTask. Each therefore increments concurrentThreads (but for some reason not with incrementAndGet: https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/atomic/AtomicInteger.html#incrementAndGet-- ). Since they run in parallel, they all increment concurrentThreads before any can decrement it. So of course you exceed the number of threads you expect.
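If the goal is only to measure how many tasks overlap, one option is to record the peak instead of throwing from inside the task. In the question's processTask the decrement is skipped once the exception is thrown, so the counter stays raised. The following is a sketch with a hypothetical helper class, ConcurrencyProbe, not part of the original code:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

class ConcurrencyProbe {
    private final AtomicInteger active = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();

    // Wraps one unit of work and records how many units were running at once.
    void run(Runnable work) {
        int now = active.incrementAndGet();
        peak.accumulateAndGet(now, Math::max);
        try {
            work.run();
        } finally {
            active.decrementAndGet(); // always undo the increment
        }
    }

    // Highest number of overlapping units seen so far.
    int peak() {
        return peak.get();
    }

    public static void main(String[] args) {
        ConcurrencyProbe probe = new ConcurrencyProbe();
        IntStream.range(0, 1000).parallel().forEach(i -> probe.run(() -> { }));
        System.out.println("peak concurrent tasks: " + probe.peak());
    }
}
```

A test can then assert on probe.peak() after the stream has finished, which keeps the failure out of the worker threads.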
