
When does parallel processing overcome sequential processing?

    // parallel processing

    int processors = Runtime.getRuntime().availableProcessors();
    ExecutorService executorService = Executors.newFixedThreadPool(processors);


    final List<String> albumIds2 = new ArrayList<String>();
    long start2 = System.nanoTime();
    for (final HColumn<String, String> column : result.get().getColumns()) {
        Runnable worker = new Runnable() {

            @Override
            public void run() {
                albumIds2.add(column.getName());
            }
        };
        executorService.execute(worker);
    }
    long timeTaken2 = System.nanoTime() - start2;

I have code like the example above, which builds a List<String> of album IDs. The columns are a slice from a Cassandra database. I record the time taken to create the whole list of albums.

I have done the same using the enhanced for loop, as below.

    QueryResult<ColumnSlice<String, String>> result = CassandraDAO.getRowColumns(AlbumIds_CF, customerId);
    long start = System.nanoTime();
    for (HColumn<String, String> column : result.get().getColumns()) {
        albumIds.add(column.getName());
    }
    long timeTaken = System.nanoTime() - start;

I notice that no matter how large the number of albums is, the for-each loop always takes less time to complete. Am I doing it wrong, or do I need a computer with multiple cores? I am really new to the whole concept of parallel computing, so please pardon me if my question is stupid.

In your parallel example, you are submitting one task for each column. The overhead of enqueuing a task is probably greater than the benefit of parallel execution. This is exacerbated by the task itself being very fast (it appends a single element to a list and returns). Indeed, the Executor adds each received task to a queue, and that addition has a cost of its own. So you are adding N tasks to a queue, and each task then adds one element to the list. Only the latter part runs concurrently.

If the task were more complex, you could submit the work in "chunks" (for instance, if you have N elements and P processors, each chunk would have N/P or N/P+1 elements). That strategy helps reduce the overhead.
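A minimal sketch of the chunking idea, under stated assumptions: the class `ChunkedCollect` and the method `collectInChunks` are hypothetical names, and a plain `List<String>` stands in for the Cassandra column slice. Each task copies its own chunk into a private list, so the tasks share no mutable state, and the partial results are merged on the calling thread. For work as trivial as copying names, the sequential loop will most likely still be faster; chunking only pays off when the per-element work is expensive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ChunkedCollect {

    // Hypothetical helper: submits about P tasks (one per chunk) instead of one task per element.
    static List<String> collectInChunks(List<String> names, int processors) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(processors);
        try {
            // ceil(N / P), but never 0 so the loop below always advances
            int chunkSize = Math.max(1, (names.size() + processors - 1) / processors);
            List<Callable<List<String>>> tasks = new ArrayList<>();
            for (int from = 0; from < names.size(); from += chunkSize) {
                final List<String> chunk =
                        names.subList(from, Math.min(from + chunkSize, names.size()));
                // Each task builds its own private list; nothing is shared between tasks.
                tasks.add(() -> new ArrayList<>(chunk));
            }
            List<String> result = new ArrayList<>(names.size());
            // invokeAll blocks until all tasks complete and returns futures in submission order.
            for (Future<List<String>> future : pool.invokeAll(tasks)) {
                result.addAll(future.get());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            names.add("album-" + i); // stand-in for column.getName()
        }
        int processors = Runtime.getRuntime().availableProcessors();
        List<String> ids = collectInChunks(names, processors);
        System.out.println("collected " + ids.size() + " ids");
    }
}
```

Because `invokeAll` waits for every task, any timing taken around the call covers the actual work and not just the submission of tasks.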

Note also that ArrayList is not synchronized, so the concurrent execution of several tasks may corrupt your list. You could use a concurrent collection to avoid this issue, but the first observation still holds.
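One possible way to make the per-element version safe is sketched below, using `Collections.synchronizedList` as the thread-safe collection. The class `SynchronizedCollect` and its `collect` method are hypothetical, and a plain `List<String>` again stands in for the column slice. The sketch also waits for the pool to terminate before returning, a step the question's code does not have, so that the list is complete when it is read.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SynchronizedCollect {

    // Hypothetical helper: one task per element, as in the question, but with a thread-safe list.
    static List<String> collect(List<String> names, int threads) throws InterruptedException {
        // The wrapper locks on every add(): safe, but all threads now contend for one lock.
        final List<String> ids = Collections.synchronizedList(new ArrayList<String>());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (final String name : names) {
            pool.execute(new Runnable() {
                @Override
                public void run() {
                    ids.add(name);
                }
            });
        }
        pool.shutdown(); // accept no new tasks; tasks already queued still run
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("tasks did not finish in time");
        }
        return ids;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> names = new ArrayList<String>();
        for (int i = 0; i < 1000; i++) {
            names.add("album-" + i); // stand-in for column.getName()
        }
        List<String> ids = collect(names, Runtime.getRuntime().availableProcessors());
        System.out.println("collected " + ids.size() + " ids");
    }
}
```

The resulting list holds every name, but the order depends on how the threads are scheduled. The per-task queuing overhead is unchanged, so this version fixes correctness only, not speed.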

This is bad practice. Creating the threads costs far more time and CPU than the work each one actually does: albumIds2.add(column.getName());

