
How do streams / fork-join access arrays thread-safely?

Streams and fork-join both provide functionality to parallelize code that accesses arrays. For example, Arrays.parallelSetAll is implemented largely by the following line:

IntStream.range(0, array.length).parallel()
    .forEach(i -> { array[i] = generator.applyAsLong(i); });

Also, the documentation of RecursiveAction, part of the fork-join framework, contains the following example:

static class SortTask extends RecursiveAction {
    final long[] array; final int lo, hi;
    ...
    void merge(int lo, int mid, int hi) {
        long[] buf = Arrays.copyOfRange(array, lo, mid);
        for (int i = 0, j = lo, k = mid; i < buf.length; j++)
            array[j] = (k == hi || buf[i] < array[k]) ?
                buf[i++] : array[k++];
    }
}

Finally, parallel streams created from arrays access the arrays in multiple threads (the code is too complex to summarize here).

All of these examples appear to read from or write to arrays without any synchronization or other memory barriers (as far as I can tell). As we know, completely ad hoc multithreaded array accesses are unsafe as there is no guarantee that a read reflects a write in another thread unless there is a happens-before relationship between the read and the write. In fact, the Atomic...Array classes were created specifically to address this issue. However, given that each example above is in the standard library or its documentation, I presume they're correct.

Can someone please explain what mechanism guarantees the safety of the array accesses in these examples?

Short answer: partitioning.

The JMM is defined in terms of access to variables. Variables include static fields, instance fields, and array elements. If you arrange your program such that thread T0 is the only thread to access element 0 of an array, and similarly T1 is the only thread to access element 1 of an array, then each of these elements is effectively thread-confined, and you have no problem -- the JMM program order rule takes care of you.
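
As a minimal sketch of this confinement idea (not code from the JDK; the class name PartitionedFill and the method fill are made up for illustration), two threads can write disjoint halves of one array with no locks or volatile fields. Within each thread, program order covers its own elements. The only cross-thread edges needed are the ones Thread.start() and Thread.join() already provide: start() happens-before the first action of the started thread, and everything a thread does happens-before another thread's successful return from join() on it.

```java
public class PartitionedFill {
    // Hypothetical helper: fills data[i] = i * i using two threads that
    // never touch the same index.
    static long[] fill(int n) {
        long[] data = new long[n];
        int mid = n / 2;
        // t0 owns indices [0, mid), t1 owns indices [mid, n).
        Thread t0 = new Thread(() -> {
            for (int i = 0; i < mid; i++) data[i] = (long) i * i;
        });
        Thread t1 = new Thread(() -> {
            for (int i = mid; i < n; i++) data[i] = (long) i * i;
        });
        t0.start();
        t1.start();
        try {
            // join() is what makes the workers' writes visible to the caller.
            t0.join();
            t1.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return data;
    }

    public static void main(String[] args) {
        long[] data = fill(1000);
        System.out.println(data[999]); // prints 998001
    }
}
```

If the two ranges overlapped, or if the caller read the array before join() returned, the guarantee would no longer hold.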

Parallel streams build on this principle. Each task is working on a segment of the array that no other task is working on. Then all we have to do is ensure that the thread running a task can see the initial state of the array, and the consumer of the final result can see the as-modified-by-the-task view of the appropriate section of the array. These are easily arranged through synchronization actions embedded in the implementation of the parallel stream and fork-join libraries.
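
The same pattern can be sketched with the fork-join framework directly. The following is an illustration under stated assumptions, not the actual implementation of parallel streams: SegmentFill, FillTask, and the THRESHOLD value are hypothetical names and numbers chosen for the example. Each task either fills its own [lo, hi) segment or splits it into two non-overlapping halves; the code itself contains no explicit synchronization and relies on the visibility guarantees that submitting a task and joining it are documented to provide.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class SegmentFill {
    // Hypothetical cutoff below which a task works sequentially.
    static final int THRESHOLD = 1_000;

    static class FillTask extends RecursiveAction {
        final long[] array;
        final int lo, hi; // this task owns indices [lo, hi) and nothing else

        FillTask(long[] array, int lo, int hi) {
            this.array = array;
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected void compute() {
            if (hi - lo <= THRESHOLD) {
                for (int i = lo; i < hi; i++) array[i] = 2L * i;
                return;
            }
            int mid = (lo + hi) >>> 1;
            // The two subtasks get disjoint segments; invokeAll returns
            // only after both have completed.
            invokeAll(new FillTask(array, lo, mid), new FillTask(array, mid, hi));
        }
    }

    static long[] fill(int n) {
        long[] array = new long[n];
        // invoke() blocks until the root task is done, so the caller then
        // sees every segment as written by its task.
        ForkJoinPool.commonPool().invoke(new FillTask(array, 0, n));
        return array;
    }

    public static void main(String[] args) {
        long[] array = fill(5000);
        System.out.println(array[4999]); // prints 9998
    }
}
```

The design choice that matters here is the ownership of index ranges: correctness comes from the segments never overlapping, and the library only has to publish the array to each task at the start and publish the results back at the join.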
