
Java Stream API lazy evaluation internals

I'm writing an article about the Java Stream API. I've read the whole package documentation for Stream, and I've looked through similar questions here.

If I said the following: "Intermediate operations on a stream are not evaluated until the terminal operation is hit, which will actually perform them," would I be correct? I've seen mixed answers on StackOverflow. Also, each intermediate operation returns a Stream, so I was wondering if it just returned itself and then just kept track of the intermediate operations to perform as it went along. Is this what is meant by "lazy evaluation/execution"?

The following Javadoc is giving me mixed signals, honestly. Maybe I'm just stupid.

Intermediate operations return a new stream. They are always lazy; executing an intermediate operation such as filter() does not actually perform any filtering, but instead creates a new stream that, when traversed, contains the elements of the initial stream that match the given predicate. Traversal of the pipeline source does not begin until the terminal operation of the pipeline is executed.

Thanks so much for your time, and sorry if I'm missing something. I've been stuck on this for a long time!

PS: The question "Stream and lazy evaluation" is very similar, but it says: "Each intermediate operation creates a new stream, stores the provided operation/function and return the new stream." So basically, my question is: by "new stream", does it mean the same stream that it was given?

so I was wondering if it just returned itself

The intermediate operations do not return the stream that they are given. As the Javadoc says, they return a new stream, i.e. a new object.

if it's a new object, how does it know about the previous operations done on the stream?

...you might ask.

Well, objects can have references to other objects, right? Even if you return a new Stream that does "mapping", it can still have a reference to the "old stream" (aka the upstream). Very simplified example:

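import java.util.function.Function;
import java.util.stream.Stream;

// Simplified sketch: a stream stage that remembers where its elements come from.
class StreamThatDoesMapping<U, T> implements Stream<T> {
    // The stream this stage gets its elements from.
    private final Stream<U> upstream;
    // The function applied to each upstream element.
    private final Function<? super U, ? extends T> mappingFunction;

    public StreamThatDoesMapping(Stream<U> upstream, Function<? super U, ? extends T> mappingFunction) {
        this.upstream = upstream;
        this.mappingFunction = mappingFunction;
    }

    // implementation details...
}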

Clearly, when implementing map, if you pass the upstream (this) to StreamThatDoesMapping, the new stream will "know about" the operations that you did before! This is how you might implement map, again very simplified:

public <R> Stream<R> map(Function<? super T, ? extends R> mapper) {
    return new StreamThatDoesMapping<>(this, mapper);
}

When you do the terminal operation, you are doing it on the "most downstream" Stream object. That object will get elements from its upstream, and that upstream will get elements from its upstream, and so on. Kind of like a linked list, isn't it?
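To make the "linked list" idea concrete, here is a minimal runnable sketch. MiniStream and MiniPipelineDemo are hypothetical toy types invented for this illustration, not JDK classes, and the real JDK implementation is considerably more involved. The sketch only assumes that each stage keeps a reference to its upstream and that elements are pulled through an Iterator when the terminal operation runs.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Toy illustration only: MiniStream is a hypothetical type, not the JDK's Stream.
final class MiniPipelineDemo {

    interface MiniStream<T> {
        // Hands out elements on demand; nothing runs until someone iterates.
        Iterator<T> iterator();

        // Intermediate operation: returns a NEW stage that remembers this stage as its upstream.
        default <R> MiniStream<R> map(Function<? super T, ? extends R> mapper) {
            MiniStream<T> upstream = this;
            return () -> {
                // Only reached when the downstream stage is traversed.
                Iterator<T> it = upstream.iterator();
                return new Iterator<R>() {
                    @Override public boolean hasNext() { return it.hasNext(); }
                    @Override public R next() { return mapper.apply(it.next()); }
                };
            };
        }

        // Terminal operation: walks the chain by pulling from the most downstream stage.
        default List<T> toList() {
            List<T> out = new ArrayList<>();
            iterator().forEachRemaining(out::add);
            return out;
        }

        // Source stage: wraps an existing list.
        static <T> MiniStream<T> of(List<T> source) {
            return source::iterator;
        }
    }

    // Builds a two-stage pipeline and records when each mapper actually runs.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        MiniStream<Integer> source = MiniStream.of(List.of(1, 2, 3));
        MiniStream<Integer> doubled = source.map(x -> { log.add("double " + x); return x * 2; });
        MiniStream<String> labelled = doubled.map(x -> { log.add("label " + x); return "#" + x; });
        log.add("pipeline built");
        List<String> result = labelled.toList();
        log.add("result " + result);
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In this sketch, "pipeline built" is logged before any mapper runs, and each element then travels through both stages before the next one is pulled.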

Note that this is only one way for the downstream to know about the upstream. There could be others, but this is what the ReferencePipeline implementation uses. ReferencePipeline is what most stream() methods in the JDK return. I strongly recommend checking out its source code.

If I said the following: "Intermediate operations on a stream are not evaluated until the terminal operation is hit, which will actually perform them," would I be correct?

Yes.
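You can check this yourself with a small experiment. The sketch below uses only standard JDK calls; the class name LazyEvaluationDemo and the log list are just illustrative choices. The lambdas record when they are invoked, so you can see that nothing happens until collect is called.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative demo class; the name and the logging approach are arbitrary.
final class LazyEvaluationDemo {

    // Returns the order in which things happened.
    static List<String> run() {
        List<String> log = new ArrayList<>();

        // Building the pipeline: neither lambda is invoked here.
        Stream<Integer> pipeline = Stream.of(1, 2, 3, 4)
            .filter(n -> { log.add("filter " + n); return n % 2 == 0; })
            .map(n -> { log.add("map " + n); return n * 10; });

        log.add("before terminal operation");

        // The terminal operation starts the traversal of the source.
        List<Integer> result = pipeline.collect(Collectors.toList());
        log.add("result " + result);
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

For this sequential stream, "before terminal operation" comes first in the log, followed by the filter and map calls interleaved per element (filter 1, filter 2, map 2, and so on), not all filtering followed by all mapping.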

I was wondering if it just returned itself and then just kept track of the intermediate operations to perform as it went along.

No.

It is returning a new Stream object which is defined in terms of the previous Stream object.
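A quick way to see this, as a sketch using only standard JDK calls (NewStreamDemo is an illustrative name): compare the references, then try to reuse the old stream. The Javadoc only says an implementation may throw IllegalStateException when it detects reuse. The streams that ship with OpenJDK do throw it, and the second method relies on that.

```java
import java.util.stream.Stream;

// Illustrative demo class; relies on the OpenJDK behaviour of rejecting stream reuse.
final class NewStreamDemo {

    // map hands back a different object than the one it was called on.
    static boolean returnsDifferentObject() {
        Stream<String> source = Stream.of("a", "b");
        Stream<String> mapped = source.map(String::toUpperCase);
        return source != mapped;
    }

    // Once a stream has a downstream stage, it cannot be used as an upstream again.
    static boolean reusingSourceFails() {
        Stream<String> source = Stream.of("a", "b");
        source.map(String::toUpperCase); // links source to a new downstream stage
        try {
            source.map(String::toLowerCase); // second use of the same stream object
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("different object: " + returnsDifferentObject());
        System.out.println("reuse rejected:   " + reusingSourceFails());
    }
}
```

If map simply returned the stream it was given, the first check would be false and a second call on the same variable would just add another operation instead of failing.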
