
Invoking .map() on an infinite stream?

According to the Java SE 8 Javadocs, Stream.map() does the following:

Returns a stream consisting of the results of applying the given function to the elements of this stream.

However, a book I'm reading on networking (Learning Network Programming with Java, Richard M. Reese) implements roughly the following code snippet in an echo server.

Supplier<String> inputLine = () -> {
    try {
        return br.readLine();
    } catch(IOException e) {
        e.printStackTrace();
        return null;
    }
};

Stream.generate(inputLine).map((msg) -> {
    System.out.println("Received: " + (msg == null ? "end of stream" : msg));
    out.println("echo: " + msg);
    return msg;
}).allMatch((msg) -> msg != null);

This is supposed to be a functional way to read user input and echo it back over the socket. It works as intended, but I don't quite understand how. Is it because map knows the stream is infinite, so it executes lazily as new stream elements become available? Adding something to a collection that map is currently iterating over seems like a bit of black magic. Could someone please help me understand what is going on behind the scenes?


Here is how I restated this in order to avoid the confusing map usage. I believe the author was trying to avoid an infinite loop since you can't break out of a forEach.

Stream.generate(inputLine).allMatch((msg) -> {
    boolean alive = msg != null;
    System.out.println("Received: " + (alive ? msg : "end of stream"));
    out.println("echo: " + msg);

    return alive;
});

Streams are lazy. Think of them as workers in a chain that pass buckets to each other. The laziness is in the fact that they will only ask the worker behind them for the next bucket if the worker in front of them asks them for it.

So it's best to think about this as allMatch (a terminal operation, and therefore eager) asking the map stream for the next item, the map stream asking the generate stream for the next item, and the generate stream going to its supplier and providing that item as soon as it arrives.

It stops when allMatch stops asking for items, and it does so as soon as it knows the answer. Are all items in this stream non-null? As soon as allMatch receives an item that is null, it knows the answer is false, so it finishes and does not ask for any more items. Because the stream is infinite, it will not stop otherwise.

So you have two factors causing this to work the way it works: one is allMatch eagerly asking for the next item (as long as the previous ones weren't null), and the other is the generate stream, which may need to wait for its supplier, which in turn waits for the user to send more input, before it can supply that next item.
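To make that pull order visible, here is a minimal sketch that runs the same pipeline shape over in-memory input instead of a socket and records each step. The class name LazyPullDemo, the trace method, and the event labels are made up for this illustration; the side effects inside map and allMatch are there only for tracing.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Stream;

// Hypothetical demo class, not from the book or the Javadocs.
class LazyPullDemo {

    // Runs generate -> map -> allMatch over the given text and
    // returns the order in which the three stages were invoked.
    static List<String> trace(String input) {
        List<String> events = new ArrayList<>();
        BufferedReader br = new BufferedReader(new StringReader(input));

        Supplier<String> inputLine = () -> {
            try {
                String line = br.readLine(); // null once the input is exhausted
                events.add("supply:" + line);
                return line;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };

        boolean allNonNull = Stream.generate(inputLine)
                .map(msg -> {
                    events.add("map:" + msg); // tracing only
                    return msg;
                })
                .allMatch(msg -> {
                    events.add("test:" + msg); // tracing only
                    return msg != null;
                });

        events.add("result:" + allNonNull);
        return events;
    }

    public static void main(String[] args) {
        trace("a\nb\n").forEach(System.out::println);
    }
}
```

For the input "a\nb\n", the recorded events come in groups of supply, map, test for each line, then once more for the null at end of input, followed by result:false. No element is supplied before the previous one has been tested.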

But it should be said that map shouldn't have been used here. There should not be side effects in map; it should be used for mapping an item of one type to an item of another type. I think this example was used only as a study aid. A much simpler and more straightforward way would be to use BufferedReader's lines() method, which gives you a finite Stream of the lines coming from the buffered reader.
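A minimal sketch of that lines() approach follows, assuming the reader and writer would normally wrap the socket's streams. The class LinesEchoDemo and its methods are hypothetical names for this illustration, and in-memory streams stand in for the socket.

```java
import java.io.BufferedReader;
import java.io.PrintWriter;
import java.io.StringReader;
import java.io.StringWriter;

// Hypothetical demo class; in a real echo server the reader and
// writer would be built from the socket's input and output streams.
class LinesEchoDemo {

    // Echoes every line until the reader reports end of stream.
    // lines() ends on its own, so no null check is needed.
    static void echo(BufferedReader in, PrintWriter out) {
        in.lines().forEach(line -> {
            System.out.println("Received: " + line);
            out.println("echo: " + line);
        });
        System.out.println("Received: end of stream");
    }

    // Test helper: runs echo over in-memory text and returns what was written.
    static String echoToString(String input) {
        StringWriter sink = new StringWriter();
        PrintWriter out = new PrintWriter(sink);
        echo(new BufferedReader(new StringReader(input)), out);
        out.flush();
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.print(echoToString("hi\nbye\n"));
    }
}
```

Here forEach is the terminal operation, and the side effects live there rather than in map. An I/O failure while reading surfaces as an UncheckedIOException from the stream.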

Yes, Streams are set up lazily, and nothing is evaluated until you perform a terminal operation (final action) on the Stream. Or, more simply:

For as long as the operations on your stream return another Stream, you do not have a terminal operation, and you keep on chaining until you have something returning anything other than a Stream, including void.

This makes sense, as to be able to return anything other than a Stream, the operations earlier in your stream will need to be evaluated to actually be able to provide the data.

In this case, and as per the documentation, allMatch returns a boolean, and thus the stream has to be executed to calculate that boolean. This is also the point where you provide the Predicate that limits your otherwise infinite Stream.
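As a small illustration of that rule, the sketch below counts how often a map function is called before and after a terminal operation. The class TerminalTriggerDemo and its method are hypothetical names chosen for this example.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class for this illustration.
class TerminalTriggerDemo {

    // Returns { calls made before the terminal operation, calls made after it }.
    static int[] callsBeforeAndAfter() {
        AtomicInteger calls = new AtomicInteger();

        // map returns a Stream, so this line only describes the work.
        Stream<String> pipeline = Stream.of("a", "b", "c")
                .map(s -> {
                    calls.incrementAndGet();
                    return s.toUpperCase();
                });
        int before = calls.get();

        // collect returns a List, so the pipeline has to run now.
        List<String> result = pipeline.collect(Collectors.toList());
        int after = calls.get();

        System.out.println(result + " before=" + before + " after=" + after);
        return new int[] { before, after };
    }

    public static void main(String[] args) {
        callsBeforeAndAfter();
    }
}
```

The counter stays at 0 until collect is called and reaches 3 afterwards, one call per element.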

Also note that in the documentation it states:

This is a short-circuiting terminal operation.

The linked Javadoc section has more information on those terminal operations, but a terminal operation basically means that the pipeline is actually executed. The 'short-circuiting' aspect of the method is what limits your infinite stream: allMatch can return as soon as one element fails the predicate.
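The short-circuiting part can be seen in isolation with an infinite stream of integers. This is only a sketch; the class ShortCircuitDemo, its method, and the parameter name firstFailing are invented for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Hypothetical demo class for this illustration.
class ShortCircuitDemo {

    // Counts how many elements of the infinite stream 1, 2, 3, ...
    // allMatch examines before it can answer "are all < firstFailing?".
    static int elementsExamined(int firstFailing) {
        AtomicInteger examined = new AtomicInteger();

        boolean all = Stream.iterate(1, n -> n + 1)
                .allMatch(n -> {
                    examined.incrementAndGet(); // counting only
                    return n < firstFailing;
                });

        System.out.println("allMatch=" + all + ", examined=" + examined.get());
        return examined.get();
    }

    public static void main(String[] args) {
        elementsExamined(5);
    }
}
```

With firstFailing set to 5, the predicate sees 1 through 5 and then stops, so the call returns even though the source never ends. With a predicate that never fails, the same pipeline would not terminate.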

Here are the two most relevant passages of the documentation. The snippet you provided is a perfect example of them working together:

  • The documentation for Stream::generate(Supplier<T> s) says:

    Returns an infinite sequential unordered stream where each element is generated by the provided Supplier.

  • The third bullet point of the Stream package documentation:

    Laziness-seeking. Many stream operations, such as filtering, mapping, or duplicate removal, can be implemented lazily, exposing opportunities for optimization. For example, "find the first String with three consecutive vowels" need not examine all the input strings. Stream operations are divided into intermediate (Stream-producing) operations and terminal (value- or side-effect-producing) operations. Intermediate operations are always lazy.

In short, the generated stream waits for further elements until the terminal operation has its answer. As long as the supplied Supplier<T> keeps returning elements that satisfy the predicate, the stream pipeline continues.

As an example, if you provide the following Supplier, which never returns null, the execution has no chance to stop and will continue indefinitely:

Supplier<String> inputLine = () -> {
    return "Hello world";
};
