
Why are certain Collectors in the Java Stream API called downstream collectors?

I want to know why certain collectors are classified as "downstream". Is there an upstream Collector, then? Please note that this is not about usage; I'm trying to understand the logic behind the term "downstream". To me, when you use the Stream API, everything further down the call chain looks like it is downstream anyway.

List<String> list = List.of("AAA", "B", "CCCCC", "DDD", "FFFFFF", "AAA");
List<Integer> res =
        list.stream()
                .collect(
                        Collectors.mapping(s -> s.length(), // String -> Integer
                                Collectors.toList()));      // downstream collector

In the above code, Collectors.toList() is regarded as downstream.

In the documentation, "downstream" refers to a Collector that is passed as an argument to another Collector. The argument works downstream of (that is, after) the Collector that accepts it: the outer Collector does its part first and then hands each element on to the Collector it was given. In other words, the downstream Collector is applied to the output of the upstream Collector.

In your example, Collectors.toList is downstream of Collectors.mapping.
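To make the "accepts it and hands elements on to it" relationship concrete, here is a rough sketch of how a mapping-style collector could be written on top of its downstream. The names MappingSketch, myMapping and lengths are hypothetical and exist only for illustration; this is not the JDK source, and it drops the downstream's characteristics for brevity.

```java
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class MappingSketch {

    // Hypothetical stand-in for Collectors.mapping, for illustration only.
    static <T, U, A, R> Collector<T, A, R> myMapping(
            Function<? super T, ? extends U> mapper,
            Collector<? super U, A, R> downstream) {
        BiConsumer<A, ? super U> downstreamAccumulator = downstream.accumulator();
        return Collector.of(
                downstream.supplier(),   // the container belongs to the downstream
                // Transform the element, then pass it on to the downstream collector.
                (container, element) ->
                        downstreamAccumulator.accept(container, mapper.apply(element)),
                downstream.combiner(),
                downstream.finisher());  // characteristics omitted for brevity
    }

    static List<Integer> lengths(List<String> words) {
        return words.stream().collect(myMapping(String::length, Collectors.toList()));
    }

    public static void main(String[] args) {
        System.out.println(lengths(List.of("AAA", "B", "CCCCC"))); // [3, 1, 5]
    }
}
```

Note that the outer collector never builds a container of its own here: everything it does ends in a call to the downstream's accumulator, which is why toList sits "after" it.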

I often imagine the Stream API as a production line. Raw materials arrive from somewhere (ArrayList.stream, IntStream.range, Stream.of, whatever) on a conveyor belt. Intermediate methods then transform them (map, flatMap, etc.) and filter them (filter, limit, etc.), and finally they reach the end of the line, where they are assembled into one final product (collect)*.

Collectors are the "machines" that build these final products. toList builds a List, toSet builds a Set, and so on. However, some collectors don't fully build the final product, e.g. groupingBy. groupingBy only groups the materials by a key and then spits the items out again, as groups, back onto the conveyor belt. These collectors need another collector further down the production line (i.e. down the stream) to continue building the final product.
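As a small illustration of a grouping "machine" that needs a second machine after it, the sketch below groups the question's strings by length and lets a downstream counting collector finish each group. The class and method names (GroupingDownstream, countByLength) are made up for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingDownstream {

    // groupingBy sorts the words into groups by length;
    // the downstream counting() collector turns each group into a count.
    static Map<Integer, Long> countByLength(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(String::length, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> list = List.of("AAA", "B", "CCCCC", "DDD", "FFFFFF", "AAA");
        // Expected contents: 1 -> 1, 3 -> 3, 5 -> 1, 6 -> 1
        System.out.println(countByLength(list));
    }
}
```

The one-argument groupingBy(classifier) follows the same pattern; it simply uses toList() as its downstream.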

mapping is another one of those collectors that doesn't completely build the final product. It merely transforms the materials and spits them out again, much like map. Its usefulness shows when you want to, e.g., transform the groups spat out by a groupingBy. In other words, it's mostly useful as the downstream of another collector.
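A sketch of that typical use, with mapping in the middle of a three-stage line: groupingBy groups the question's strings by first letter, mapping turns each string into its length, and toList builds the final list per group. GroupThenMap and lengthsByFirstLetter are hypothetical names chosen for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupThenMap {

    // groupingBy -> mapping -> toList: each collector is the downstream of the one before it.
    static Map<Character, List<Integer>> lengthsByFirstLetter(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(
                        s -> s.charAt(0),                       // group by first letter
                        Collectors.mapping(String::length,      // String -> Integer within each group
                                Collectors.toList())));         // build the list for each group
    }

    public static void main(String[] args) {
        List<String> list = List.of("AAA", "B", "CCCCC", "DDD", "FFFFFF", "AAA");
        // Expected contents: A -> [3, 3], B -> [1], C -> [5], D -> [3], F -> [6]
        System.out.println(lengthsByFirstLetter(list));
    }
}
```

Here mapping is both a downstream (of groupingBy) and an upstream (of toList), which is the two-way relationship in the conveyor belt picture.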

Is there an upstream Collector then?

Following the production line analogy, the relationship goes both ways: toList is the downstream of mapping, so mapping is the upstream of toList. The official documentation, though, doesn't use the word "upstream" much. I only found it in peek.

* There are other terminal operations, but let's focus on collect, since that is what the question is about.
