
Java 8 Stream: groupingBy with multiple Collectors

I want to use a Java 8 Stream and group by one classifier, but apply multiple Collector functions, so that grouping calculates, for example, both the average and the sum of one field (or maybe of another field).

I will try to simplify this with an example:

public void test() {
    List<Person> persons = new ArrayList<>();
    persons.add(new Person("Person One", 1, 18));
    persons.add(new Person("Person Two", 1, 20));
    persons.add(new Person("Person Three", 1, 30));
    persons.add(new Person("Person Four", 2, 30));
    persons.add(new Person("Person Five", 2, 29));
    persons.add(new Person("Person Six", 3, 18));

    Map<Integer, Data> result = persons.stream().collect(
            groupingBy(person -> person.group, multiCollector)
    );
}

class Person {
    String name;
    int group;
    int age;

    // Constructor, getters and setters
}

class Data {
    long average;
    long sum;

    public Data(long average, long sum) {
        this.average = average;
        this.sum = sum;
    }

    // Getters and setters
}

The result should be a Map that associates each group with its aggregated results, like this:

1 => Data(average(18, 20, 30), sum(18, 20, 30))
2 => Data(average(30, 29), sum(30, 29))
3 => ....

This works perfectly fine with a single function like Collectors.counting(), but I would like to chain more than one (ideally an arbitrary number, taken from a List).

List<Collector<Person, ?, ?>>

Is it possible to do something like this?

For the concrete problem of summing and averaging, use collectingAndThen along with summarizingDouble:

Map<Integer, Data> result = persons.stream().collect(
        groupingBy(Person::getGroup, 
                collectingAndThen(summarizingDouble(Person::getAge), 
                        dss -> new Data((long)dss.getAverage(), (long)dss.getSum()))));
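
One detail worth noting in the snippet above: casting the average to long truncates it, so a group with ages 30 and 29 reports 29 instead of 29.5. The following is a minimal runnable sketch of the same idea that keeps the whole statistics object per group. It uses summarizingInt rather than summarizingDouble because age is an int; the class name GroupSummaryDemo, the samplePersons helper, and the getters on Person are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupSummaryDemo {

    // Minimal stand-in for the question's Person class; the getters are assumed.
    static final class Person {
        private final String name;
        private final int group;
        private final int age;

        Person(String name, int group, int age) {
            this.name = name;
            this.group = group;
            this.age = age;
        }

        String getName() { return name; }
        int getGroup() { return group; }
        int getAge() { return age; }
    }

    // Same sample data as in the question.
    static List<Person> samplePersons() {
        return Arrays.asList(
                new Person("Person One", 1, 18),
                new Person("Person Two", 1, 20),
                new Person("Person Three", 1, 30),
                new Person("Person Four", 2, 30),
                new Person("Person Five", 2, 29),
                new Person("Person Six", 3, 18));
    }

    // Groups by Person.group and keeps the full statistics object per group,
    // so count, sum, min, max and average all come from a single pass.
    static Map<Integer, IntSummaryStatistics> summarize(List<Person> persons) {
        return persons.stream().collect(
                Collectors.groupingBy(Person::getGroup,
                        Collectors.summarizingInt(Person::getAge)));
    }

    public static void main(String[] args) {
        summarize(samplePersons()).forEach((group, stats) ->
                System.out.println(group + " => sum=" + stats.getSum()
                        + ", average=" + stats.getAverage()));
    }
}
```

Returning IntSummaryStatistics directly avoids a custom Data class when sum, average, count, min and max are all that is needed.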

For the more generic problem (collecting various things about your Persons), you can create a complex collector like this:

// Individual collectors are defined here
List<Collector<Person, ?, ?>> collectors = Arrays.asList(
        Collectors.averagingInt(Person::getAge),
        Collectors.summingInt(Person::getAge));

@SuppressWarnings("unchecked")
Collector<Person, List<Object>, List<Object>> complexCollector = Collector.of(
    () -> collectors.stream().map(Collector::supplier)
        .map(Supplier::get).collect(toList()),
    (list, e) -> IntStream.range(0, collectors.size()).forEach(
        i -> ((BiConsumer<Object, Person>) collectors.get(i).accumulator()).accept(list.get(i), e)),
    (l1, l2) -> {
        IntStream.range(0, collectors.size()).forEach(
            i -> l1.set(i, ((BinaryOperator<Object>) collectors.get(i).combiner()).apply(l1.get(i), l2.get(i))));
        return l1;
    },
    list -> {
        IntStream.range(0, collectors.size()).forEach(
            i -> list.set(i, ((Function<Object, Object>)collectors.get(i).finisher()).apply(list.get(i))));
        return list;
    });

Map<Integer, List<Object>> result = persons.stream().collect(
        groupingBy(Person::getGroup, complexCollector)); 

The map values are lists in which the first element is the result of the first collector, the second element is the result of the second collector, and so on. You can add a custom finisher step using Collectors.collectingAndThen(complexCollector, list -> ...) to convert this list into something more appropriate.
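
As a sketch of that finisher step, the example below wraps a list-based collector in collectingAndThen and converts the positional List<Object> into a typed result. The combine helper is a hypothetical, loop-based restatement of the complex collector above, and this Data class (with a double average) is an assumption made so the average is not truncated. The casts in the finisher rely on averagingInt producing a Double and summingInt producing an Integer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class ListToDataDemo {

    // Minimal stand-ins for the question's classes (assumed shape).
    static final class Person {
        final String name; final int group; final int age;
        Person(String name, int group, int age) { this.name = name; this.group = group; this.age = age; }
        int getGroup() { return group; }
        int getAge() { return age; }
    }

    static final class Data {
        final double average; final long sum;
        Data(double average, long sum) { this.average = average; this.sum = sum; }
    }

    // Hypothetical helper: one slot of intermediate state per nested collector.
    @SuppressWarnings("unchecked")
    static <T> Collector<T, List<Object>, List<Object>> combine(List<Collector<T, ?, ?>> collectors) {
        return Collector.of(
                () -> {
                    List<Object> states = new ArrayList<>();
                    for (Collector<T, ?, ?> c : collectors) {
                        states.add(c.supplier().get());
                    }
                    return states;
                },
                (states, element) -> {
                    for (int i = 0; i < collectors.size(); i++) {
                        ((BiConsumer<Object, T>) collectors.get(i).accumulator())
                                .accept(states.get(i), element);
                    }
                },
                (left, right) -> {
                    for (int i = 0; i < collectors.size(); i++) {
                        left.set(i, ((BinaryOperator<Object>) collectors.get(i).combiner())
                                .apply(left.get(i), right.get(i)));
                    }
                    return left;
                },
                states -> {
                    for (int i = 0; i < collectors.size(); i++) {
                        states.set(i, ((Function<Object, Object>) collectors.get(i).finisher())
                                .apply(states.get(i)));
                    }
                    return states;
                });
    }

    static List<Person> samplePersons() {
        return Arrays.asList(
                new Person("Person One", 1, 18), new Person("Person Two", 1, 20),
                new Person("Person Three", 1, 30), new Person("Person Four", 2, 30),
                new Person("Person Five", 2, 29), new Person("Person Six", 3, 18));
    }

    static Map<Integer, Data> summarize(List<Person> persons) {
        List<Collector<Person, ?, ?>> collectors = Arrays.asList(
                Collectors.averagingInt(Person::getAge),
                Collectors.summingInt(Person::getAge));
        // The finisher turns the positional list into a typed Data object.
        Collector<Person, ?, Data> toData = Collectors.collectingAndThen(
                combine(collectors),
                results -> new Data((Double) results.get(0), (Integer) results.get(1)));
        return persons.stream().collect(Collectors.groupingBy(Person::getGroup, toData));
    }

    public static void main(String[] args) {
        summarize(samplePersons()).forEach((group, data) ->
                System.out.println(group + " => average=" + data.average + ", sum=" + data.sum));
    }
}
```

The positional casts are the weak point of this design: reordering the collectors list silently breaks the finisher, so the list and the finisher should be defined next to each other.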

By using a map as the output type, one could have a potentially unbounded list of reducers, each producing its own statistic and adding it to the map.

public static <K, V> Map<K, V> addMap(Map<K, V> map, K k, V v) {
    // Copy instead of mutating, so the identity map passed to reduce() is never modified.
    Map<K, V> mapout = new HashMap<>(map);
    mapout.put(k, v);
    return mapout;
}

... ...

    List<Person> persons = new ArrayList<>();
    persons.add(new Person("Person One", 1, 18));
    persons.add(new Person("Person Two", 1, 20));
    persons.add(new Person("Person Three", 1, 30));
    persons.add(new Person("Person Four", 2, 30));
    persons.add(new Person("Person Five", 2, 29));
    persons.add(new Person("Person Six", 3, 18));

    List<BiFunction<Map<String, Integer>, Person, Map<String, Integer>>> listOfReducers = new ArrayList<>();

    listOfReducers.add((m, p) -> addMap(m, "Count", Optional.ofNullable(m.get("Count")).orElse(0) + 1));
    listOfReducers.add((m, p) -> addMap(m, "Sum", Optional.ofNullable(m.get("Sum")).orElse(0) + p.group));

    BiFunction<Map<String, Integer>, Person, Map<String, Integer>> applyList
            = (mapin, p) -> {
                Map<String, Integer> mapout = mapin;
                for (BiFunction<Map<String, Integer>, Person, Map<String, Integer>> f : listOfReducers) {
                    mapout = f.apply(mapout, p);
                }
                return mapout;
            };
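    BinaryOperator<Map<String, Integer>> combineMaps
            = (map1, map2) -> {
                // Add values key by key; putAll alone would overwrite map1's partial
                // results with map2's when the stream runs in parallel.
                Map<String, Integer> mapout = new HashMap<>(map1);
                map2.forEach((k, v) -> mapout.merge(k, v, Integer::sum));
                return mapout;
            };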
    Map<String, Integer> map
            = persons
            .stream()
            .reduce(new HashMap<String, Integer>(),
                    applyList, combineMaps);
    System.out.println("map = " + map);

Produces:

map = {Sum=10, Count=6}

You could chain them. A collector can only produce one object, but that object can hold multiple values. For example, you could return a Map that has one entry for each collector you combine.

You can use Collector.of(HashMap::new, accumulator, combiner);

Your accumulator would hold a Map of Collectors, where the keys of the resulting Map match the names of the Collectors. The combiner would need a way to combine multiple results, especially when the stream is processed in parallel.
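
A simplified sketch of this idea follows. Instead of holding a Map of nested Collectors, it accumulates two named statistics directly into a HashMap, which keeps the example short; the keys "Count" and "Sum", the class name MapCollectorDemo, and the getters on Person are assumptions. The combiner adds values key by key, so partial maps from parallel execution merge correctly.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class MapCollectorDemo {

    // Minimal stand-in for the question's Person class (assumed shape).
    static final class Person {
        final String name; final int group; final int age;
        Person(String name, int group, int age) { this.name = name; this.group = group; this.age = age; }
        int getGroup() { return group; }
        int getAge() { return age; }
    }

    // One collector, one result object (a Map), several named statistics inside it.
    static Collector<Person, Map<String, Integer>, Map<String, Integer>> countAndSumOfAges() {
        return Collector.of(
                HashMap::new,
                (stats, person) -> {
                    stats.merge("Count", 1, Integer::sum);
                    stats.merge("Sum", person.getAge(), Integer::sum);
                },
                (left, right) -> {
                    // Merge partial results by adding the values of matching keys.
                    right.forEach((key, value) -> left.merge(key, value, Integer::sum));
                    return left;
                });
    }

    static List<Person> samplePersons() {
        return Arrays.asList(
                new Person("Person One", 1, 18), new Person("Person Two", 1, 20),
                new Person("Person Three", 1, 30), new Person("Person Four", 2, 30),
                new Person("Person Five", 2, 29), new Person("Person Six", 3, 18));
    }

    static Map<Integer, Map<String, Integer>> summarize(List<Person> persons) {
        return persons.stream().collect(
                Collectors.groupingBy(Person::getGroup, countAndSumOfAges()));
    }

    public static void main(String[] args) {
        System.out.println(summarize(samplePersons()));
    }
}
```

This shape only works when every statistic shares one value type and can be merged by the same operation; statistics such as an average need a richer result type.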


Generally, the built-in collectors use a dedicated data type for complex results.

From Collectors:

public static <T>
Collector<T, ?, DoubleSummaryStatistics> summarizingDouble(ToDoubleFunction<? super T> mapper) {
    return new CollectorImpl<T, DoubleSummaryStatistics, DoubleSummaryStatistics>(
            DoubleSummaryStatistics::new,
            (r, t) -> r.accept(mapper.applyAsDouble(t)),
            (l, r) -> { l.combine(r); return l; }, CH_ID);
}

and in its own class:

public class DoubleSummaryStatistics implements DoubleConsumer {
    private long count;
    private double sum;
    private double sumCompensation; // Low order bits of sum
    private double simpleSum; // Used to compute right sum for non-finite inputs
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;

Instead of chaining the collectors, you should build an abstraction that aggregates them: implement the Collector interface with a class that accepts a list of collectors and delegates each method invocation to every one of them. Then, at the end, you return a new Data() holding all the results the nested collectors produced.

You can avoid creating a custom class with all the method declarations by using Collector.of(supplier, accumulator, combiner, finisher, Collector.Characteristics... characteristics). The finisher lambda calls the finisher of each nested collector and then returns the Data instance.
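
A minimal sketch of such an aggregator for exactly two nested collectors, built with Collector.of, might look like the following. The helper name both, the local Pair holder, and this Data class (with a double average) are assumptions made for illustration. On Java 12 and later, the built-in Collectors.teeing covers this two-collector case.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class PairedCollectorDemo {

    // Minimal stand-ins for the question's classes (assumed shape).
    static final class Person {
        final String name; final int group; final int age;
        Person(String name, int group, int age) { this.name = name; this.group = group; this.age = age; }
        int getGroup() { return group; }
        int getAge() { return age; }
    }

    static final class Data {
        final double average; final long sum;
        Data(double average, long sum) { this.average = average; this.sum = sum; }
    }

    // Hypothetical helper: feeds every element to both collectors and merges their results.
    static <T, A1, A2, R1, R2, R> Collector<T, ?, R> both(
            Collector<T, A1, R1> first,
            Collector<T, A2, R2> second,
            BiFunction<? super R1, ? super R2, R> merger) {
        // Mutable holder for the intermediate state of each nested collector.
        final class Pair {
            A1 left = first.supplier().get();
            A2 right = second.supplier().get();
        }
        return Collector.<T, Pair, R>of(
                Pair::new,
                (pair, element) -> {
                    first.accumulator().accept(pair.left, element);
                    second.accumulator().accept(pair.right, element);
                },
                (p1, p2) -> {
                    p1.left = first.combiner().apply(p1.left, p2.left);
                    p1.right = second.combiner().apply(p1.right, p2.right);
                    return p1;
                },
                // Call each nested finisher, then build the combined result.
                pair -> merger.apply(
                        first.finisher().apply(pair.left),
                        second.finisher().apply(pair.right)));
    }

    static List<Person> samplePersons() {
        return Arrays.asList(
                new Person("Person One", 1, 18), new Person("Person Two", 1, 20),
                new Person("Person Three", 1, 30), new Person("Person Four", 2, 30),
                new Person("Person Five", 2, 29), new Person("Person Six", 3, 18));
    }

    static Map<Integer, Data> summarize(List<Person> persons) {
        Collector<Person, ?, Data> perGroup = both(
                Collectors.averagingInt(Person::getAge),
                Collectors.summingInt(Person::getAge),
                (average, sum) -> new Data(average, sum));
        return persons.stream().collect(Collectors.groupingBy(Person::getGroup, perGroup));
    }

    public static void main(String[] args) {
        summarize(samplePersons()).forEach((group, data) ->
                System.out.println(group + " => average=" + data.average + ", sum=" + data.sum));
    }
}
```

Unlike the List<Object> approach, this version stays fully typed, at the cost of supporting a fixed number of nested collectors; nesting calls to both is one way to combine more than two.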
