Java: Find multiple min/max attribute values in a stream using lambda

I'm looking for a concise way to find a set of attribute values that are minimal or maximal in a given stream of objects.

For example:

class Dimensions {
    final int startX, startY, endX, endY; //Set by constructor
}

/**
 * For the given dimensions, looks where the dimensions intersect. 
 * These coordinates define the sub-array, which is applied to the given function. 
 * 
 * @return the value returned by applying the sub-array in the given dimensions to the given function
 */
<S, T> T performOnIntersections(Function<S, T> function, S[][] inputArray, Dimensions...dimensions){

    int maxStartX = Arrays.stream(dimensions).max(Comparator.comparingInt(d -> d.startX)).get().startX;
    int maxStartY = Arrays.stream(dimensions).max(Comparator.comparingInt(d -> d.startY)).get().startY;
    int minEndX = Arrays.stream(dimensions).min(Comparator.comparingInt(d -> d.endX)).get().endX;
    int minEndY = Arrays.stream(dimensions).min(Comparator.comparingInt(d -> d.endY)).get().endY;

    return applyInBetween(inputArray, function, maxStartX, maxStartY, minEndX, minEndY);
}

This is very redundant, as I have to create a new stream for every minimal/maximal attribute I need.

In my use case, a similar method is part of a recursive algorithm with exponential cost, so a concurrent solution that opens the stream just once would be great. Even better would be a solution that works on an existing stream without terminating it (but I doubt that's possible).

Do you have an idea how to improve it?

EDIT: I forgot to mention that Dimensions is immutable, which is relevant when using a Supplier.

EDIT 2: Calling collect() on the stream with lambda expressions, rather than creating an instance of DimensionsMinMaxCollector, has the best runtime performance. jessepeng mentioned it first, so I marked his post as the solution. My implementation is now:

return Arrays.stream(dimensions)
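             // Start from neutral values; new int[4] (all zeros) would skew the min/max results
             .collect(() -> new int[] {Integer.MIN_VALUE, Integer.MAX_VALUE, Integer.MIN_VALUE, Integer.MAX_VALUE},
                      (array, dimension) -> {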
        array[0] = Math.max(array[0], dimension.startX);
        array[1] = Math.min(array[1], dimension.endX);
        array[2] = Math.max(array[2], dimension.startY);
        array[3] = Math.min(array[3], dimension.endY);
}, (a, b) -> {
        a[0] = Math.max(a[0], b[0]);
        a[1] = Math.min(a[1], b[1]);
        a[2] = Math.max(a[2], b[2]);
        a[3] = Math.min(a[3], b[3]);
});

You can use collect() to combine all the elements of the stream into a single Dimensions object that holds the desired values.

From the Stream documentation:

 <R> R collect(Supplier<R> supplier, BiConsumer<R, ? super T> accumulator, BiConsumer<R, R> combiner); 

Performs a mutable reduction operation on the elements of this stream. A mutable reduction is one in which the reduced value is a mutable result container, such as an ArrayList, and elements are incorporated by updating the state of the result rather than by replacing the result. This produces a result equivalent to:

    R result = supplier.get();
    for (T element : this stream)
        accumulator.accept(result, element);
    return result;

So in your case, you would need a supplier that creates a new Dimensions object, and the accumulator and combiner would do the comparing and setting of the values.

Dimensions searchDimensions = Arrays.stream(dimensions).collect(Dimensions::new, (dimension, dimension2) -> {
            dimension.endX = dimension.endX < dimension2.endX ? dimension.endX : dimension2.endX;
            dimension.endY = dimension.endY < dimension2.endY ? dimension.endY : dimension2.endY;
            dimension.startX = dimension.startX > dimension2.startX ? dimension.startX : dimension2.startX;
            dimension.startY = dimension.startY > dimension2.startY ? dimension.startY : dimension2.startY;
        }, (dimension, dimension2) -> {
            dimension.endX = dimension.endX < dimension2.endX ? dimension.endX : dimension2.endX;
            dimension.endY = dimension.endY < dimension2.endY ? dimension.endY : dimension2.endY;
            dimension.startX = dimension.startX > dimension2.startX ? dimension.startX : dimension2.startX;
            dimension.startY = dimension.startY > dimension2.startY ? dimension.startY : dimension2.startY;
        });

return applyInBetween(inputArray, function, searchDimensions.startX, searchDimensions.startY, searchDimensions.endX, searchDimensions.endY);

Edit: Since Dimensions is immutable, it is not suitable for performing a mutable reduction operation. A simple array can be used to store the four values instead.

<S, T> T performOnIntersections(Function<S, T> function, S[][] inputArray, Dimensions...dimensions){

    Supplier<int[]> supplier = () -> new int[]{Integer.MIN_VALUE, Integer.MIN_VALUE, Integer.MAX_VALUE, Integer.MAX_VALUE};
    BiConsumer<int[], Dimensions> accumulator = (array, dim) -> {
        array[0] = dim.startX > array[0] ? dim.startX : array[0];
        array[1] = dim.startY > array[1] ? dim.startY : array[1];
        array[2] = dim.endX < array[2] ? dim.endX : array[2];
        array[3] = dim.endY < array[3] ? dim.endY : array[3];
    };
    BiConsumer<int[], int[]> combiner = (array1, array2) -> {
        array1[0] = array1[0] > array2[0] ? array1[0] : array2[0];
        array1[1] = array1[1] > array2[1] ? array1[1] : array2[1];
        array1[2] = array1[2] < array2[2] ? array1[2] : array2[2];
        array1[3] = array1[3] < array2[3] ? array1[3] : array2[3];
    };

    int[] searchDimensions = Arrays.stream(dimensions).collect(supplier, accumulator, combiner);

    return applyInBetween(inputArray, function, searchDimensions[0], searchDimensions[1], searchDimensions[2], searchDimensions[3]);
}
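
For anyone who wants to try this outside the original method, here is a minimal, self-contained sketch of the same single-pass reduction. The Dimensions constructor, the IntersectionBounds class, and its of method are assumptions made for the example; they are not from the question.

```java
import java.util.Arrays;

// Hypothetical stand-in for the question's immutable Dimensions class.
final class Dimensions {
    final int startX, startY, endX, endY;

    Dimensions(int startX, int startY, int endX, int endY) {
        this.startX = startX;
        this.startY = startY;
        this.endX = endX;
        this.endY = endY;
    }
}

final class IntersectionBounds {

    // Returns {maxStartX, maxStartY, minEndX, minEndY} in a single pass.
    static int[] of(Dimensions... dimensions) {
        return Arrays.stream(dimensions).collect(
            // Neutral start values: any real coordinate replaces them.
            () -> new int[] {Integer.MIN_VALUE, Integer.MIN_VALUE,
                             Integer.MAX_VALUE, Integer.MAX_VALUE},
            // Folds one element into the result array.
            (acc, d) -> {
                acc[0] = Math.max(acc[0], d.startX);
                acc[1] = Math.max(acc[1], d.startY);
                acc[2] = Math.min(acc[2], d.endX);
                acc[3] = Math.min(acc[3], d.endY);
            },
            // Merges two partial results; only needed for parallel streams.
            (a, b) -> {
                a[0] = Math.max(a[0], b[0]);
                a[1] = Math.max(a[1], b[1]);
                a[2] = Math.min(a[2], b[2]);
                a[3] = Math.min(a[3], b[3]);
            });
    }

    public static void main(String[] args) {
        int[] bounds = of(new Dimensions(0, 0, 10, 10),
                          new Dimensions(2, 3, 8, 12));
        System.out.println(Arrays.toString(bounds)); // prints [2, 3, 8, 10]
    }
}
```

Note that with no arguments at all the result is just the neutral array, so callers may want to treat an empty input separately.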

How about a custom collector that collects the elements into an array of length 4:

static class DimensionsMinMaxCollector implements Collector<Dimensions, int[], int[]> {

    // Result layout: {maxStartX, maxStartY, minEndX, minEndY}
    @Override
    public BiConsumer<int[], Dimensions> accumulator() {
        return (array, dim) -> {
            array[0] = Math.max(array[0], dim.startX);
            array[1] = Math.max(array[1], dim.startY);
            array[2] = Math.min(array[2], dim.endX);
            array[3] = Math.min(array[3], dim.endY);
        };
    }

    @Override
    public Set<Characteristics> characteristics() {
        return EnumSet.of(Characteristics.IDENTITY_FINISH);
    }

    // Merges two partial results: maxima for the start values, minima for the end values
    @Override
    public BinaryOperator<int[]> combiner() {
        return (left, right) -> {
            left[0] = Math.max(left[0], right[0]);
            left[1] = Math.max(left[1], right[1]);
            left[2] = Math.min(left[2], right[2]);
            left[3] = Math.min(left[3], right[3]);
            return left;
        };
    }

    @Override
    public Function<int[], int[]> finisher() {
        return Function.identity();
    }

    // Neutral start values, so that negative coordinates are handled correctly
    @Override
    public Supplier<int[]> supplier() {
        return () -> new int[] {Integer.MIN_VALUE, Integer.MIN_VALUE,
                                Integer.MAX_VALUE, Integer.MAX_VALUE};
    }

}
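
If implementing the whole Collector interface feels heavy, Collector.of can build a comparable collector from lambdas. The sketch below is one way it might look; MinMaxCollectors, intersection(), and the Dimensions constructor are hypothetical names, and the array is assumed to hold {maxStartX, maxStartY, minEndX, minEndY}.

```java
import java.util.Arrays;
import java.util.stream.Collector;
import java.util.stream.Stream;

// Hypothetical stand-in for the question's immutable Dimensions class.
final class Dimensions {
    final int startX, startY, endX, endY;

    Dimensions(int startX, int startY, int endX, int endY) {
        this.startX = startX;
        this.startY = startY;
        this.endX = endX;
        this.endY = endY;
    }
}

final class MinMaxCollectors {

    // Collects to {maxStartX, maxStartY, minEndX, minEndY}.
    static Collector<Dimensions, int[], int[]> intersection() {
        return Collector.of(
            // Neutral start values instead of an all-zero array.
            () -> new int[] {Integer.MIN_VALUE, Integer.MIN_VALUE,
                             Integer.MAX_VALUE, Integer.MAX_VALUE},
            (acc, d) -> {
                acc[0] = Math.max(acc[0], d.startX);
                acc[1] = Math.max(acc[1], d.startY);
                acc[2] = Math.min(acc[2], d.endX);
                acc[3] = Math.min(acc[3], d.endY);
            },
            // Combiner for parallel streams; must return the merged container.
            (left, right) -> {
                left[0] = Math.max(left[0], right[0]);
                left[1] = Math.max(left[1], right[1]);
                left[2] = Math.min(left[2], right[2]);
                left[3] = Math.min(left[3], right[3]);
                return left;
            });
    }

    public static void main(String[] args) {
        int[] bounds = Stream.of(new Dimensions(-5, -5, 4, 4),
                                 new Dimensions(-2, -7, 6, 3))
                             .collect(intersection());
        System.out.println(Arrays.toString(bounds)); // prints [-2, -5, 4, 3]
    }
}
```

This Collector.of overload has no finisher argument, so the accumulated array is returned as the result directly.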

If the intended result value is the same property that you are comparing, there is no need to use custom comparators; just map to the property before getting the minimum or maximum. This may have additional benefits in simplicity and efficiency if the property has a primitive type:

<S, T> T performOnIntersections(
         Function<S, T> function, S[][] inputArray, Dimensions...dimensions) {

    int maxStartX = Arrays.stream(dimensions).mapToInt(d -> d.startX).max().getAsInt();
    int maxStartY = Arrays.stream(dimensions).mapToInt(d -> d.startY).max().getAsInt();
    int minEndX = Arrays.stream(dimensions).mapToInt(d -> d.endX).min().getAsInt();
    int minEndY = Arrays.stream(dimensions).mapToInt(d -> d.endY).min().getAsInt();

    return applyInBetween(inputArray, function, maxStartX, maxStartY, minEndX, minEndY);
}

Whether avoiding multiple iterations over an ordinary array has any benefit is unclear. If you want to try it, you can use

<S, T> T performOnIntersections(
         Function<S, T> function, S[][] inputArray, Dimensions...dimensions){

    BiConsumer<Dimensions,Dimensions> join = (d1,d2) -> {
        d1.startX=Math.max(d1.startX, d2.startX);
        d1.startY=Math.max(d1.startY, d2.startY);
        d1.endX=Math.min(d1.endX, d2.endX);
        d1.endY=Math.min(d1.endY, d2.endY);
    };
    Dimensions d = Arrays.stream(dimensions).collect(
        () -> new Dimensions(Integer.MIN_VALUE,Integer.MIN_VALUE,
                             Integer.MAX_VALUE,Integer.MAX_VALUE),
        join, join);

    int maxStartX = d.startX;
    int maxStartY = d.startY;
    int minEndX = d.endX;
    int minEndY = d.endY;

    return applyInBetween(inputArray, function, maxStartX, maxStartY, minEndX, minEndY);
}

The key point is the join function, which adjusts its first argument to be the intersection of the two dimensions. This is called mutable reduction and avoids creating a new Dimensions instance on every evaluation. For this to work, the collect method needs a Supplier as its first argument, which produces a new instance in a neutral initial state, i.e. a Dimensions instance spanning the entire integer range. For this, I assumed that you have a constructor accepting initial startX, startY, endX, endY values. Note that this variant assigns to the fields, so it only compiles if they are not final.
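
If Dimensions has to stay immutable, the same mutable reduction could use a small dedicated container instead, similar in spirit to IntSummaryStatistics. This is only a sketch: Bounds and its methods are made-up names, and the Dimensions constructor is assumed.

```java
import java.util.Arrays;

// Hypothetical stand-in for the question's immutable Dimensions class.
final class Dimensions {
    final int startX, startY, endX, endY;

    Dimensions(int startX, int startY, int endX, int endY) {
        this.startX = startX;
        this.startY = startY;
        this.endX = endX;
        this.endY = endY;
    }
}

// Hypothetical mutable result container; starts in the neutral state.
final class Bounds {
    int startX = Integer.MIN_VALUE, startY = Integer.MIN_VALUE;
    int endX = Integer.MAX_VALUE, endY = Integer.MAX_VALUE;

    // Accumulator: narrows the bounds by one Dimensions element.
    void accept(Dimensions d) {
        startX = Math.max(startX, d.startX);
        startY = Math.max(startY, d.startY);
        endX = Math.min(endX, d.endX);
        endY = Math.min(endY, d.endY);
    }

    // Combiner: merges another partial result into this one.
    void combine(Bounds other) {
        startX = Math.max(startX, other.startX);
        startY = Math.max(startY, other.startY);
        endX = Math.min(endX, other.endX);
        endY = Math.min(endY, other.endY);
    }

    // Runs in parallel here only to exercise the combiner as well.
    static Bounds of(Dimensions... dimensions) {
        return Arrays.stream(dimensions).parallel()
                     .collect(Bounds::new, Bounds::accept, Bounds::combine);
    }
}
```

A call such as Bounds.of(a, b, c) then yields one Bounds object whose fields can be passed on to applyInBetween.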

A non-mutable reduction is also possible:

<S, T> T performOnIntersections(
         Function<S, T> function, S[][] inputArray, Dimensions...dimensions){

    Dimensions d = Arrays.stream(dimensions)
        .reduce((d1,d2) -> new Dimensions(
            Math.max(d1.startX, d2.startX),
            Math.max(d1.startY, d2.startY),
            Math.min(d1.endX, d2.endX),
            Math.min(d1.endY, d2.endY)))
        .get();

    int maxStartX = d.startX;
    int maxStartY = d.startY;
    int minEndX = d.endX;
    int minEndY = d.endY;

    return applyInBetween(inputArray, function, maxStartX, maxStartY, minEndX, minEndY);
}

For smaller arrays, this might be even more efficient (in the special case of a single-element array, it just returns that element). This also works with an immutable version of Dimensions.
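
One thing to keep in mind with reduce: it returns an Optional, so the .get() above throws NoSuchElementException when no Dimensions are passed. A hedged sketch that leaves that decision to the caller (SafeIntersection and the Dimensions constructor are assumptions for illustration):

```java
import java.util.Arrays;
import java.util.Optional;

// Hypothetical stand-in for the question's immutable Dimensions class.
final class Dimensions {
    final int startX, startY, endX, endY;

    Dimensions(int startX, int startY, int endX, int endY) {
        this.startX = startX;
        this.startY = startY;
        this.endX = endX;
        this.endY = endY;
    }
}

final class SafeIntersection {

    // Empty Optional for no input; the element itself for a single input.
    static Optional<Dimensions> of(Dimensions... dimensions) {
        return Arrays.stream(dimensions)
            .reduce((d1, d2) -> new Dimensions(
                Math.max(d1.startX, d2.startX),
                Math.max(d1.startY, d2.startY),
                Math.min(d1.endX, d2.endX),
                Math.min(d1.endY, d2.endY)));
    }

    public static void main(String[] args) {
        System.out.println(of().isPresent()); // prints false
    }
}
```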
