
What is the (kind of) inverse operation to Java's Stream.flatMap()?

The Stream.flatMap() operation transforms a stream of

a, b, c

into a stream that contains zero or more elements for each input element, e.g.

a1, a2, c1, c2, c3

Is there an opposite operation that batches up a few elements into one new one?

  • It is not reduce(), because this produces only one result
  • It is not collect(), because this only fills a container (afaiu)
  • It is not forEach(), because this returns just void and works with side effects

Does it exist? Can I simulate it in any way?
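To make the question concrete, here is a minimal sketch of the flatMap direction described above. The class name FlatMapDemo and the expansion rule in expand() are assumptions made up for illustration; they only reproduce the a, b, c example, where b contributes no elements.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FlatMapDemo {

    // Hypothetical expansion rule: "a" yields two elements, "c" yields three,
    // everything else (here "b") yields an empty stream.
    static Stream<String> expand(String s) {
        switch (s) {
            case "a": return Stream.of("a1", "a2");
            case "c": return Stream.of("c1", "c2", "c3");
            default:  return Stream.empty();
        }
    }

    // Flattens the per-element streams into one stream and collects it.
    static List<String> flatten() {
        return Stream.of("a", "b", "c")
                .flatMap(FlatMapDemo::expand)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(flatten()); // [a1, a2, c1, c2, c3]
    }
}
```

The operation asked for would go the other way: from a1, a2, c1, c2, c3 back to one element per group.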

Finally I figured out that flatMap is its own "inverse", so to speak. I overlooked that flatMap does not necessarily increase the number of elements. It may also decrease the number of elements by emitting an empty stream for some of them. To implement a group-by operation, the function called by flatMap needs minimal internal state, namely the most recent element. It either returns an empty stream or, at the end of a group, returns the group's reduced representative.

Here is a quick implementation where groupBorder must return true if the two elements passed in do not belong to the same group, i.e. the group border lies between them. The combiner is the group function that combines, for example, (1,a), (1,a), (1,a) into (3,a), given that your group elements are tuples (int, string).

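import java.util.function.BiPredicate;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Stream;

// Stateful mapper for flatMap(): intended for sequential, ordered streams only.
public class GroupBy<X> implements Function<X, Stream<X>>{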

  private final BiPredicate<X, X> groupBorder;
  private final BinaryOperator<X> combiner;
  private X latest = null;

  public GroupBy(BiPredicate <X, X> groupBorder,
                 BinaryOperator<X> combiner) {
    this.groupBorder = groupBorder;
    this.combiner = combiner;
  }

  @Override
  public Stream<X> apply(X elem) {
    // TODO: add test on end marker as additional parameter for constructor
    if (elem==null) {
      return latest==null ? Stream.empty() : Stream.of(latest);
    }
    if (latest==null) {
      latest = elem;
      return Stream.empty();
    }
    if (groupBorder.test(latest, elem)) {
      Stream<X> result = Stream.of(latest);
      latest = elem;
      return result;
    }
    latest = combiner.apply(latest,  elem);
    return Stream.empty();
  }
}

There is one caveat though: to ship the last group of the whole stream, an end marker must be appended as the last element of the stream. The above code assumes it is null, but an additional end-marker tester could be added. Because the function keeps state between calls, it also only works on sequential streams.

I could not come up with a solution that does not rely on the end marker.

Furthermore, I did not convert between incoming and outgoing element types. For a unique operation, this just works. For a count operation, a previous step would have to map individual elements to a counting object.
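A usage sketch may help here. It assumes a condensed copy of the GroupBy class from above (nested so the example is self-contained) and a hypothetical helper withEndMarker() that appends the null end marker. The "count" variant uses Map.Entry as the counting object, which is just one possible choice.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class GroupByDemo {

    // Condensed copy of the GroupBy class above, so this example compiles on its own.
    static class GroupBy<X> implements Function<X, Stream<X>> {
        private final BiPredicate<X, X> groupBorder;
        private final BinaryOperator<X> combiner;
        private X latest = null;

        GroupBy(BiPredicate<X, X> groupBorder, BinaryOperator<X> combiner) {
            this.groupBorder = groupBorder;
            this.combiner = combiner;
        }

        @Override
        public Stream<X> apply(X elem) {
            if (elem == null) {                    // end marker: flush the last group
                return latest == null ? Stream.empty() : Stream.of(latest);
            }
            if (latest == null) {                  // very first element
                latest = elem;
                return Stream.empty();
            }
            if (groupBorder.test(latest, elem)) {  // border: emit the finished group
                Stream<X> result = Stream.of(latest);
                latest = elem;
                return result;
            }
            latest = combiner.apply(latest, elem); // same group: keep reducing
            return Stream.empty();
        }
    }

    // Hypothetical helper: appends the null end marker that flushes the final group.
    static <X> Stream<X> withEndMarker(Stream<X> stream) {
        return Stream.concat(stream, Stream.of((X) null));
    }

    // "Unique" operation: keep the first element of each run with the same first letter.
    static List<String> uniqueByFirstChar(Stream<String> in) {
        return withEndMarker(in)
                .flatMap(new GroupBy<String>(
                        (a, b) -> a.charAt(0) != b.charAt(0),
                        (a, b) -> a))
                .collect(Collectors.toList());
    }

    // "Count" operation: a previous map() step turns each element into (letter, 1).
    static List<Map.Entry<Character, Integer>> countRuns(Stream<String> in) {
        Stream<Map.Entry<Character, Integer>> counted =
                in.<Map.Entry<Character, Integer>>map(
                        s -> new SimpleImmutableEntry<>(s.charAt(0), 1));
        return withEndMarker(counted)
                .flatMap(new GroupBy<Map.Entry<Character, Integer>>(
                        (a, b) -> !a.getKey().equals(b.getKey()),
                        (a, b) -> new SimpleImmutableEntry<>(
                                a.getKey(), a.getValue() + b.getValue())))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(uniqueByFirstChar(Stream.of("a1", "a2", "c1", "c2", "c3"))); // [a1, c1]
        System.out.println(countRuns(Stream.of("a1", "a2", "c1", "c2", "c3")));         // [a=2, c=3]
    }
}
```

Note that each GroupBy instance carries state, so a fresh instance is needed for every stream it is applied to.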

You can hack your way around. See the following example:

Stream<List<String>> stream = Stream.of("Cat", "Dog", "Whale", "Mouse")
    .collect(Collectors.collectingAndThen(
        Collectors.partitioningBy(a -> a.length() > 3),
        map -> Stream.of(map.get(true), map.get(false))
    ));

IntStream.range(0, 10)
    .mapToObj(n -> IntStream.of(n, n / 2, n / 3))
    .reduce(IntStream.empty(), IntStream::concat)
    .forEach(System.out::println);

As you see, elements are mapped to Streams too, and then concatenated into one large stream.

Take a look at collapse in StreamEx:

StreamEx.of("a1", "a2", "c1", "c2", "c3").collapse((a, b) -> a.charAt(0) == b.charAt(0))
    .map(e -> e.substring(0, 1)).forEach(System.out::println);

Or my fork with more functions: groupBy, split, sliding ...

StreamEx.of("a1", "a2", "c1", "c2", "c3").collapse((a, b) -> a.charAt(0) == b.charAt(0))
.map(e -> e.substring(0, 1)).forEach(System.out::println);
// a
// c

StreamEx.of("a1", "a2", "c1", "c2", "c3").splitToList(2).forEach(System.out::println);
// [a1, a2]
// [c1, c2]
// [c3]

StreamEx.of("a1", "a2", "c1", "c2", "c3").groupBy(e -> e.charAt(0))
.forEach(System.out::println);
// a=[a1, a2]
// c=[c1, c2, c3]
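For readers who cannot add a library, the groupBy and fixed-size split results shown above can be approximated with the plain JDK. This is only a sketch under assumptions: the class and method names (JdkGrouping, groupByFirstChar, chunks) are invented here, both methods are eager (they need the whole input as a List), and groupingBy collects all elements with the same key, not just adjacent ones as collapse does.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class JdkGrouping {

    // Groups by first character; LinkedHashMap keeps the keys in encounter order.
    static Map<Character, List<String>> groupByFirstChar(List<String> items) {
        return items.stream().collect(Collectors.groupingBy(
                s -> s.charAt(0), LinkedHashMap::new, Collectors.toList()));
    }

    // Splits a list into consecutive chunks of at most `size` elements.
    static <T> List<List<T>> chunks(List<T> items, int size) {
        int chunkCount = (items.size() + size - 1) / size; // ceiling division
        return IntStream.range(0, chunkCount)
                .mapToObj(i -> items.subList(i * size, Math.min(items.size(), (i + 1) * size)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a1", "a2", "c1", "c2", "c3");
        System.out.println(groupByFirstChar(items)); // {a=[a1, a2], c=[c1, c2, c3]}
        System.out.println(chunks(items, 2));        // [[a1, a2], [c1, c2], [c3]]
    }
}
```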

This is what I came up with:

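import java.util.Optional;
import java.util.function.BiFunction;
import java.util.function.BiPredicate;
import java.util.function.BinaryOperator;
import java.util.stream.Stream;

interface OptionalBinaryOperator<T> extends BiFunction<T, T, Optional<T>> {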
  static <T> OptionalBinaryOperator<T> of(BinaryOperator<T> binaryOperator,
          BiPredicate<T, T> biPredicate) {
    return (t1, t2) -> biPredicate.test(t1, t2)
            ? Optional.of(binaryOperator.apply(t1, t2))
            : Optional.empty();
  }
}

class StreamUtils {
  public static <T> Stream<T> reducePartially(Stream<T> stream,
          OptionalBinaryOperator<T> conditionalAccumulator) {
    Stream.Builder<T> builder = Stream.builder();
    stream.reduce((t1, t2) -> conditionalAccumulator.apply(t1, t2).orElseGet(() -> {
      builder.add(t1);
      return t2;
    })).ifPresent(builder::add);
    return builder.build();
  }
}

Unfortunately, I did not have the time to make it lazy, but that can be done by writing a custom Spliterator that delegates to stream.spliterator() and follows the logic above (instead of using stream.reduce(), which is a terminal operation).
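One possible shape of such a lazy variant is sketched below. It is not the author's code: the names LazyStreamUtils and reducePartiallyLazy are assumptions, the accumulator is typed as a plain BiFunction (which an OptionalBinaryOperator also satisfies), and the result is always a sequential stream.

```java
import java.util.List;
import java.util.Optional;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.BiFunction;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class LazyStreamUtils {

    // Hypothetical lazy counterpart of reducePartially: adjacent elements are merged
    // while the accumulator returns a value; an empty Optional closes the current group.
    static <T> Stream<T> reducePartiallyLazy(Stream<T> stream,
            BiFunction<T, T, Optional<T>> conditionalAccumulator) {
        Spliterator<T> source = stream.spliterator();
        Spliterator<T> merging = new Spliterators.AbstractSpliterator<T>(
                source.estimateSize(),                           // upper bound only
                source.characteristics() & Spliterator.ORDERED) {
            private T pending;          // partially reduced current group
            private boolean hasPending; // false until the first element arrives
            private T incoming;         // slot filled by source.tryAdvance

            @Override
            public boolean tryAdvance(Consumer<? super T> action) {
                // Pull from the source only until one group is complete.
                while (source.tryAdvance(t -> incoming = t)) {
                    if (!hasPending) {
                        pending = incoming;
                        hasPending = true;
                        continue;
                    }
                    Optional<T> merged = conditionalAccumulator.apply(pending, incoming);
                    if (merged.isPresent()) {
                        pending = merged.get();
                    } else {
                        T finished = pending;
                        pending = incoming;  // incoming starts the next group
                        action.accept(finished);
                        return true;
                    }
                }
                if (hasPending) {            // source exhausted: flush the last group
                    hasPending = false;
                    action.accept(pending);
                    return true;
                }
                return false;
            }
        };
        return StreamSupport.stream(merging, false).onClose(stream::close);
    }

    // Demonstrates laziness: the source is infinite, yet limit(3) terminates.
    static List<String> firstThreeGroups() {
        Stream<String> labels = Stream.iterate(1, i -> i + 1).map(i -> "g" + (i / 3));
        return reducePartiallyLazy(labels,
                (a, b) -> a.equals(b) ? Optional.of(a) : Optional.<String>empty())
                .limit(3)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstThreeGroups()); // [g0, g1, g2]
    }
}
```

Unlike the GroupBy approach with flatMap, the spliterator knows when the source is exhausted, so no end marker is needed.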

PS. I just realized you wanted <T,U> conversion, and I wrote about <T,T> conversion. If you can first map from T to U, and then use the function above, then that's it (even if it's suboptimal).

If it's something more complex, the kind of condition for reducing/merging would need to be defined before proposing an API (e.g. Predicate<T>, BiPredicate<T,T>, BiPredicate<U,T>, or maybe even Predicate<List<T>>).

A bit like StreamEx, you could implement the Spliterator manually. For example,

collectByTwos(Stream.of(1, 2, 3, 4), (x, y) -> String.format("%d%d", x, y))

... returns a stream of "12", "34" using the code below:

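// Requires: java.util.Iterator, java.util.Spliterator, java.util.Spliterators,
// java.util.function.BiFunction, java.util.function.Consumer,
// java.util.stream.Stream, java.util.stream.StreamSupport
public static <X,Y> Stream<Y> collectByTwos(Stream<X> inStream, BiFunction<X,X,Y> mapping) {
    Spliterator<X> origSpliterator = inStream.spliterator();
    Iterator<X> origIterator = Spliterators.iterator(origSpliterator);

    boolean isParallel = inStream.isParallel();
    long origSize = origSpliterator.estimateSize();
    // Long.MAX_VALUE means "unknown or infinite"; do not overflow it.
    long newSizeEst = origSize == Long.MAX_VALUE ? Long.MAX_VALUE : (origSize + 1) / 2;
    // Keep only characteristics that still hold after pairing. SORTED, for example,
    // would make getComparator() throw, and DISTINCT/NONNULL are not guaranteed
    // for the mapped values.
    int characteristics = origSpliterator.characteristics()
            & (Spliterator.ORDERED | Spliterator.SIZED);

    Spliterators.AbstractSpliterator<Y> lCombinedSpliterator =
            new Spliterators.AbstractSpliterator<Y>(newSizeEst, characteristics) {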
        @Override
        public boolean tryAdvance(Consumer<? super Y> action) {
            if (! origIterator.hasNext()) {
                return false;
            }
            X lNext1 = origIterator.next();
            if (! origIterator.hasNext()) {
                throw new IllegalArgumentException("Trailing elements of the stream would be ignored.");
            }
            X lNext2 = origIterator.next();
            action.accept(mapping.apply(lNext1, lNext2));
            return true;
        }
    };
    return StreamSupport.stream(lCombinedSpliterator, isParallel)
            .onClose(inStream::close);
}

(I think this is likely incorrect for parallel streams.)

Building mostly on the StreamEx answer above by user_3380739, you can use groupRuns:

StreamEx.of("a1", "a2", "c1", "c2", "c3").groupRuns((t, u) -> t.charAt(0) == u.charAt(0))
.forEach(System.out::println);

// [a1, a2]
// [c1, c2, c3]
