How can I convert a Stream of Strings to Stream of String pairs?

I want to take a stream of strings and turn it into a stream of adjacent word pairs. For example:

I have: { "A", "Apple", "B", "Banana", "C", "Carrot" }

I want: { ("A", "Apple"), ("Apple", "B"), ("B", "Banana"), ("Banana", "C") }.

This is nearly the same as Zipping, as outlined at Zipping streams using JDK8 with lambda (java.util.stream.Streams.zip)

However, that produces: { (A, Apple), (B, Banana), (C, Carrot) }

The following code works, but is clearly the wrong way to do it (it relies on mutable static state, is not thread safe, and so on):

static String buffered = null;

static void output(String s) {
    String result = null;
    if (buffered != null) {
        result = buffered + "," + s;
    } else {
        result = null;
    }

    buffered = s;
    System.out.println(result);
}

// ***** 

Stream<String> testing = Stream.of("A", "Apple", "B", "Banana", "C", "Carrot");
testing.forEach(s -> {output(s);});

This should do what you want, based on @njzk2's comment of using the stream twice, skipping the first element in the second case. It uses the zip method that you link in your original question.

public static void main(String[] args) {
  List<String> input = Arrays.asList("A", "Apple", "B", "Banana", "C", "Carrot");
  List<List<String>> paired = zip(input.stream(),
                                  input.stream().skip(1),
                                  (a, b) -> Arrays.asList(a, b))
                              .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
  System.out.println(paired);
}

This outputs a List<List<String>> with contents:

[[A, Apple], [Apple, B], [B, Banana], [Banana, C], [C, Carrot]]
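
Note that zip is not part of the released JDK 8 API, so the snippet above needs a helper such as the one from the linked question. As a rough, sequential-only sketch (the class name ZipSketch and the method pairs are made up for illustration, and this is not necessarily the implementation from the linked answer), it could look like this:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.BiFunction;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class ZipSketch {
    // Hypothetical helper: combines the two streams position by position and
    // stops as soon as either one runs out of elements. Sequential only.
    static <A, B, C> Stream<C> zip(Stream<A> a, Stream<B> b, BiFunction<A, B, C> zipper) {
        Iterator<A> ia = a.iterator();
        Iterator<B> ib = b.iterator();
        Iterator<C> zipped = new Iterator<C>() {
            @Override
            public boolean hasNext() {
                return ia.hasNext() && ib.hasNext();
            }

            @Override
            public C next() {
                return zipper.apply(ia.next(), ib.next());
            }
        };
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(zipped, Spliterator.ORDERED), false);
    }

    // Zips the list with itself shifted by one element to get adjacent pairs.
    static List<List<String>> pairs(List<String> input) {
        return zip(input.stream(), input.stream().skip(1), (x, y) -> Arrays.asList(x, y))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(pairs(Arrays.asList("A", "Apple", "B", "Banana", "C", "Carrot")));
    }
}
```

Because the second stream is one element shorter, the zipped stream ends after the last complete pair.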

In the comments, you asked how to do this if you already have a Stream. Unfortunately, it's difficult, because Streams are not stateful, and there isn't really a concept of the "adjacent" element in the Stream. There is a good discussion on this here.

I can think of two ways to do it, but I don't think you're going to like either of them:

  1. Convert the Stream to a List, and then do my solution above. Ugly, but works as long as the Stream isn't infinite and performance doesn't matter very much.
  2. Use @TagirValeev's answer below, as long as you are using a StreamEx and not a Stream, and are willing to add a dependency on a third-party library.
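
For the first option, one possible sketch is to buffer the stream into a List and then pair elements by index, which avoids consuming a stream twice. The class name ListPairing and the method adjacentPairs are hypothetical names chosen for this example:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class ListPairing {
    // Buffers the whole stream, so this only works for finite streams.
    static <T> List<List<T>> adjacentPairs(Stream<T> stream) {
        List<T> buffered = stream.collect(Collectors.toList());
        // For fewer than two elements the range is empty and no pairs are produced.
        return IntStream.range(0, buffered.size() - 1)
                .mapToObj(i -> Arrays.asList(buffered.get(i), buffered.get(i + 1)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Stream<String> testing = Stream.of("A", "Apple", "B", "Banana", "C", "Carrot");
        System.out.println(adjacentPairs(testing));
    }
}
```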

Also relevant to this discussion is this question: Can I duplicate a Stream in Java 8? It's not good news for your problem, but it is worth reading and may have a solution that's more appealing to you.

If you:

  1. Don't like the idea of creating a list with all strings from your stream
  2. Don't want to use external libraries
  3. Like to get your hands dirty

Then you can create a method to group elements from a stream using the Java 8 low-level stream building blocks StreamSupport and Spliterator:

class StreamUtils {
    public static<T> Stream<List<T>> sliding(int size, Stream<T> stream) {
        return sliding(size, 1, stream);
    }

    public static<T> Stream<List<T>> sliding(int size, int step, Stream<T> stream) {
        Spliterator<T> spliterator = stream.spliterator();
        long estimateSize;

        if (!spliterator.hasCharacteristics(Spliterator.SIZED)) {
            estimateSize = Long.MAX_VALUE;
        } else if (size > spliterator.estimateSize()) {
            estimateSize = 0;
        } else {
            estimateSize = (spliterator.estimateSize() - size) / step + 1;
        }

        return StreamSupport.stream(
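                // Do not inherit SORTED: the windows are lists and are not ordered by the source's comparator
                new Spliterators.AbstractSpliterator<List<T>>(estimateSize,
                        spliterator.characteristics() & ~Spliterator.SORTED) {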
                    List<T> buffer = new ArrayList<>(size);

                    @Override
                    public boolean tryAdvance(Consumer<? super List<T>> consumer) {
                        while (buffer.size() < size && spliterator.tryAdvance(buffer::add)) {
                            // Nothing to do
                        }

                        if (buffer.size() == size) {
                            List<T> keep = new ArrayList<>(buffer.subList(step, size));
                            consumer.accept(buffer);
                            buffer = keep;
                            return true;
                        }
                        return false;
                    }
                }, stream.isParallel());
    }
}

The method and parameter names were inspired by their Scala counterparts.

Let's test it:

Stream<String> testing = Stream.of("A", "Apple", "B", "Banana", "C", "Carrot");
System.out.println(StreamUtils.sliding(2, testing).collect(Collectors.toList()));

[[A, Apple], [Apple, B], [B, Banana], [Banana, C], [C, Carrot]]

To avoid repeating elements, pass a step of 2:

Stream<String> testing = Stream.of("A", "Apple", "B", "Banana", "C", "Carrot");
System.out.println(StreamUtils.sliding(2, 2, testing).collect(Collectors.toList()));

[[A, Apple], [B, Banana], [C, Carrot]]

And now with an infinite Stream:

StreamUtils.sliding(5, Stream.iterate(0, n -> n + 1))
        .limit(5)
        .forEach(System.out::println);

[0, 1, 2, 3, 4]
[1, 2, 3, 4, 5]
[2, 3, 4, 5, 6]
[3, 4, 5, 6, 7]
[4, 5, 6, 7, 8]

You can use my StreamEx library, which enhances the standard Stream API. Its pairMap method does exactly what you need:

StreamEx.of("A", "Apple", "B", "Banana", "C", "Carrot")
        .pairMap((a, b) -> a+","+b)
        .forEach(System.out::println);

Output:

A,Apple
Apple,B
B,Banana
Banana,C
C,Carrot

The pairMap argument is the function which converts the pair of adjacent elements to something which is suitable to your needs. If you have a Pair class in your project, you can use .pairMap(Pair::new) to get the stream of pairs. If you want to create a stream of two-element lists, you can use:

List<List<String>> list = StreamEx.of("A", "Apple", "B", "Banana", "C", "Carrot")
                                    .pairMap((a, b) -> StreamEx.of(a, b).toList())
                                    .toList();
System.out.println(list); // [[A, Apple], [Apple, B], [B, Banana], [Banana, C], [C, Carrot]]

This works with any element source (you can use StreamEx.of(collection), StreamEx.of(stream) and so on), works correctly if you have more stream operations before pairMap, and is very friendly to parallel processing (unlike solutions which involve stream zipping).

If your input is a List with fast random access and you actually want a List<List<String>> as the result, there's a shorter and somewhat different way to achieve this in my library using ofSubLists:

List<String> input = Arrays.asList("A", "Apple", "B", "Banana", "C", "Carrot");
List<List<String>> list = StreamEx.ofSubLists(input, 2, 1).toList();
System.out.println(list); // [[A, Apple], [Apple, B], [B, Banana], [Banana, C], [C, Carrot]]

Behind the scenes, input.subList(i, i+2) is called for each input list position, so your data is not copied into new lists; instead, sublists are created that refer to the original list.
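
The same sublist-view idea can be sketched with the plain JDK. This is only an illustration of the technique, not StreamEx's actual implementation, and the class name SubListViews and the method windows are invented for the example:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SubListViews {
    // Each window is a view created by subList, so no elements are copied.
    static <T> List<List<T>> windows(List<T> input) {
        return IntStream.range(0, input.size() - 1)
                .mapToObj(i -> input.subList(i, i + 2))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("A", "Apple", "B");
        List<List<String>> pairs = windows(input);
        System.out.println(pairs); // [[A, Apple], [Apple, B]]

        // A non-structural change to the original list is visible through the views.
        input.set(1, "Avocado");
        System.out.println(pairs); // [[A, Avocado], [Avocado, B]]
    }
}
```

The flip side of not copying is that the result stays tied to the original list: later changes to it show through, and structural changes (adding or removing elements) make the views unusable.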

Here's a minimal amount of code that creates a List<List<String>> of the pairs. It relies on a side effect inside reduce, so it is only suitable for a sequential stream:

List<List<String>> pairs = new LinkedList<>();
testing.reduce((a, b)-> {pairs.add(Arrays.asList(a,b)); return b;});
