Split list of words using streams in Java

I have a method that takes a number of lists, each containing lines of books. I combine them into a stream and then iterate over the lines to split each one on all non-letters ("\\P{L}").

Is there a way to avoid the for-each loop and process this within a stream?

private List<String> getWordList(List<String>... lists) {
        List<String> wordList = new ArrayList<>();

        Stream<String> combinedStream = Stream.of(lists)
                .flatMap(Collection::stream);
        List<String> combinedLists = combinedStream.collect(Collectors.toList());

        for (String line: combinedLists) {
            wordList.addAll(Arrays.asList(line.split("\\P{L}")));
        }

        return wordList;
}

Since you already have a stream, you can simply flatMap further and return the result:

return combinedStream
        .flatMap(str -> Arrays.stream(str.split("\\P{L}")))
        .collect(Collectors.toList());

To put it all together:

private List<String> getWordList(List<String>... lists) {
    return Stream.of(lists)
        .flatMap(Collection::stream)
        .flatMap(str -> Arrays.stream(str.split("\\P{L}")))
        .collect(Collectors.toList());
}
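
As a rough, self-contained sketch of how the method above behaves on real input: splitting on a single non-letter ("\\P{L}") yields an empty string wherever two delimiters are adjacent (for example ", "). The class name WordListDemo, the sample lines, and the extra filter step are assumptions added for illustration and are not part of the original answer.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class WordListDemo {

    // Same pipeline as the answer, made static so it can be called from main.
    @SafeVarargs
    static List<String> getWordList(List<String>... lists) {
        return Stream.of(lists)
                .flatMap(Collection::stream)                          // lines of all books
                .flatMap(line -> Arrays.stream(line.split("\\P{L}"))) // tokens of each line
                .filter(word -> !word.isEmpty())                      // assumption: drop empty tokens
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical sample data
        List<String> book1 = Arrays.asList("Hello, world!", "Second line");
        List<String> book2 = Arrays.asList("Third-line here");
        System.out.println(getWordList(book1, book2));
        // prints [Hello, world, Second, line, Third, line, here]
    }
}
```

Without the filter step, "Hello, world!" would produce the tokens "Hello", "" and "world", because the comma and the space are each treated as a separate delimiter.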

You don't need to introduce so many variables:

private List<String> getWordList(List<String>... lists) {
    // assumes: import static java.util.stream.Collectors.toList;
    return Stream.of(lists)                                         // Stream<List<String>>
                 .flatMap(Collection::stream)                       // Stream<String> (lines)
                 .flatMap(Pattern.compile("\\P{L}")::splitAsStream) // Stream<String> (words)
                 .collect(toList());                                // List<String>
}

As pointed out by Holger, .flatMap(Pattern.compile("\\P{L}")::splitAsStream) should be favored over .flatMap(s -> Arrays.stream(s.split("\\P{L}"))). The bound method reference evaluates Pattern.compile only once, so it avoids both the intermediate array allocation and the pattern compilation that would otherwise be performed for each element of the stream.
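
One possible way to make the "compile once" idea explicit is to keep the pattern in a constant. This is only a sketch: the class name PrecompiledSplitDemo, the constant NON_LETTERS, the "+" quantifier (which treats a run of non-letters as one delimiter), and the filter are assumptions for illustration, not part of the answer above.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class PrecompiledSplitDemo {

    // Compiled once for the whole class. The "+" is an assumption added here:
    // it makes a run of non-letters count as a single delimiter.
    private static final Pattern NON_LETTERS = Pattern.compile("\\P{L}+");

    @SafeVarargs
    static List<String> getWordList(List<String>... lists) {
        return Stream.of(lists)
                .flatMap(Collection::stream)
                .flatMap(NON_LETTERS::splitAsStream) // no intermediate String[] per line
                .filter(word -> !word.isEmpty())     // a line starting with a delimiter yields a leading ""
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical sample data
        List<String> book1 = Arrays.asList("  Hello,   world!", "");
        List<String> book2 = Arrays.asList("a1b");
        System.out.println(getWordList(book1, book2)); // prints [Hello, world, a, b]
    }
}
```

The filter is still needed even with "+", since a line that begins with non-letters produces an empty first token.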

You can combine all the lists into one stream and use flatMap to get the result:

private List<String> getWordList(List<String>... lists) {
    return Stream.of(lists)
        .flatMap(Collection::stream)
        .flatMap(str -> Arrays.stream(str.split("\\P{L}")))
        .collect(Collectors.toList());
}
