How to filter for the maximum occurrence of objects with the same ID using Java 8 Streams

Question

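I need to find the newsId that has received the most comments.

I created a News class with its fields, a constructor, and getters and setters.

I created a Main class to write the logic in Java 8 using streams.

I am stuck on implementing the Predicate interface to filter for the newsId that has the highest count in the list of News objects.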

public class News {
    int newsId;
    String postByUser;
    String commentByUser;
    String comment;

    public News(int newsId, String postByUser, String commentByUser, String comment) {
        this.newsId = newsId;
        this.postByUser = postByUser;
        this.commentByUser = commentByUser;
        this.comment = comment;
    }

    public int getNewsId() {
        return newsId;
    }

    public void setNewsId(int newsId) {
        this.newsId = newsId;
    }

    public String getPostByUser() {
        return postByUser;
    }

    public void setPostByUser(String postByUser) {
        this.postByUser = postByUser;
    }

    public String getCommentByUser() {
        return commentByUser;
    }

    public void setCommentByUser(String commentByUser) {
        this.commentByUser = commentByUser;
    }

    public String getComment() {
        return comment;
    }

    public void setComment(String comment) {
        this.comment = comment;
    }
}

// In a separate file, e.g. Main.java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class Main {

    static List<News> news = Arrays.asList(
        new News(1, "fb_Userpost", "fb_Usercomment", "comment1"),
        new News(2, "insta_userpost", "insta_usercomment", "comment2"),
        new News(1, "whatsapp_userpost", "whatsapp_usercomment", "comment3"),
        new News(1, "whatsapp_userpost", "whatsapp_usercomment", "comment3"),
        new News(3, "whatsapp_userpost", "whatsapp_usercomment", "comment3")
    );

    public static void main(String[] args) {
        // Placeholder only: this is the predicate I don't know how to write.
        //   Predicate<News> pr = s -> s
        Predicate<News> pr = s -> true;
        news.stream()
            .filter(pr)
            .collect(Collectors.toList())
            .forEach(s -> System.out.println(s.getNewsId()));
    }
}

Answer 1

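Result - a single newsId

"I need to find the newsId that has received the most comments."

You can't achieve this with filter() alone, and there is no need for a filter() operation here at all.

To find the most frequent news item, you need to accumulate data. That is the job of collect(), not filter().

The most obvious choice is to build an intermediate Map that holds the count for each newsId. For that, you can combine the collectors groupingBy() and counting().

Then you can create a stream over the map's entries and use max() as the terminal operation to pick the entry with the highest value.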

public static void main(String args[]) {
    
    news.stream()
        .collect(Collectors.groupingBy( // creating an intermediate Map<Integer, Long>
            News::getNewsId,            // map's key
            Collectors.counting()       // value
        ))
        .entrySet().stream()               // creating a stream over the map's entries
        .max(Map.Entry.comparingByValue()) // picking the entry with the highest value -> result: Optional<Map.Entry<Integer, Long>>
        .map(Map.Entry::getKey)            // transforming the optional result Optional<Integer> 
        .ifPresent(System.out::println);   // printing the result if optional is not empty
}

With your sample data, this code prints 1.
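To make the intermediate step easier to picture, here is a minimal sketch that prints only the count map. It works on a plain list of ids rather than on News objects, and the class name CommentCounts and the method countById are made up for this illustration. A TreeMap is used as the map factory purely so that the printed order is predictable.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class CommentCounts {

    // Hypothetical helper: counts how often each id occurs.
    // TreeMap keeps the keys sorted, which makes the output easy to read.
    static Map<Integer, Long> countById(List<Integer> newsIds) {
        return newsIds.stream()
            .collect(Collectors.groupingBy(id -> id, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // The newsId values from the question's sample data, in order.
        List<Integer> newsIds = Arrays.asList(1, 2, 1, 1, 3);
        System.out.println(countById(newsIds)); // prints {1=3, 2=1, 3=1}
    }
}
```

The max() step in the answer then simply picks the entry 1=3 from this map.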

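Result - a list of the most frequent newsIds

To handle the case where several newsIds share the same highest count, you can build a custom collector.

The initial idea is the same as described above, but instead of the max() operation we now apply collect() to the stream of map entries and pass a custom collector as the argument.

To create a custom collector, we can use the static method Collector.of().

The logic behind the custom collector shown below is as follows:

supplier - intermediate results (map entries) are stored in a Queue.
accumulator - if the next stream element has the same frequency count (the value of the map entry) as the first element in the queue, or if the queue is empty, it is added to the queue. If its count is lower, it is ignored. If its count is higher, the queue is cleared and the element is added.
combiner - the two queues produced when the stream runs in parallel are merged using almost the same logic as the accumulator above.
finisher - this function turns the queue of map entries into a list of newsIds.

Note that such an implementation needs only a single iteration over the set of entries, and this performance advantage is what justifies its complexity.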

public static void main(String args[]) {
    
    news.stream()
        .collect(Collectors.groupingBy(
            News::getNewsId,
            Collectors.counting()
        ))
        .entrySet().stream()
        .collect(Collector.of(
            ArrayDeque::new,
            (Queue<Map.Entry<Integer, Long>> queue, Map.Entry<Integer, Long> entry) -> {
                if (queue.isEmpty() || queue.element().getValue().equals(entry.getValue())) {
                    queue.add(entry);
                } else if (queue.element().getValue() < entry.getValue()) {
                    queue.clear();
                    queue.add(entry);
                }
            },
            (left, right) -> {
                if (left.isEmpty() || !right.isEmpty()
                    && right.element().getValue() > left.element().getValue())
                    return right;
                if (right.isEmpty() || left.element().getValue() > right.element().getValue())
                    return left;
                
                left.addAll(right);
                return left;
            },
            queue -> queue.stream().map(Map.Entry::getKey).collect(Collectors.toList())
        ))
        .forEach(System.out::println);
}

static List<News> news = Arrays.asList( // News `1` & `2` are the most frequent
    new News(1, "fb_Userpost", "fb_Usercomment", "comment1"),
    new News(2, "insta_userpost", "insta_usercomment", "comment2"),
    new News(2, "insta_userpost", "insta_usercomment", "comment2"),
    new News(2, "insta_userpost", "insta_usercomment", "comment2"),
    new News(1, "whatsapp_userpost", "whatsapp_usercomment", "comment3"),
    new News(1, "whatsapp_userpost", "whatsapp_usercomment", "comment3"),
    new News(3, "whatsapp_userpost", "whatsapp_usercomment", "comment3")
);

Output:

1
2
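If the same logic is needed in more than one place, it could be wrapped in a reusable factory method. The sketch below is one possible way to do that, not part of the JDK: keysWithMaxValue(), accumulate() and mostFrequentIds() are hypothetical names, and the combiner here simply feeds the right-hand queue through the accumulator instead of comparing the two heads directly. It uses plain ids and a TreeMap so the result order is predictable.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class MaxKeysCollector {

    // Hypothetical helper: collects the keys of all entries that share the highest value.
    static <K> Collector<Map.Entry<K, Long>, Deque<Map.Entry<K, Long>>, List<K>> keysWithMaxValue() {
        return Collector.of(
            ArrayDeque::new,                 // supplier: queue of the current "best" entries
            MaxKeysCollector::accumulate,    // accumulator: see below
            (left, right) -> {               // combiner: replay the right queue into the left one
                right.forEach(entry -> accumulate(left, entry));
                return left;
            },
            queue -> queue.stream().map(Map.Entry::getKey).collect(Collectors.toList())
        );
    }

    // All entries in the queue always have the same value, so comparing with the head is enough.
    static <K> void accumulate(Deque<Map.Entry<K, Long>> queue, Map.Entry<K, Long> entry) {
        int cmp = queue.isEmpty() ? 0 : entry.getValue().compareTo(queue.element().getValue());
        if (cmp > 0) {
            queue.clear();    // a strictly higher count replaces everything collected so far
        }
        if (cmp >= 0) {
            queue.add(entry); // first entry, or a count equal to the current maximum
        }
    }

    static List<Integer> mostFrequentIds(List<Integer> newsIds) {
        Map<Integer, Long> counts = newsIds.stream()
            .collect(Collectors.groupingBy(id -> id, TreeMap::new, Collectors.counting()));
        return counts.entrySet().stream().collect(keysWithMaxValue());
    }

    public static void main(String[] args) {
        // The newsId values from the second sample list above.
        System.out.println(mostFrequentIds(Arrays.asList(1, 2, 2, 2, 1, 1, 3))); // prints [1, 2]
    }
}
```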

Answer 2

First, count how many times each newsId is referenced. Then find the maximum count. Finally, keep only the identifiers that have that maximum count.

Map<Integer, Long> countByNewsId = news.stream()
    .map(News::getNewsId)
    .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
Long max = countByNewsId.values().stream().max(Long::compareTo).orElse(null);
countByNewsId.values().removeIf(Predicate.isEqual(max).negate());
Set<Integer> maxCommentedNewsIds = countByNewsId.keySet();
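Since the question asked specifically about a Predicate, the following sketch shows one way the counts could feed into filter() once they are known. It is an illustration under assumptions: the nested News class is a trimmed-down stand-in for the question's class, and MostCommentedFilter and mostCommented() are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class MostCommentedFilter {

    // Trimmed-down stand-in for the question's News class (only the fields used here).
    static class News {
        final int newsId;
        final String comment;

        News(int newsId, String comment) {
            this.newsId = newsId;
            this.comment = comment;
        }

        int getNewsId() { return newsId; }
        String getComment() { return comment; }
    }

    // Keeps only the News objects whose newsId has the highest comment count.
    static List<News> mostCommented(List<News> news) {
        Map<Integer, Long> countByNewsId = news.stream()
            .collect(Collectors.groupingBy(News::getNewsId, Collectors.counting()));
        if (countByNewsId.isEmpty()) {
            return Collections.emptyList();
        }
        long max = Collections.max(countByNewsId.values());
        // The predicate can only be written after the counts have been accumulated.
        Predicate<News> hasMaxCount = n -> countByNewsId.get(n.getNewsId()) == max;
        return news.stream().filter(hasMaxCount).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<News> news = Arrays.asList(
            new News(1, "comment1"), new News(2, "comment2"),
            new News(1, "comment3"), new News(1, "comment3"), new News(3, "comment3"));
        // Prints the three entries that belong to newsId 1.
        mostCommented(news).forEach(n -> System.out.println(n.getNewsId() + " " + n.getComment()));
    }
}
```

The filter() call here needs a second pass over the list, which is the price of getting the News objects back rather than just their ids.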

Answer 3

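Here is another solution that uses only a single pass over the stream. It may be slightly slower than Alexander's solution (I have not benchmarked it), but it is much shorter. Note that Collectors.teeing() is available only from Java 12 onwards.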

import static java.util.function.Function.identity;
import static java.util.stream.Collectors.*;

news.stream()
        .map(News::getNewsId)
        .collect(
                teeing(
                        groupingBy(identity(), counting()),
                        collectingAndThen(groupingBy(identity(), counting()), map -> Collections.max(map.values())),
                        (frequencyMap, max) -> {
                            frequencyMap.values().removeIf(v -> v != max.longValue());
                            return frequencyMap.keySet();
                        }
                )
        ).forEach(System.out::println);
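For a codebase that has to stay on Java 8, where teeing() cannot be used, a plain loop may serve as a rough substitute. This is a different technique from the one in this answer: it counts with Map.merge() and tracks the running maximum in the same loop. The names SinglePassMax and mostFrequentIds are hypothetical, and the input is a plain list of ids.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class SinglePassMax {

    // Counts the ids and tracks the highest count in one loop,
    // then keeps the ids whose count equals that maximum.
    static Set<Integer> mostFrequentIds(List<Integer> newsIds) {
        Map<Integer, Long> counts = new HashMap<>();
        long max = 0;
        for (Integer id : newsIds) {
            long updated = counts.merge(id, 1L, Long::sum); // increment this id's count
            max = Math.max(max, updated);                   // counts only grow, so a running max is enough
        }
        final long highest = max;
        Set<Integer> result = new TreeSet<>(); // sorted for predictable output
        counts.forEach((id, count) -> {
            if (count == highest) {
                result.add(id);
            }
        });
        return result;
    }

    public static void main(String[] args) {
        System.out.println(mostFrequentIds(Arrays.asList(1, 2, 2, 2, 1, 1, 3))); // prints [1, 2]
    }
}
```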

3 answers

Answer 1
2 2022-06-14 17:49:52

Answer 2
0 2022-06-14 21:37:47

Answer 3
0 2022-06-15 09:16:29
