Java: reduce a collection of strings to a map of occurrences

Consider a list such as id1_f, id2_d, id3_f, id1_g. How can I use a stream to reduce it to a Map<String, Integer> of statistics like this:

id1 2
id2 1
id3 1

Note: the key is the part before the _ . Can the reduce function help here?

This will get the job done:

Map<String, Long> map = Stream.of("id1_f", "id2_d", "id3_f", "id1_g")
  .collect(
    Collectors.groupingBy(v -> v.split("_")[0],
    Collectors.counting())
  );
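Note that Collectors.counting() produces Long values, while the question asks for a Map<String, Integer>. A minimal sketch of one way to get Integer counts is to swap in Collectors.summingInt; the class and method names below (PrefixCounts, countByPrefix) are made up for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PrefixCounts {

    // Hypothetical helper: groups by the part before the first "_"
    // and adds 1 per element, so the values are Integer rather than Long.
    static Map<String, Integer> countByPrefix(List<String> ids) {
        return ids.stream()
                .collect(Collectors.groupingBy(
                        s -> s.split("_")[0],
                        Collectors.summingInt(s -> 1)));
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("id1_f", "id2_d", "id3_f", "id1_g");
        // The result is a plain HashMap, so key order is not guaranteed.
        System.out.println(countByPrefix(ids));
    }
}
```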

You can also use the toMap collector:

myList.stream()
      .collect(Collectors.toMap((String s) -> s.split("_")[0],
                   (String s) -> 1, Math::addExact));

If you care about the order of the elements, collect the result into a LinkedHashMap:

myList.stream()
      .collect(Collectors.toMap((String s) -> s.split("_")[0], 
                   (String s) -> 1, Math::addExact, 
                     LinkedHashMap::new));
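To make the ordering point concrete, here is a small self-contained sketch (the class and method names are illustrative, not part of the answer) that collects into a LinkedHashMap, so keys come out in the order in which they are first encountered:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class OrderedPrefixCounts {

    // Hypothetical helper: same toMap call as above, wrapped in a method.
    static Map<String, Integer> countInEncounterOrder(List<String> ids) {
        return ids.stream()
                .collect(Collectors.toMap(
                        (String s) -> s.split("_")[0], // key: part before "_"
                        (String s) -> 1,               // each element counts once
                        Math::addExact,                // add counts on key collision
                        LinkedHashMap::new));          // keep first-encounter order
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("id3_f", "id1_f", "id2_d", "id1_g");
        System.out.println(countInEncounterOrder(ids)); // prints {id3=1, id1=2, id2=1}
    }
}
```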

A non-stream approach using Map::merge:

Map<String, Integer> result = new LinkedHashMap<>();
myList.forEach(s -> result.merge(s.split("_")[0], 1, Math::addExact));

Since you want to count elements, I'd suggest using Guava's Multiset interface, which is dedicated to this purpose.

The definition of Multiset from its JavaDoc:

A collection that supports order-independent equality, like Set, but may have duplicate elements. A multiset is also sometimes called a bag.

Elements of a multiset that are equal to one another are referred to as occurrences of the same single element. The total number of occurrences of an element in a multiset is called the count of that element.

Here are two ways to use it:

1) Without the Stream API:

ImmutableMultiset<String> multiset2 = ImmutableMultiset.copyOf(Lists.transform(
        list, str -> StringUtils.substringBefore(str, "_")
));

2) Using the Stream API:

ImmutableMultiset<String> multiset = list.stream()
        .map(str -> StringUtils.substringBefore(str, "_"))
        .collect(ImmutableMultiset.toImmutableMultiset());

Note that instead of using something like s.split("_")[0], I used Apache Commons Lang's StringUtils.substringBefore, which I find much more readable.
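If adding Apache Commons Lang just for this is not an option, a plain-JDK helper can play a similar role. The sketch below is a hypothetical prefixBefore method (not part of any library) built on indexOf and substring. Like substringBefore, it returns the whole string when the separator is absent. It also sidesteps a corner case of split: "_".split("_") yields an empty array, so indexing [0] on it would throw.

```java
class Prefixes {

    // Hypothetical helper: returns the part of s before the first
    // occurrence of sep, or s itself if sep does not occur.
    static String prefixBefore(String s, char sep) {
        int i = s.indexOf(sep);
        return i < 0 ? s : s.substring(0, i);
    }

    public static void main(String[] args) {
        System.out.println(prefixBefore("id1_f", '_'));        // id1
        System.out.println(prefixBefore("id1", '_'));          // id1
        System.out.println(prefixBefore("_", '_').isEmpty());  // true
    }
}
```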

You can retrieve the count of an element using the Multiset.count() method.

