Using Java streams to count each character in a string

Question

We have this string: String input1 = "abbccd";

Expected output: ab2c2d (note: if a character's count is 1, the count should not appear in the output).

The following code prints a1, b2, c2, d1 on separate lines. Any suggestions to fix and improve it?

input1.chars()
      .mapToObj(s -> Character.toLowerCase(Character.valueOf((char) s)))
      .collect(Collectors.groupingBy(Function.identity(), LinkedHashMap::new, Collectors.counting()))
      .entrySet().stream()
      .forEach(n -> {System.out.println(n.getKey()+""+n.getValue());});

Answer 1

Make the last forEach a map instead.

Instead of always appending n.getValue(), append it only when n.getValue() is not 1.

Then collect by joining.

At that point you will have a string you can print.

So, assuming we don't want to change your first part:

"abbccd".chars()
        .mapToObj(s -> Character.toLowerCase((char)s)) // notice here Character.valueOf was redundant, we're already dealing with a char
        .collect(Collectors.groupingBy(Function.identity(), LinkedHashMap::new, Collectors.counting()))
        .entrySet().stream()
        .map(n -> n.getKey()+""+(n.getValue() == 1 ? "" : n.getValue()))
        .collect(Collectors.joining());

Results in ab2c2d.
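
For anyone who wants to run this directly, here is a minimal self-contained sketch of the same pipeline. The class name CharCounts and the method name compress are made up for illustration and are not part of the question or the answer above.

```java
import java.util.LinkedHashMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical wrapper class; the names are illustrative only.
class CharCounts {

    // Counts each (lower-cased) char in first-seen order and appends
    // the count only when it is greater than 1.
    static String compress(String input) {
        return input.chars()
                .mapToObj(c -> Character.toLowerCase((char) c))
                .collect(Collectors.groupingBy(
                        Function.identity(),   // group by the character itself
                        LinkedHashMap::new,    // keep first-seen order
                        Collectors.counting()))
                .entrySet().stream()
                .map(e -> e.getKey() + (e.getValue() == 1 ? "" : String.valueOf(e.getValue())))
                .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        System.out.println(compress("abbccd")); // ab2c2d
    }
}
```

Note that the counts are totals over the whole string, not lengths of consecutive runs, so "Abba" becomes a2b2.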

Answer 2

Unfortunately, the other two Answers both fail with most characters.

Avoid legacy type `char`

The char type has been essentially broken since Java 2 and legacy since Java 5. As a 16-bit value, a char is physically incapable of representing most of the 144,697 characters defined in Unicode.
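
To make the 16-bit limit concrete, here is a small sketch (the class and method names are my own, for illustration only) comparing the number of char values with the number of code points in a string. The face-with-medical-mask emoji is U+1F637, which lies outside the 16-bit range and is therefore stored as two char values (a surrogate pair).

```java
// Hypothetical demo class; names are illustrative only.
class SurrogateDemo {

    // Number of 16-bit char units in the string.
    static int charUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1F637 built from its code point to avoid source-encoding issues.
        String mask = new String(Character.toChars(0x1F637));
        System.out.println(charUnits(mask));   // 2
        System.out.println(codePoints(mask));  // 1
    }
}
```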

See how one Answer's code breaks:

String input = "😷😷abbccd";
String output =
        input
                .chars()
                .mapToObj( s -> Character.toLowerCase( ( char ) s ) ) // notice here Character.valueOf was redundant, we're already dealing with a char
                .collect( Collectors.groupingBy( Function.identity() , LinkedHashMap :: new , Collectors.counting() ) )
                .entrySet().stream()
                .map( n -> n.getKey() + "" + ( n.getValue() == 1 ? "" : n.getValue() ) )
                .collect( Collectors.joining() );

System.out.println( "output = " + output );

output = ?2?2ab2c2d

Code point

Use code point integer numbers instead when working with individual characters. A code point is the number permanently assigned to each character in Unicode. They range from zero to just over a million.

You will find code point related methods scattered around the Java classes. These include String, StringBuilder, Character, etc.

The String#codePoints method returns an IntStream of code points, one code point number for each character in the string.
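
As a quick illustration of String#codePoints, this sketch lists the code points of a string in the conventional U+XXXX notation. The class name CodePointDemo and the method name describe are hypothetical, chosen only for this example.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; names are illustrative only.
class CodePointDemo {

    // Formats every code point of the input as "U+XXXX".
    static List<String> describe(String input) {
        return input.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // U+1F637 (face with medical mask) followed by a plain 'a'.
        String input = new String(Character.toChars(0x1F637)) + "a";
        System.out.println(describe(input)); // [U+1F637, U+0061]
    }
}
```

The emoji shows up as a single element, whereas String#chars would have produced two separate surrogate values for it.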

Here is a reworked version of the clever code from the Answer by Federico klez Culloca. Kudos to him, as I would not have come up with that approach.

String input = "😷😷abbccd";
String output =
        input
                .codePoints()
                .map( Character :: toLowerCase )
                .mapToObj( codePoint -> Character.toString( codePoint ) )
                .collect( Collectors.groupingBy( Function.identity() , LinkedHashMap :: new , Collectors.counting() ) )
                .entrySet().stream()
                .map( n -> n.getKey() + "" + ( n.getValue() == 1 ? "" : n.getValue() ) )
                .collect( Collectors.joining() );
System.out.println( "output = " + output );

output = 😷2ab2c2d

Answer 3

For the sake of variety, a different approach using regex:

String input1 = "abbccd";

String output = Pattern.compile("(?<=(.))(?!\\1)")
                       .splitAsStream(input1)
                       .map(str -> str.length() == 1 ? str : "" + str.charAt(0) + str.length())
                       .collect(Collectors.joining());

System.out.println(output);
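
One thing worth knowing about this variant: the pattern splits wherever the previous character differs from the next one, so it measures consecutive runs rather than total occurrences. For the question's input the two coincide, but they can differ for other inputs. A minimal sketch, with a made-up class name RunLengthDemo and method name runLengths, and using only plain ASCII input:

```java
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical demo class; names are illustrative only.
class RunLengthDemo {

    // Same pattern as above: split at every boundary between two different characters,
    // then emit each run as its character followed by the run length (if longer than 1).
    static String runLengths(String input) {
        return Pattern.compile("(?<=(.))(?!\\1)")
                .splitAsStream(input)
                .map(run -> run.length() == 1 ? run : "" + run.charAt(0) + run.length())
                .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        System.out.println(runLengths("abbccd")); // ab2c2d
        System.out.println(runLengths("aabaa"));  // a2ba2
    }
}
```

For "aabaa" the grouping-based approaches above would instead report a4b, so pick whichever semantics the task actually needs.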
