
Use Java stream to find count of each character in a string

We have this string: String input1 = "abbccd";

Expected output: ab2c2d (note: if count=1, it shouldn't show in the output).

The following code outputs a1, b2, c2, d1 on separate lines. Any suggestions to fix and improve it?

input1.chars()
      .mapToObj(s -> Character.toLowerCase(Character.valueOf((char) s)))
      .collect(Collectors.groupingBy(Function.identity(), LinkedHashMap::new, Collectors.counting()))
      .entrySet().stream()
      .forEach(n -> {System.out.println(n.getKey()+""+n.getValue());});

Make the last forEach a map instead.

Instead of always appending n.getValue(), append it only if it is not 1.

Then collect by joining.

At that point you will have a string you can print.

So, assuming we don't want to change your first part:

"abbccd".chars()
        .mapToObj(s -> Character.toLowerCase((char)s)) // notice here Character.valueOf was redundant, we're already dealing with a char
        .collect(Collectors.groupingBy(Function.identity(), LinkedHashMap::new, Collectors.counting()))
        .entrySet().stream()
        .map(n -> n.getKey()+""+(n.getValue() == 1 ? "" : n.getValue()))
        .collect(Collectors.joining());

Results in ab2c2d.

Unfortunately, the other two Answers both fail with most characters.

Avoid legacy type char

The char type has been essentially broken since Java 2 and legacy since Java 5. As a 16-bit value, char is physically incapable of representing most of the 144,697 characters defined in Unicode.
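To make the limitation concrete, here is a minimal sketch (the class name CharLimitDemo and its two helper methods are invented for illustration, they are not from any Answer). It compares the number of 16-bit char units with the number of code points for 😷 (U+1F637), written as an escaped surrogate pair so the result does not depend on source-file encoding.

```java
// Hypothetical demo class; the names are illustrative only.
class CharLimitDemo {

    // Number of 16-bit char units in the text (what String#length reports).
    static int charUnits(String text) {
        return text.length();
    }

    // Number of Unicode code points in the text.
    static int codePoints(String text) {
        return text.codePointCount(0, text.length());
    }

    public static void main(String[] args) {
        String mask = "\uD83D\uDE37"; // the single character U+1F637
        System.out.println(charUnits(mask));                            // 2
        System.out.println(codePoints(mask));                           // 1
        System.out.println(Character.isHighSurrogate(mask.charAt(0)));  // true
    }
}
```

Any code that walks such a string char by char sees two halves of a surrogate pair rather than one character.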

See one Answer's code break:

String input = "😷😷abbccd";
String output =
        input
                .chars()
                .mapToObj( s -> Character.toLowerCase( ( char ) s ) ) // notice here Character.valueOf was redundant, we're already dealing with a char
                .collect( Collectors.groupingBy( Function.identity() , LinkedHashMap :: new , Collectors.counting() ) )
                .entrySet().stream()
                .map( n -> n.getKey() + "" + ( n.getValue() == 1 ? "" : n.getValue() ) )
                .collect( Collectors.joining() );

System.out.println( "output = " + output );

output = ?2?2ab2c2d

Code point

Use code point integer numbers instead when working with individual characters. A code point is the number permanently assigned to each character in Unicode. They range from zero to just over a million.

You will find code point related methods scattered around the Java classes. These include String, StringBuilder, Character, etc.

The String#codePoints method returns an IntStream of code points, the code point number for each character in the string.
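As a rough illustration of a few of those methods working together, here is a small sketch. The class CodePointDemo and its method names are hypothetical; the calls it relies on (String#codePoints, Character.toLowerCase(int), StringBuilder#appendCodePoint) are standard JDK methods.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are illustrative only.
class CodePointDemo {

    // Lists each code point of the text in the conventional U+XXXX notation.
    static List<String> hexCodePoints(String text) {
        return text.codePoints()
                   .mapToObj(cp -> String.format("U+%04X", cp))
                   .collect(Collectors.toList());
    }

    // Rebuilds a lower-cased copy of the text one code point at a time.
    static String lowerCaseByCodePoint(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints()
            .map(Character::toLowerCase)   // the int overload, safe beyond 16 bits
            .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "\uD83D\uDE37Ab"; // U+1F637 followed by "Ab"
        System.out.println(hexCodePoints(input));        // [U+1F637, U+0041, U+0062]
        System.out.println(lowerCaseByCodePoint(input)); // same text with "ab" lower-cased
    }
}
```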

Here is a re-worked version of the clever code from the Answer by Federico klez Culloca. Kudos to him, as I would not have come up with that approach.

String input = "😷😷abbccd";
String output =
        input
                .codePoints()
                .map( Character :: toLowerCase )
                .mapToObj( codePoint -> Character.toString( codePoint ) )
                .collect( Collectors.groupingBy( Function.identity() , LinkedHashMap :: new , Collectors.counting() ) )
                .entrySet().stream()
                .map( n -> n.getKey() + "" + ( n.getValue() == 1 ? "" : n.getValue() ) )
                .collect( Collectors.joining() );
System.out.println( "output = " + output );

output = 😷2ab2c2d

For the sake of variety, a different approach using regex:

String input1 = "abbccd";

String output = Pattern.compile("(?<=(.))(?!\\1)")
                       .splitAsStream(input1)
                       .map(str -> str.length() == 1 ? str : "" + str.charAt(0) + str.length())
                       .collect(Collectors.joining());

System.out.println(output);
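One thing worth knowing about the regex variant: it splits wherever the next character differs from the previous one, so it counts runs of adjacent equal characters, while the groupingBy approach counts total occurrences. For "abbccd" the two agree, but they can differ when a character reappears later. A minimal sketch comparing the two, assuming a made-up class RunVersusTotal with made-up method names:

```java
import java.util.LinkedHashMap;
import java.util.function.Function;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical comparison class; the names are illustrative only.
class RunVersusTotal {

    // Zero-width boundary between two different adjacent characters.
    private static final Pattern RUN_BOUNDARY = Pattern.compile("(?<=(.))(?!\\1)");

    // Regex approach: one output item per run of adjacent equal characters.
    static String byRuns(String input) {
        return RUN_BOUNDARY.splitAsStream(input)
                .map(s -> s.length() == 1 ? s : "" + s.charAt(0) + s.length())
                .collect(Collectors.joining());
    }

    // Grouping approach: one output item per distinct character, in first-seen order.
    static String byTotals(String input) {
        return input.codePoints()
                .mapToObj(cp -> new String(Character.toChars(cp)))
                .collect(Collectors.groupingBy(Function.identity(), LinkedHashMap::new, Collectors.counting()))
                .entrySet().stream()
                .map(e -> e.getKey() + (e.getValue() == 1 ? "" : e.getValue()))
                .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        System.out.println(byRuns("abbccd"));   // ab2c2d
        System.out.println(byTotals("abbccd")); // ab2c2d
        System.out.println(byRuns("aabaa"));    // a2ba2
        System.out.println(byTotals("aabaa"));  // a4b
    }
}
```

Which behavior is wanted depends on the requirement; the question's sample input does not distinguish between them.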

