
Trim a string based on the string length

I want to trim a string if the length exceeds 10 characters.

Suppose the string length is 12 ( String s="abcdefghijkl" ); then the new trimmed string should contain "abcdefgh.." .

How can I achieve this?

s = s.substring(0, Math.min(s.length(), 10));

Using Math.min like this avoids an exception in the case where the string is already shorter than 10.
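
As a minimal sketch of the point above, the one-liner can be wrapped in a small helper. The class name TruncateDemo and the method name truncate are made up for this illustration and do not come from any library.

```java
class TruncateDemo {
    // Hypothetical helper: returns at most maxLength chars from the start of s.
    // Assumes s is non-null and maxLength is non-negative.
    static String truncate(String s, int maxLength) {
        // Math.min keeps the end index in bounds when s is already short,
        // so substring never throws StringIndexOutOfBoundsException here.
        return s.substring(0, Math.min(s.length(), maxLength));
    }

    public static void main(String[] args) {
        System.out.println(truncate("abcdefghijkl", 10)); // abcdefghij
        System.out.println(truncate("abc", 10));          // abc
    }
}
```

Without the Math.min guard, the second call would ask for substring(0, 10) on a 3-character string and fail.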


Notes:

  1. The above does real trimming. If you actually want to replace the last characters with three dots when the string is too long, use Apache Commons StringUtils.abbreviate; see @H6's solution. If you want to use the Unicode horizontal ellipsis character, see @Basil's solution.

  2. For typical implementations of String, s.substring(0, s.length()) will return s rather than allocating a new String.

  3. This may behave incorrectly [1] if your String contains Unicode code points outside of the BMP, e.g. emojis. For a (more complicated) solution that works correctly for all Unicode code points, see @sibnick's solution.


[1] A Unicode code point that is not on plane 0 (the BMP) is represented as a "surrogate pair" (i.e. two char values) in the String. By ignoring this, we might trim the string to fewer than 10 code points, or (worse) truncate it in the middle of a surrogate pair. On the other hand, String.length() is not a good measure of Unicode text length, so trimming based on that property may be the wrong thing to do.
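
One way to count in code points rather than char values is String.offsetByCodePoints, which converts a code point count into a char index. The sketch below is only an illustration: the class and method names are invented here, and it assumes a non-null string and a non-negative limit. It still ignores combining sequences and other multi-code-point graphemes.

```java
class CodePointTruncateDemo {
    // Hypothetical helper: keeps at most maxCodePoints Unicode code points.
    // Assumes s is non-null and maxCodePoints >= 0.
    static String truncateCodePoints(String s, int maxCodePoints) {
        if (s.codePointCount(0, s.length()) <= maxCodePoints) {
            return s; // nothing to cut
        }
        // Char index reached after skipping maxCodePoints code points from index 0;
        // it never lands between the two halves of a surrogate pair.
        int end = s.offsetByCodePoints(0, maxCodePoints);
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        // Nine letters, then FACE WITH TEARS OF JOY (one code point, two chars), then five letters.
        String s = "abcdafghi\uD83D\uDE02cdefg";
        String cut = truncateCodePoints(s, 10);
        System.out.println(cut.length());                       // 11
        System.out.println(cut.codePointCount(0, cut.length())); // 10
    }
}
```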

StringUtils.abbreviate from the Apache Commons Lang library could be your friend:

StringUtils.abbreviate("abcdefg", 6) = "abc..."
StringUtils.abbreviate("abcdefg", 7) = "abcdefg"
StringUtils.abbreviate("abcdefg", 8) = "abcdefg"
StringUtils.abbreviate("abcdefg", 4) = "a..."

Commons Lang3 even allows setting a custom String as the replacement marker. With this you can, for example, use a single-character ellipsis.

StringUtils.abbreviate("abcdefg", "\u2026", 6) = "abcde…"
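
To make the marker arithmetic concrete, here is a plain-Java sketch that reproduces the examples above. It is not the Commons Lang implementation: abbreviateWith and AbbreviateDemo are hypothetical names, and the argument check is an assumption of this sketch rather than the library's documented rule.

```java
class AbbreviateDemo {
    // Hypothetical helper: if s is longer than maxWidth, keep the leading
    // chars and append marker so the result is exactly maxWidth chars long.
    static String abbreviateWith(String s, String marker, int maxWidth) {
        if (s == null) {
            return null;
        }
        // Assumption of this sketch: there must be room for one char plus the marker.
        if (maxWidth < marker.length() + 1) {
            throw new IllegalArgumentException("maxWidth too small for marker");
        }
        if (s.length() <= maxWidth) {
            return s; // short enough, returned unchanged
        }
        return s.substring(0, maxWidth - marker.length()) + marker;
    }

    public static void main(String[] args) {
        System.out.println(abbreviateWith("abcdefg", "...", 6)); // abc...
        System.out.println(abbreviateWith("abcdefg", "...", 7)); // abcdefg
        System.out.println(abbreviateWith("abcdefg", "...", 4)); // a...
    }
}
```

A one-character marker such as "\u2026" leaves one more visible character than the two-dot or three-dot variants at the same width.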

There is an Apache Commons StringUtils function which does this.

s = StringUtils.left(s, 10)

If len characters are not available, or the String is null, the String will be returned without an exception. An empty String is returned if len is negative.

StringUtils.left(null, *)    = null
StringUtils.left(*, -ve)     = ""
StringUtils.left("", *)      = ""
StringUtils.left("abc", 0)   = ""
StringUtils.left("abc", 2)   = "ab"
StringUtils.left("abc", 4)   = "abc"
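
If Commons Lang is not on the classpath, the behaviour listed above can be approximated in a few lines of plain Java. This is a sketch under that reading of the table, not the library's source; the local left method and the LeftDemo class are hypothetical.

```java
class LeftDemo {
    // Hypothetical stand-in that mirrors the documented examples:
    // null stays null, a negative len gives "", otherwise the leftmost len chars.
    static String left(String s, int len) {
        if (s == null) {
            return null;
        }
        if (len < 0) {
            return "";
        }
        // Math.min covers the case where fewer than len chars are available.
        return s.substring(0, Math.min(s.length(), len));
    }

    public static void main(String[] args) {
        System.out.println(left("abc", 2));  // ab
        System.out.println(left("abc", 4));  // abc
        System.out.println(left(null, 4));   // null
    }
}
```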

StringUtils.Left JavaDocs

Courtesy: Steeve McCauley

As usual, nobody cares about UTF-16 surrogate pairs, not even the authors of org.apache.commons/commons-lang3. See: What are the most common non-BMP Unicode characters in actual use?

You can see the difference between correct code and the usual code in this sample:

public static void main(String[] args) {
    //string with FACE WITH TEARS OF JOY symbol
    String s = "abcdafghi\uD83D\uDE02cdefg";
    int maxWidth = 10;
    System.out.println(s);
    // naive: ignores UTF-16 surrogate pairs and may cut the emoji in half
    System.out.println(s.substring(0, Math.min(s.length(), maxWidth)));
    // surrogate-aware: step back one char if the cut would land inside a pair
    if (s.length() > maxWidth) {
        int correctedMaxWidth = (maxWidth > 0 && Character.isLowSurrogate(s.charAt(maxWidth))) ? maxWidth - 1 : maxWidth;
        System.out.println(s.substring(0, correctedMaxWidth));
    }
}

s = s.length() > 10 ? s.substring(0, 10) : s;

Or you can just use this method in case you don't have StringUtils on hand:

public static String abbreviateString(String input, int maxLength) {
    if (input.length() <= maxLength) 
        return input;
    else 
        return input.substring(0, maxLength-2) + "..";
}

Just in case you are looking for a way to trim the string and keep its LAST 10 characters:

s = s.substring(Math.max(s.length(),10) - 10);
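
The Math.max trick works because the start index becomes 0 whenever the string has 10 or fewer characters. A small sketch with the length made a parameter follows; lastChars and TailDemo are names invented for this example, and a non-null string and non-negative n are assumed.

```java
class TailDemo {
    // Hypothetical helper: returns the last n chars of s, or all of s if it is shorter.
    static String lastChars(String s, int n) {
        // For a short string, max(length, n) - n == 0, so the whole string is kept.
        // For a long string, it equals length - n, the start of the tail.
        return s.substring(Math.max(s.length(), n) - n);
    }

    public static void main(String[] args) {
        System.out.println(lastChars("abcdefghijkl", 10)); // cdefghijkl
        System.out.println(lastChars("abc", 10));          // abc
    }
}
```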

The question was asked about Java, but that was back in 2014.
In case you use Kotlin now, it is as simple as:

yourString.take(10)

Returns a string containing the first n characters from this string, or the entire string if this string is shorter.

Documentation

tl;dr

You seem to be asking for an ellipsis ( … ) character in the last place when truncating. Here is a one-liner to manipulate your input string.

String input = "abcdefghijkl";
String output = ( input.length () > 10 ) ? input.substring ( 0 , 10 - 1 ).concat ( "…" ) : input;

See this code run live at IdeOne.com.

abcdefghi…

Ternary operator

We can make a one-liner by using the ternary operator.

String input = "abcdefghijkl" ;

String output = 
    ( input.length() > 10 )          // If too long…
    ?                                
    input     
    .substring( 0 , 10 - 1 )         // Take just the first part, adjusting by 1 to replace that last character with an ellipsis.
    .concat( "…" )                   // Add the ellipsis character.
    :                                // Or, if not too long…
    input                            // Just return original string.
;

See this code run live at IdeOne.com.

abcdefghi…

Java streams

The Java Streams facility makes this interesting, as of Java 9 and later. Interesting, but maybe not the best approach.

We use code points rather than char values. The char type is legacy, and is limited to a subset of all possible Unicode characters.

String input = "abcdefghijkl" ;
int limit = 10 ;
String output =
        input
                .codePoints()
                .limit( limit )
                .collect(                                    // Collect the results of processing each code point.
                        StringBuilder::new,                  // Supplier<R> supplier
                        StringBuilder::appendCodePoint,      // ObjIntConsumer<R> accumulator
                        StringBuilder::append                // BiConsumer<R,R> combiner
                )
                .toString()
        ;

If we had excess characters truncated, replace the last character with an ellipsis.

if ( input.length () > limit )
{
    output = output.substring ( 0 , output.length () - 1 ) + "…";
}

If only I could think of a way to put together the stream line with the "if over limit, do ellipsis" part.
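
One possible way to fold the two steps together is to decide up front, by counting code points, whether truncation is needed, and then take one code point fewer to leave room for the ellipsis. This is only a sketch: abbreviate and StreamEllipsisDemo are hypothetical names, and it assumes a non-null input and limit >= 1.

```java
class StreamEllipsisDemo {
    // Hypothetical helper: result has at most `limit` code points,
    // the last one being an ellipsis when the input was cut.
    static String abbreviate(String input, int limit) {
        // Compare code point counts, not char counts, so surrogate pairs count once.
        if (input.codePointCount(0, input.length()) <= limit) {
            return input;
        }
        return input
                .codePoints()
                .limit(limit - 1)                        // leave room for the ellipsis
                .collect(
                        StringBuilder::new,              // supplier
                        StringBuilder::appendCodePoint,  // accumulator
                        StringBuilder::append            // combiner
                )
                .append("\u2026")                        // HORIZONTAL ELLIPSIS
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(abbreviate("abcdefghijkl", 10).equals("abcdefghi\u2026")); // true
        System.out.println(abbreviate("abc", 10));                                    // abc
    }
}
```

Because the cut is made on the code point stream, a character outside the BMP is either kept whole or dropped whole.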

str==null ? str : str.substring(0, Math.min(str.length(), 10))

or,

str==null ? "" : str.substring(0, Math.min(str.length(), 10))

Works with null: the first form returns null for a null input, the second returns an empty string.

This is how you shorten the string and end it with "..". Add the following method to your class:

private String abbreviate(String s){
  if(s.length() <= 10) return s;
  return s.substring(0, 8) + ".." ;
}

Here is the Kotlin solution

One line,

if (yourString != null && yourString.length > 10) yourString.take(10).plus("...") else yourString

Traditional,

if (yourString != null && yourString.length > 10) {
  yourString.take(10).plus("...")
} else {
  yourString
}
