Trim a possible prefix of a string in Java

I have a String str, from which I want to extract the substring that excludes a possible prefix "abc".

The first solution that comes to mind is:

if (str.startsWith("abc"))
    return str.substring("abc".length());
return str;

My questions are:

  1. Is there a "cleaner" way to do it using split and a regular expression for an "abc" prefix?

  2. If yes, is it less efficient than the method above (because it searches "throughout" the string)?

  3. If yes, is there any better way of doing it (where "better way" = a clean and efficient solution)?

Please note that "abc" may also appear elsewhere in the string, and those occurrences should not be removed.

Thanks

This one-liner is shorter than the code above:

return str.replaceFirst("^abc", "");

In terms of performance, though, I would guess there won't be any substantial difference between the two versions: one uses a regex, and the other does a prefix check and a substring.

Using String.replaceFirst with ^abc (to match a leading abc):

"abcdef".replaceFirst("^abc", "")     // => "def"
"123456".replaceFirst("^abc", "")     // => "123456"
"123abc456".replaceFirst("^abc", "")  // => "123abc456"
  1. String#split can do this, but it's not a better solution. The result would be vague, and I wouldn't recommend using it for this purpose.
  2. Don't spend time worrying about efficiency in this case; it's not significant. Focus on logic and clarity. Note, though, that working with a regex is usually slower because it involves additional operations, so you might want to keep startsWith.
  3. Your approach is fine. If you want to check whether the String begins with "abc", String#startsWith was designed for exactly that.
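To make point 1 concrete, here is a minimal sketch of what a split-based version might look like next to the startsWith version. The class and method names (PrefixSplitSketch, stripWithSplit, stripWithStartsWith) are made up for illustration, and the limit argument of 2 is my own assumption: without it, split drops trailing empty strings, so an input that equals the prefix would yield an empty array.

```java
class PrefixSplitSketch {
    // Hypothetical helper: strips a leading "abc" using split.
    static String stripWithSplit(String str) {
        // With limit 2 the array is either [str] (no prefix) or ["", rest].
        String[] parts = str.split("^abc", 2);
        return parts[parts.length - 1];
    }

    // The approach from the question, for comparison.
    static String stripWithStartsWith(String str) {
        if (str.startsWith("abc")) {
            return str.substring("abc".length());
        }
        return str;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"abcdef", "123456", "123abc456", "abc"}) {
            System.out.println(stripWithSplit(s) + " | " + stripWithStartsWith(s));
        }
    }
}
```

Both helpers return the same results, but the split version forces the reader to work out what the array contains, which is why it reads as the less clear option.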

You can easily measure how long a piece of code takes to run. Here is what you can do:

Create a big loop; inside it, append the loop counter to some dummy String in order to simulate the strings you want to check. Run it once with startsWith, then again with replaceAll:

long start = System.currentTimeMillis();
for (int i = 0; i < 900000; i++) {
    StringBuilder sb = new StringBuilder("abc");
    sb.append(i);
    if (sb.toString().startsWith("abc")) { /* ... */ }
}
long time = System.currentTimeMillis() - start;
System.out.println(time); // Prints ~130 (ms)

// Reset the timer before measuring the regex version
start = System.currentTimeMillis();
for (int i = 0; i < 900000; i++) {
    StringBuilder sb = new StringBuilder("abc");
    sb.append(i);
    sb.toString().replaceAll("^abc", "");
}
time = System.currentTimeMillis() - start;
System.out.println(time); // Prints ~730 (ms)

Try this:

str = str.replaceAll("^abc", "");

A regex-free solution (I needed this because the string I'm removing is configurable and contains backslashes, which would need escaping for literal use in a regex):

Apache Commons Lang's StringUtils.removeStart(str, remove) removes the string remove from the start of str, using String.startsWith and String.substring.

The source code of the method is informative:

public static String removeStart(final String str, final String remove) {
    if (isEmpty(str) || isEmpty(remove)) {
        return str;
    }
    if (str.startsWith(remove)){
        return str.substring(remove.length());
    }
    return str;
}
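As a rough illustration of the backslash issue, the sketch below strips a prefix that contains backslashes in two ways: with a regex whose prefix is escaped via Pattern.quote, and with a plain startsWith/substring check along the lines of removeStart. The class name, method names, and sample path are hypothetical, chosen only for the example.

```java
import java.util.regex.Pattern;

class LiteralPrefixSketch {
    // Regex route: Pattern.quote makes the prefix match literally.
    static String stripWithQuotedRegex(String str, String prefix) {
        return str.replaceFirst("^" + Pattern.quote(prefix), "");
    }

    // Regex-free route, following the same logic as removeStart.
    static String stripWithoutRegex(String str, String prefix) {
        if (str == null || str.isEmpty() || prefix == null || prefix.isEmpty()) {
            return str;
        }
        return str.startsWith(prefix) ? str.substring(prefix.length()) : str;
    }

    public static void main(String[] args) {
        String prefix = "C:\\temp\\";          // hypothetical configurable value
        String str = "C:\\temp\\report.txt";   // hypothetical input
        System.out.println(stripWithQuotedRegex(str, prefix));
        System.out.println(stripWithoutRegex(str, prefix));
    }
}
```

Both calls print report.txt here. The regex-free version avoids the escaping question entirely, which is the appeal of removeStart when the prefix comes from configuration.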

If you are concerned about performance, you can improve the str.replaceFirst("^abc", "") solution by reusing the same precompiled prefix Pattern to match multiple strings.

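import java.util.regex.Pattern;

final Pattern prefix = Pattern.compile("^abc"); // Could be a static constant etc.
for (String str : strings) { // "strings" stands in for whatever inputs you iterate over
    final String result = prefix.matcher(str).replaceFirst("");
}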

I guess the difference will be noticeable if you are stripping the same prefix from a lot of strings.
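For completeness, a self-contained sketch of the same idea with the Pattern held in a static constant. The class name, the stripAll helper, and the sample inputs are hypothetical; only Pattern.compile and Matcher.replaceFirst come from the snippet above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class PrecompiledPrefixSketch {
    // Compiled once and reused for every input string.
    private static final Pattern PREFIX = Pattern.compile("^abc");

    // Hypothetical helper: strips the leading "abc" from each input.
    static List<String> stripAll(List<String> inputs) {
        List<String> results = new ArrayList<>(inputs.size());
        for (String str : inputs) {
            results.add(PREFIX.matcher(str).replaceFirst(""));
        }
        return results;
    }

    public static void main(String[] args) {
        // Prints [def, 123456, 123abc456]
        System.out.println(stripAll(Arrays.asList("abcdef", "123456", "123abc456")));
    }
}
```

Whether this saves a meaningful amount of time depends on how many strings are processed; measuring it, as suggested earlier in the thread, is the way to find out.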

As far as efficiency is concerned, you may use StringBuilder when you perform multiple operations on one string, such as taking a substring, then finding an index, then taking another substring, and so on.

Where cleanliness and efficiency are concerned, StringUtils (Apache Commons Lang) can be used.

Hope it helps.
