
Extract a substring in Java

I have the following string:

"hello this.is.a.test(MainActivity.java:47)"

and I want to extract MainActivity.java:47 (everything between '(' and ')', first occurrence only).

I tried with a regex, but it seems I am doing something wrong.

Thanks

You can do it yourself:

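// str is the input, e.g. "hello this.is.a.test(MainActivity.java:47)"
int pos1 = str.indexOf('(') + 1;     // index just after the first '('
int pos2 = str.indexOf(')', pos1);   // first ')' at or after pos1

String result = str.substring(pos1, pos2);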

Or you can use commons-lang, which contains a very nice StringUtils class that has substringBetween().

I think a regex is a bit of overkill here. I would use something like this:

String input = "hello this.is.a.test(MainActivity.java:47)";
// Note: lastIndexOf picks the last '(' and ')' in the string, not the first pair.
String output = input.substring(input.lastIndexOf("(") + 1, input.lastIndexOf(")"));
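
One caveat with lastIndexOf: when the string holds more than one parenthesized group, it selects the last one rather than the first. A minimal sketch comparing the two approaches follows. The class and method names are illustrative only, and both methods assume the parentheses are actually present in the input.

```java
class FirstVsLastParen {
    // Text between the first '(' and the first ')' that follows it.
    static String firstGroup(String s) {
        int open = s.indexOf('(');
        int close = s.indexOf(')', open + 1);
        return s.substring(open + 1, close);
    }

    // Text between the last '(' and the last ')'.
    static String lastGroup(String s) {
        return s.substring(s.lastIndexOf('(') + 1, s.lastIndexOf(')'));
    }

    public static void main(String[] args) {
        String input = "hello this.is.a.test(MainActivity.java:47) another.test(MyClass.java:12)";
        System.out.println(firstGroup(input)); // MainActivity.java:47
        System.out.println(lastGroup(input));  // MyClass.java:12
    }
}
```

For the single-group string in the question both methods return the same result.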

This should work:

^[^\\(]*\\(([^\\)]+)\\)

The result is in the first group.
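
As a rough sketch of how this pattern could be applied with java.util.regex (the class name, the method name, and the choice to return null on no match are assumptions made for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstGroupRegex {
    // Same pattern as above, written as a Java string literal.
    private static final Pattern FIRST_PARENS =
            Pattern.compile("^[^\\(]*\\(([^\\)]+)\\)");

    // Returns the text inside the first "(...)" or null if there is none.
    static String extract(String s) {
        Matcher m = FIRST_PARENS.matcher(s);
        // find() is enough here: the pattern only has to match a prefix of the input.
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("hello this.is.a.test(MainActivity.java:47)"));
    }
}
```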

Another answer to your question:


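import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "hello this.is.a.test(MainActivity.java:47) another.test(MyClass.java:12)";
Pattern p = Pattern.compile("[a-z][\\w]+\\.java:\\d+", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(str);

// find() stops at the first match, so only MainActivity.java:47 is printed
if (m.find()) {
    System.out.println(m.group());
}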

The RegExp explained:

[a-z][\\w]+\\.java:\\d+

[a-z] > Check that we start with a letter ...
[\\w]+ > ... followed by one or more letters, digits or underscores ...
\\.java: > ... followed exactly by the string ".java:" ...
\\d+ > ... ending with one or more digits
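
Because this pattern describes the file:line token itself rather than the parentheses, calling find() repeatedly would walk through every such token in the string. A small sketch of that idea, with a hypothetical class and method name:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FileLineFinder {
    private static final Pattern FILE_LINE =
            Pattern.compile("[a-z][\\w]+\\.java:\\d+", Pattern.CASE_INSENSITIVE);

    // Collects every "Name.java:NN" token, in order of appearance.
    static List<String> findAll(String s) {
        List<String> hits = new ArrayList<>();
        Matcher m = FILE_LINE.matcher(s);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String str = "hello this.is.a.test(MainActivity.java:47) another.test(MyClass.java:12)";
        System.out.println(findAll(str)); // [MainActivity.java:47, MyClass.java:12]
    }
}
```

If only the first occurrence is wanted, a single find() call is sufficient.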

Pseudo-code:

int p1 = location of '('
int p2 = location of ')', starting the search from p1
String s = extract string from p1 to p2

String.indexOf() and String.substring() are your friends.
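
The pseudo-code could be turned into Java roughly as follows. This is only a sketch: the class and method names are made up for illustration, and returning null when a parenthesis is missing is an assumption, not something the question specifies.

```java
class ParenExtractor {
    // Returns the text between the first '(' and the next ')', or null if either is missing.
    static String between(String s) {
        int p1 = s.indexOf('(');
        if (p1 < 0) {
            return null; // no opening parenthesis
        }
        int p2 = s.indexOf(')', p1 + 1); // start the search just after '('
        if (p2 < 0) {
            return null; // no closing parenthesis after the opening one
        }
        return s.substring(p1 + 1, p2); // p1 + 1 skips the '(' itself
    }

    public static void main(String[] args) {
        System.out.println(between("hello this.is.a.test(MainActivity.java:47)"));
    }
}
```

The guards matter because indexOf() returns -1 when nothing is found, and passing a negative end index to substring() throws StringIndexOutOfBoundsException.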

Try this:

String input = "hello this.is.a.test(MainActivity.java:47) (and some more text)";
Pattern p = Pattern.compile("[^\\)]*\\(([^\\)]*)\\).*");
Matcher m = p.matcher( input );
if(m.matches()) {
  System.out.println(m.group( 1 )); //output: MainActivity.java:47
}

This also finds the first occurrence of text between ( and ) if there is more than one.

Note that Matcher.matches() requires the regex to match the entire input string, as if the expression were implicitly wrapped in ^ and $ (find(), by contrast, accepts a match anywhere in the input). Thus [^\\)]* at the beginning and .* at the end are necessary.
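
To illustrate the difference, here is a sketch that uses only the bare parenthesis pattern, without the leading and trailing parts. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesVsFind {
    // Only the "(...)" part, with nothing to absorb the text before or after it.
    private static final Pattern BARE = Pattern.compile("\\(([^\\)]*)\\)");

    // True only if the whole input is a single "(...)" group.
    static boolean wholeMatch(String s) {
        return BARE.matcher(s).matches();
    }

    // Content of the first "(...)" found anywhere in the input, or null.
    static String firstFound(String s) {
        Matcher m = BARE.matcher(s);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "hello this.is.a.test(MainActivity.java:47) (and some more text)";
        System.out.println(wholeMatch(input)); // false
        System.out.println(firstFound(input)); // MainActivity.java:47
    }
}
```

With find(), the shorter pattern is enough; with matches(), the surrounding text has to be consumed by the pattern as well.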
