Match a string to multiple regex patterns and return the group number
I'm working with regular expressions in Java using the Matcher class.
Here is a description of what I have: several regexes separated by pipes. I need to get the group of the word that matches one of the regular expressions.
This is the pattern variable:
private static Pattern pattern = Pattern.compile(
"^TDXF.*\\w+-(\\d+)(\\.\\d+)+_(\\d+\\.)+\\d+|^TD.{3}([0-9]).{4}$|^.*_.*-.*-([0-9]*)\\..*\\..*\\..*$");
This method is used to return the group number associated with the word:
private static String getGroup(String gp) {
    String g = "";
    if (gp == null) {
        return g;
    }
    final Matcher matcher = pattern.matcher(gp);
    if (matcher.matches()) {
        g = matcher.group(1) != null ? matcher.group(1) : matcher.group(2);
    }
    return g;
}
I wrote a unit test to check that it works, for example for the string TD91160152, but it failed.
@Test
public void testGroup() {
    Assert.assertEquals("6", getGroup("TD91160152"));
    Assert.assertEquals("2", getGroup("TDXF-tv-2.5.10.1_0.0.0.0"));
    Assert.assertEquals("6", getGroup("TD91160118_SF11043004"));
    Assert.assertEquals("3", getGroup("TDXF_sih-tv-3.4.12.1_7.21.3.1"));
    Assert.assertEquals("5", getGroup("TD20_sih-tv-5.2.20.1"));
    Assert.assertEquals("5", getGroup("TD30_sih-tv-5.15.8.1"));
}
TD91160152 matches the pattern ^TD.{3}([0-9]).{4}$ and it should return 6 as matcher.group(1).
I don't know why it fails and returns null as the group number. I don't think it's related to overlaps between the regexes.
I tried removing the other patterns and keeping only:
private static Pattern pattern = Pattern.compile(
"^TD.{3}([0-9]).{4}$");
and it worked. I don't know why it returns null when I add the other regexes.
Can anyone help me with this? Thanks a lot.
Thanks to everyone who took the time to read and think with me about a solution. Fortunately, I found one. I ended up separating the patterns and testing whether the word matches each pattern (maybe there's some confusion somewhere when using the alternation operator |):
private static Pattern p1 = Pattern.compile("^TDXF.*\\w+-(\\d+)(\\.\\d+)+_(\\d+\\.)+\\d+");
private static Pattern p2 = Pattern.compile("^TD.{3}([0-9]).{4}$");
private static Pattern p3 = Pattern.compile("^.*_.*-.*-([0-9]*)\\..*\\..*\\..*$");

private static String getGroup(String gp) {
    String g = "";
    if (gp == null) {
        return g;
    }
    final Matcher matcher1 = p1.matcher(gp);
    final Matcher matcher2 = p2.matcher(gp);
    final Matcher matcher3 = p3.matcher(gp);
    if (matcher1.matches()) {
        g = matcher1.group(1) != null ? matcher1.group(1) : matcher1.group(2);
    }
    if (matcher2.matches()) {
        g = matcher2.group(1) != null ? matcher2.group(1) : matcher2.group(2);
    }
    if (matcher3.matches()) {
        g = matcher3.group(1) != null ? matcher3.group(1) : matcher3.group(2);
    }
    return g;
}
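The separate-pattern approach above could also be written as a loop over the patterns. This is only a sketch: the class name SeparatePatterns and the PATTERNS array are made up for illustration, and it assumes the wanted value is always group 1 of whichever pattern matches. Note that it returns on the first matching pattern, whereas the code above lets a later match overwrite an earlier one.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SeparatePatterns {
    // The same three patterns as p1, p2 and p3 above (hypothetical array name).
    private static final Pattern[] PATTERNS = {
        Pattern.compile("^TDXF.*\\w+-(\\d+)(\\.\\d+)+_(\\d+\\.)+\\d+"),
        Pattern.compile("^TD.{3}([0-9]).{4}$"),
        Pattern.compile("^.*_.*-.*-([0-9]*)\\..*\\..*\\..*$")
    };

    static String getGroup(String gp) {
        if (gp == null) {
            return "";
        }
        for (Pattern p : PATTERNS) {
            Matcher m = p.matcher(gp);
            // Each pattern is compiled on its own, so its first
            // capturing group is always group 1.
            if (m.matches()) {
                return m.group(1);
            }
        }
        return ""; // no pattern matched
    }

    public static void main(String[] args) {
        System.out.println(getGroup("TD91160152"));
        System.out.println(getGroup("TD30_sih-tv-5.15.8.1"));
    }
}
```

Because every pattern has its own Matcher, group numbering restarts at 1 for each one, which is why this version does not run into the null result.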
The value TD91160152 matches the second part of your regex, but its capturing group is the 4th group in the combined regex, because groups are numbered across the whole pattern and the first alternative already contains groups 1 to 3. So you need to use matcher.group(4) to get 6.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Ff {
    static String part1 = "^TDXF.*\\w+-(\\d+)(\\.\\d+)+_(\\d+\\.)+\\d+$";
    static String part2 = "^TD.{3}([0-9]).{4}$";
    static String part3 = "^.*_.*-.*-([0-9]*)\\..*\\..*\\..*$";
    private static Pattern pattern = Pattern.compile(part1 + "|" + part2 + "|" + part3);

    public static void main(String[] args) {
        Matcher m = pattern.matcher("TD98760452");
        if (m.matches()) {
            // Print every capturing group of the combined pattern.
            for (int i = 1; i <= m.groupCount(); i++) {
                System.out.println(m.group(i));
            }
        }
    }
}
The output is:
null
null
null
6
null
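Building on that output, one possible way to keep the single combined pattern is to return the first group that is not null instead of hard-coding a group index. This is only a sketch: the class name FirstGroupLookup is made up, and it assumes the value you want is always the first capturing group of whichever alternative matched (which holds for the three parts above).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstGroupLookup {
    // Same three alternatives as part1, part2 and part3 above.
    // Groups 1-3 belong to the first alternative, group 4 to the
    // second and group 5 to the third.
    private static final Pattern PATTERN = Pattern.compile(
            "^TDXF.*\\w+-(\\d+)(\\.\\d+)+_(\\d+\\.)+\\d+$"
            + "|^TD.{3}([0-9]).{4}$"
            + "|^.*_.*-.*-([0-9]*)\\..*\\..*\\..*$");

    static String getGroup(String gp) {
        if (gp == null) {
            return "";
        }
        Matcher m = PATTERN.matcher(gp);
        if (!m.matches()) {
            return "";
        }
        // Groups of alternatives that did not take part in the match
        // are null, so the first non-null group comes from the
        // alternative that actually matched.
        for (int i = 1; i <= m.groupCount(); i++) {
            if (m.group(i) != null) {
                return m.group(i);
            }
        }
        return "";
    }

    public static void main(String[] args) {
        System.out.println(getGroup("TD91160152"));
        System.out.println(getGroup("TDXF-tv-2.5.10.1_0.0.0.0"));
    }
}
```

If the wanted group were not the first one in some alternative, named groups with distinct names per alternative would probably be a safer choice than relying on order.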