Match a string to multiple regex patterns and return the group number

I'm working with regular expressions in Java using the Matcher class.

Here is a description of what I have:

I have multiple regexes separated by pipes. I need to get the captured group for whichever of the regular expressions matches the word.

This is the pattern variable:

private static Pattern pattern = Pattern.compile(
        "^TDXF.*\\w+-(\\d+)(\\.\\d+)+_(\\d+\\.)+\\d+|^TD.{3}([0-9]).{4}$|^.*_.*-.*-([0-9]*)\\..*\\..*\\..*$");

This method is used to return the group value associated with the word:

private static String getGroup(String gp) {
    String g= "";
    if (gp== null) {
        return g;
    }

    final Matcher matcher = pattern.matcher(gp);
    if (matcher.matches()) {
        g = matcher.group(1) != null ? matcher.group(1) : matcher.group(2);
    }
    return g;
}

I wrote a unit test to check that it works, for example for the string TD91160152, but it failed.

@Test
public void testGroup() {
    Assert.assertEquals("6", this.getGroup("TD91160152"));
    Assert.assertEquals("2", this.getGroup("TDXF-tv-2.5.10.1_0.0.0.0"));
    Assert.assertEquals("6", this.getGroup("TD91160118_SF11043004"));

    Assert.assertEquals("3", this.getGroup("TDXF_sih-tv-3.4.12.1_7.21.3.1"));
    Assert.assertEquals("5", this.getGroup("TD20_sih-tv-5.2.20.1"));
    Assert.assertEquals("5", this.getGroup("TD30_sih-tv-5.15.8.1"));
}

TD91160152 matches the pattern ^TD.{3}([0-9]).{4}$, so I expected matcher.group(1) to return 6 (see the linked demo).

I don't know why it fails and returns null as the group value. I don't think it's related to overlaps between the regexes.

I tried removing the other patterns and keeping only

private static Pattern pattern = Pattern.compile(
        "^TD.{3}([0-9]).{4}$");

and it worked. I don't know why it returns null once I add the other regexes.

Can anyone help me with this? Thanks a lot.

I tested your regex and it looks fine. Maybe you need another function or some flags in your regex? Here are the screenshots I made:

[Screenshots: Regex, Matches, Groups]

Thanks to everyone who took the time to read and think through this with me. Fortunately, I found a solution: I ended up separating the patterns and testing whether the word matches each one. (Maybe there is some confusion somewhere when using the | operator.)

private static Pattern p1 = Pattern.compile("^TDXF.*\\w+-(\\d+)(\\.\\d+)+_(\\d+\\.)+\\d+");

private static Pattern p2 = Pattern.compile("^TD.{3}([0-9]).{4}$");

private static Pattern p3 = Pattern.compile("^.*_.*-.*-([0-9]*)\\..*\\..*\\..*$");

private static String getGroup(String gp) {
    String g = "";
    if (gp== null) {
        return g;
    }

    final Matcher matcher1 = p1.matcher(gp);
    final Matcher matcher2 = p2.matcher(gp);
    final Matcher matcher3 = p3.matcher(gp);

    if (matcher1.matches()) {
        g= matcher1.group(1) != null ? matcher1.group(1) : matcher1.group(2);
    }
    if (matcher2.matches()) {
        g= matcher2.group(1) != null ? matcher2.group(1) : matcher2.group(2);
    }
    if (matcher3.matches()) {
        g= matcher3.group(1) != null ? matcher3.group(1) : matcher3.group(2);
    }
    return g;
}
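
The same idea can be written more compactly by looping over the patterns. This is only a sketch: the class name SeparatePatterns and the PATTERNS array are made up for illustration. Note one behavioural difference from the code above: it returns as soon as the first pattern matches, rather than letting a later match overwrite an earlier one. It also relies on group 1 being the first, non-optional group in each of the three patterns.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SeparatePatterns {
    // The three patterns from the question, kept separate so that the
    // group of interest is always group 1 of whichever pattern matches.
    private static final Pattern[] PATTERNS = {
        Pattern.compile("^TDXF.*\\w+-(\\d+)(\\.\\d+)+_(\\d+\\.)+\\d+"),
        Pattern.compile("^TD.{3}([0-9]).{4}$"),
        Pattern.compile("^.*_.*-.*-([0-9]*)\\..*\\..*\\..*$")
    };

    // Returns group 1 of the first pattern that matches the whole input,
    // or "" when the input is null or no pattern matches.
    static String getGroup(String gp) {
        if (gp == null) {
            return "";
        }
        for (Pattern p : PATTERNS) {
            Matcher m = p.matcher(gp);
            if (m.matches()) {
                return m.group(1);
            }
        }
        return "";
    }

    public static void main(String[] args) {
        System.out.println(getGroup("TD91160152")); // prints 6
    }
}
```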

The value TD91160152 matches the second alternative of your regex, but capturing groups are numbered across the whole pattern, so that group is the 4th group of the combined regex. You therefore need matcher.group(4) to get 6.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Ff {
    static String part1="^TDXF.*\\w+-(\\d+)(\\.\\d+)+_(\\d+\\.)+\\d+$";
    static String part2 = "^TD.{3}([0-9]).{4}$";
    static String part3 = "^.*_.*-.*-([0-9]*)\\..*\\..*\\..*$";
    private static Pattern pattern = Pattern.compile(part1+"|"+part2+"|"+part3);


    public static void main(String args[]) {
        Matcher m=pattern.matcher("TD98760452");
        if(m.matches())
        {
            for (int i=1;i<=m.groupCount();i++)
                System.out.println(m.group(i));
        }
    }
}

The output is:

null
null
null
6
null
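
Building on that output, one way to keep the single combined pattern without hard-coding the index 4 is to return the first group that is not null. The sketch below assumes that is the value you want for every alternative (for the first alternative it is group 1); the class name FirstNonNullGroup and the helper firstGroup are hypothetical names, not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstNonNullGroup {
    // Same three alternatives as in the question, joined with "|".
    // Groups 1-3 belong to the first alternative, group 4 to the second,
    // and group 5 to the third.
    private static final Pattern PATTERN = Pattern.compile(
            "^TDXF.*\\w+-(\\d+)(\\.\\d+)+_(\\d+\\.)+\\d+"
            + "|^TD.{3}([0-9]).{4}$"
            + "|^.*_.*-.*-([0-9]*)\\..*\\..*\\..*$");

    // Hypothetical helper: returns the first capturing group that took part
    // in the match, or "" when the input is null or does not match.
    static String firstGroup(String input) {
        if (input == null) {
            return "";
        }
        Matcher m = PATTERN.matcher(input);
        if (!m.matches()) {
            return "";
        }
        for (int i = 1; i <= m.groupCount(); i++) {
            if (m.group(i) != null) {
                return m.group(i);
            }
        }
        return "";
    }

    public static void main(String[] args) {
        System.out.println(firstGroup("TD91160152"));               // prints 6
        System.out.println(firstGroup("TDXF-tv-2.5.10.1_0.0.0.0")); // prints 2
    }
}
```

Groups that belong to an alternative that did not take part in the match come back as null, which is why checking only group(1) and group(2) missed the value in group 4.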
