Pattern matching issue in Java

I am not good with regular expressions. I googled and got a basic understanding of them.

I have the following requirement: my command may contain strings matching the pattern "$(VAR_NAME)". I need to find out whether it contains any such strings. If so, I have to resolve them (I know what to do once such strings are found). The problem is how to detect whether the command contains strings with the "$(VAR_NAME)" pattern. There may be zero or more such patterns in my command.

To the best of my knowledge, I have written the code below. If I use 'pattern1' in the code below, it matches, but 'pattern' does not. Can someone help with this?

Thank you in advance.

    final String command = "somescript.file $(ABC_PATH1) $(ENV_PATH2) <may be other args too here>";
    final String pattern = "\\Q$(\\w+)\\E";
    //final String pattern1 = "\\Q$(ABC_PATH1)\\E";

    final Pattern pr = Pattern.compile(pattern);
    final Matcher match = pr.matcher(command);
    if (match.find())
    {
        System.out.println("Found value: " + match.group(0));
    }
    else
    {
        System.out.println("NO MATCH");
    }

You can use the Pattern.quote("Q$(w+)E") method to build the pattern that is passed to the compile method.

 final Pattern pr = Pattern.compile(Pattern.quote("Q$(w+)E"));

I think you are overcomplicating the problem.
Since $( is a reserved "word", just do this to check whether there are any occurrences:

command.indexOf("$(");

Usage example:

public class Test
{
   private static final String[] WORDS;

   static {
      WORDS = new String[] {
            "WORD1",
            "WORD2"
      };
   }

   public static void main(final String[] args) {
      String command = "somescript.file $(ABC_PATH1) $(ENV_PATH2)";

      int index = 0;
      int i = 0;

      while (true) {
         index = command.indexOf("$(", index);

         if (index < 0) {
            break;
         }

         command = command.replace(command.substring(index, command.indexOf(")", index) + 1), WORDS[i++]);
      }

      System.out.println(command);
   }
}

It prints: somescript.file WORD1 WORD2

Sticking to the original source:

public class Test
{
   public static void main(final String[] args) {
      final String command = "somescript.file $(ABC_PATH1) $(ENV_PATH2)";
      int index = 0;
      int occurrences = 0;

      while (true) {
         index = command.indexOf("$(", index);

         if (index < 0) {
            break;
         }

         occurrences++;
         System.out.println(command.substring(index, command.indexOf(")", index++) + 1));
      }

      if (occurrences < 1) {
         System.out.println("No placeholders found");
      }
   }
}

Using \\Q and \\E means you cannot set up a capture group for the variable name, because the round brackets will be interpreted literally.

I'd probably do it like this: just escape the outer $, ( and ).

Also, if you need multiple matches, you need to call find() multiple times. I've used a while loop for this.

final String command = "somescript.file $(ABC_PATH1) $(ENV_PATH2) <may be other args too here>";
final String pattern = "\\$\\((\\w+)\\)";

final Pattern pr = Pattern.compile(pattern);
final Matcher match = pr.matcher(command);
while (match.find()) {
    System.out.println("Found value: " + match.group(1));
}

Output

Found value: ABC_PATH1
Found value: ENV_PATH2
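
Since the question also mentions resolving the placeholders once they are found, here is a minimal sketch of one way to do that with the same capture-group pattern. The class name PlaceholderResolver, the resolve method, the values map and the sample paths are all hypothetical and are not from the original post; leaving unknown names untouched is likewise just an assumption about the desired behaviour.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderResolver {
    // Matches $(NAME) and captures NAME in group 1.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\((\\w+)\\)");

    // Replaces every $(NAME) with its value from 'values'.
    // Names missing from the map are left as they are (an assumption of this sketch).
    public static String resolve(String command, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(command);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            String replacement = (value != null) ? value : m.group(0);
            // quoteReplacement stops '$' and '\' in the replacement from being
            // interpreted as group references or escapes.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("ABC_PATH1", "/opt/abc"); // hypothetical value
        values.put("ENV_PATH2", "/etc/env"); // hypothetical value
        System.out.println(resolve("somescript.file $(ABC_PATH1) $(ENV_PATH2)", values));
    }
}
```

With these sample values, main prints somescript.file /opt/abc /etc/env.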

The pattern could look like:

public static void main(String[] args) {
    final String command = "somescript.file $(ABC_PATH1) $(ENV_PATH2) <may be other args too here>";
    final String pattern = "\\$\\((.*?)\\)";
    // final String pattern1 = "\\Q$(ABC_PATH1)\\E";

    final Pattern pr = Pattern.compile(pattern);
    final Matcher match = pr.matcher(command);

    while (match.find()) {
        System.out.println("Found value: " + match.group(1));
    }

}

Prints:

    Found value: ABC_PATH1
    Found value: ENV_PATH2

The problem is that the quoting applies to the \\w+ in the pattern as well, which I think was not the intention (as written, it matches the string "cmd $(\\w+)", which includes the backslash, 'w' and plus sign).

The pattern can be replaced with:

    final String pattern = "\\$\\(\\w+\\)";

Or, if you'd still like to use \\Q and \\E on the first part:

    final String pattern = "\\Q$(\\E\\w+\\)";
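
To make the difference concrete, the following small sketch compares the fully quoted pattern from the question with the two alternatives above. The class name QuoteDemo and the helper found are hypothetical names used only for illustration.

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Returns true if 'regex' matches anywhere inside 'input'.
    public static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String quoted = "\\Q$(\\w+)\\E";     // everything between \Q and \E is literal
        String escaped = "\\$\\(\\w+\\)";    // only $, ( and ) are literal
        String partial = "\\Q$(\\E\\w+\\)";  // only the leading $( is quoted

        String command = "somescript.file $(ABC_PATH1)";
        String literal = "cmd $(\\w+)";      // the characters: cmd $(\w+)

        System.out.println(found(quoted, command));  // false
        System.out.println(found(quoted, literal));  // true
        System.out.println(found(escaped, command)); // true
        System.out.println(found(partial, command)); // true
    }
}
```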


 