Regex for letters or numbers in brackets

I am using Java to process text with regular expressions. I am using the following regular expression

^[\\([0-9a-zA-Z]+\\)\\s]+

to match one or more letters or numbers in parentheses, one or more times. For instance, I would like to match (aaa) (bb) (11) (AA) (iv) or (111) (aaaa) (i) (V).

I tested this regular expression on http://java-regex-tester.appspot.com/ and it works there. But when I use it in my code, the code does not compile. Here is my code:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    public class Tester {

        public static void main(String[] args) {

            Pattern pattern = Pattern.compile("^[\([0-9a-zA-Z]+\)\s]+");

            String[] words = pattern.split("(a) (1) (c) (xii) (A) (12) (ii)");

            String w = pattern.

            for(String s:words){

                System.out.println(s);

            }
    }
}

I tried to use \\ instead of \, but the regex gave different results than I expected: it matches only one group like (aaa), not multiple groups like (aaa) (111) (ii).

Two questions:

  1. How can I fix this regex so that it matches multiple groups?
  2. How can I get the individual matches separately (like (aaa) alone, then (111), and so on)? I tried pattern.split, but it did not work for me.

Firstly, you need to escape every backslash inside a Java string literal with another backslash. The regex engine will then see it as a single backslash. (E.g. write the word-character class \w as \\w inside quotation marks.)

Secondly, you have to finish the line that reads:

String w = pattern.

That unfinished statement is one reason the code doesn't compile. The other is the unescaped backslashes: \( is not a valid escape sequence in a Java string literal.
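To make the escaping rule concrete, here is a minimal sketch. The class name EscapeDemo and the helper matchesGroup are made up for illustration; only Pattern.matches comes from the standard library.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // In Java source, "\\(" is a two-character string: a backslash and '('.
    // The regex engine therefore receives \( and treats it as a literal parenthesis.
    static final String REGEX = "\\([0-9a-zA-Z]+\\)";

    // Hypothetical helper: true if the whole text is one bracketed group.
    static boolean matchesGroup(String text) {
        return Pattern.matches(REGEX, text);
    }

    public static void main(String[] args) {
        System.out.println(REGEX);                 // the text the regex engine sees
        System.out.println(matchesGroup("(aaa)")); // true
        System.out.println(matchesGroup("aaa"));   // false
    }
}
```

Printing REGEX shows the pattern with single backslashes, which is the form an online regex tester expects.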

Here is my final solution. It matches the individual groups of letters/numbers in brackets that appear at the beginning of a line and ignores the rest:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Tester {
    public static void main(String[] args) {
        List<String> listOfEnums = new ArrayList<>();
        // Anchored at the start: one group of letters/digits in parentheses.
        // (A ^ inside the character class would match a literal caret, so it is left out.)
        Pattern pattern = Pattern.compile("^\\([0-9a-zA-Z]+\\)");
        String p = "(a) (1) (c) (xii) (A) (12) (ii) and the good news (1)";
        Matcher matcher = pattern.matcher(p);
        // Each time a group matches at the start, store it and strip it off.
        while (matcher.find()) {
            String s = matcher.group();
            System.out.println(s);
            listOfEnums.add(s);
            // Remove the match from the beginning of the string, then trim whitespace.
            p = p.substring(s.length()).trim();
            matcher = pattern.matcher(p);
        }
    }
}
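As an alternative to cutting the string down after every match, the same goal can be sketched with the \G anchor, which ties each match to the end of the previous one. This is a different technique from the loop above, not the poster's method; the class name LeadingGroups and the method leadingGroups are hypothetical. Unlike the ^ version, this sketch also tolerates whitespace before the first group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LeadingGroups {
    // \G matches at the start of the input on the first find() and afterwards
    // only where the previous match ended, so the scan stops at the first gap.
    static final Pattern NEXT_GROUP =
            Pattern.compile("\\G\\s*(\\([0-9a-zA-Z]+\\))");

    // Hypothetical helper: collects the bracketed groups that open the line.
    static List<String> leadingGroups(String line) {
        List<String> groups = new ArrayList<>();
        Matcher m = NEXT_GROUP.matcher(line);
        while (m.find()) {
            groups.add(m.group(1)); // group 1 leaves out the skipped whitespace
        }
        return groups;
    }

    public static void main(String[] args) {
        String line = "(a) (1) (c) (xii) (A) (12) (ii) and the good news (1)";
        System.out.println(leadingGroups(line));
    }
}
```

The trailing (1) is not collected, because ordinary text separates it from the run of groups at the start of the line.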

1) Your regex is incorrect. You want to match individual groups of letters/numbers in brackets, but the current regex matches a run of one or more such groups as a single string. I.e. it will match

(abc) (def) (123)

as a single group rather than three separate groups.
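This can be checked with a small sketch. It assumes the question's pattern with its backslashes doubled for a Java string literal; the class name SingleMatchDemo and the method firstMatch are made up for illustration. In Java, the outer square brackets form one character class (parentheses, letters, digits, a literal + and whitespace), so the trailing + consumes the whole run.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SingleMatchDemo {
    // The question's pattern, escaped for a Java string literal.
    static final Pattern ORIGINAL =
            Pattern.compile("^[\\([0-9a-zA-Z]+\\)\\s]+");

    // Hypothetical helper: returns the first match, or null if there is none.
    static String firstMatch(String text) {
        Matcher m = ORIGINAL.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Prints the entire input as one match: (abc) (def) (123)
        System.out.println(firstMatch("(abc) (def) (123)"));
    }
}
```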

A better regex, which matches a single group up to its closing bracket, would be

\([0-9a-zA-Z]+\)

2) Java requires you to escape every backslash in a string literal with another backslash.

3) The split() method will not do what you want. It finds all matches in your string, throws them away, and returns an array of what is left over. You want to use matcher() instead:

Pattern pattern = Pattern.compile("\\([0-9a-zA-Z]+\\)");
Matcher matcher = pattern.matcher("(a) (1) (c) (xii) (A) (12) (ii)");

while (matcher.find()) {
  System.out.println(matcher.group());
}
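To see the difference between split() and find() side by side, here is a minimal sketch. The class name SplitVersusFind and the helpers findAll and leftovers are hypothetical; the Pattern and Matcher calls are the standard java.util.regex ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitVersusFind {
    static final Pattern GROUP = Pattern.compile("\\([0-9a-zA-Z]+\\)");

    // find() walks the input and hands back each match in turn.
    static List<String> findAll(String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = GROUP.matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    // split() treats every match as a delimiter and keeps only the text between them.
    static String[] leftovers(String text) {
        return GROUP.split(text);
    }

    public static void main(String[] args) {
        String text = "(a) (1) (xii) tail";
        System.out.println(findAll(text));                    // the bracketed groups
        System.out.println(Arrays.toString(leftovers(text))); // whitespace and "tail" only
    }
}
```

If the matches need to be kept rather than printed, collecting them into a list as findAll does is one simple option.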
