
Extracting a group from matched String in Java using regex

I have an array of Strings containing values like this:

String [] arr = {"${US.IDX_CA}", "${UK.IDX_IO}", "${NZ.IDX_BO}", "${JP.IDX_TK}", "${US.IDX_MT}", "more-elements-with-completely-different-patterns-which-is-irrelevant"};

I'm trying to extract all the IDX_XX values from this array. So from the array above, I should get IDX_CA, IDX_IO, IDX_BO, etc., using regex in Java.

I wrote the following code:

Pattern pattern = Pattern.compile("(.*)IDX_(\\w{2})");
for (String s : arr) {
    Matcher m = pattern.matcher(s);
    if (m.matches()) {
        String extract = m.group(1);
        System.out.println(extract);
    }
}

But this does not print anything. Can someone please tell me what mistake I am making? Thanks.

Use the following fix:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String[] arr = {"${US.IDX_CA}", "${UK.IDX_IO}", "${NZ.IDX_BO}", "${JP.IDX_TK}", "${US.IDX_MT}", "more-elements-with-completely-different-patterns-which-is-irrelevant"};
Pattern pattern = Pattern.compile("\\bIDX_(\\w{2})\\b");
for (String s : arr) {
    Matcher m = pattern.matcher(s);
    while (m.find()) { // find() scans for the pattern anywhere in s
        System.out.println(m.group(0)); // Get the whole match
        System.out.println(m.group(1)); // Get the 2 chars after IDX_
    }
}

See the Java demo. Output:

IDX_CA
CA
IDX_IO
IO
IDX_BO
BO
IDX_TK
TK
IDX_MT
MT
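Since the goal is to end up with the IDX_XX values rather than just print them, the same loop can collect the whole matches into a List. This is only a minimal sketch: the class name IdxExtractor and the helper extract are hypothetical names chosen for illustration, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IdxExtractor {
    // Same pattern as in the fix: IDX_ plus exactly two word chars, between word boundaries.
    private static final Pattern IDX = Pattern.compile("\\bIDX_(\\w{2})\\b");

    // Hypothetical helper: returns every whole match found in the inputs, in order.
    public static List<String> extract(String[] inputs) {
        List<String> result = new ArrayList<>();
        for (String s : inputs) {
            Matcher m = IDX.matcher(s);
            while (m.find()) {
                result.add(m.group()); // group() is equivalent to group(0)
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] arr = {"${US.IDX_CA}", "${UK.IDX_IO}", "no-match-here"};
        System.out.println(extract(arr)); // [IDX_CA, IDX_IO]
    }
}
```

Elements that do not contain the pattern simply contribute nothing to the list, so no separate filtering step is needed.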

NOTES:

  • Use the \\bIDX_(\\w{2})\\b pattern: it matches IDX_ followed by 2 word chars, enclosed in word boundaries, and captures the 2 chars after IDX_ into Group 1
  • m.matches() requires the whole string to match, so it is replaced with m.find(), which looks for a match anywhere in the string
  • if is replaced with while in case there is more than one match in a string
  • m.group(0) contains the whole match value
  • m.group(1) contains the Group 1 value.
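The difference between matches() and find(), and between the groups, can be seen on a single input. The sketch below is illustrative only: the class name MatchesVsFind and the constant names QUESTION and ANSWER are hypothetical. It also shows that, with the pattern from the question, Group 1 holds the prefix before IDX_ rather than the two-letter code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchesVsFind {
    // Pattern from the question and pattern from the fix (constant names are illustrative).
    static final Pattern QUESTION = Pattern.compile("(.*)IDX_(\\w{2})");
    static final Pattern ANSWER = Pattern.compile("\\bIDX_(\\w{2})\\b");

    public static void main(String[] args) {
        String input = "${US.IDX_CA}";

        // matches() must consume the entire input, but nothing in the
        // question's pattern accounts for the trailing "}".
        System.out.println(QUESTION.matcher(input).matches()); // false

        // With find(), the question's pattern does match, but Group 1 is the
        // prefix captured by (.*) and the two letters are in Group 2.
        Matcher q = QUESTION.matcher(input);
        if (q.find()) {
            System.out.println(q.group(1)); // ${US.
            System.out.println(q.group(2)); // CA
        }

        // The fixed pattern with find(): Group 0 is the whole match, Group 1 the two letters.
        Matcher a = ANSWER.matcher(input);
        if (a.find()) {
            System.out.println(a.group(0)); // IDX_CA
            System.out.println(a.group(1)); // CA
        }
    }
}
```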
