Regular expression to parse option string

I'm using the Java matcher to try and match the following:

@tag TYPE_WITH_POSSIBLE_SUBTYPE -PARNAME1=PARVALUE1 -PARNAME2=PARVALUE2: MESSAGE

The TYPE_WITH_POSSIBLE_SUBTYPE consists of letters with periods.

Every parameter name has to consist of letters, and every value has to consist of digits and/or letters. There can be 0 or more parameters. Immediately after the last parameter value comes a colon, then a space, and the remainder is considered the message.

Each part needs to end up in its own capture group.

My current regexp (as a Java literal) is:

(@tag)[\\s]+?([\\w\\.]*?)[\\s]*?(-.*=.*)*?[\\s]*?[:](.*)

However, I keep getting all the parameters as one group. How do I get each as a separate group, if it is even possible?

I don't work that much with regexps, so I always mess something up.

If you want to capture each parameter separately, you have to have a capture group for each one. Of course, you can't do that because you don't know how many parameters there will be. I recommend a different approach:

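Pattern p = Pattern.compile("@tag\\s+([^:]++):\\s*(.*)");
Matcher m = p.matcher(s);
if (m.find())
{
  // group(1) holds the type and all parameters, separated by whitespace
  String[] parts = m.group(1).split("\\s+");
  for (String part : parts)
  {
    System.out.println(part);
  }
  // groups can only be read after a successful match, so this stays inside the if
  System.out.printf("message: %s%n", m.group(2));
}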

The first element in the array is your TYPE name and the rest (if there are any more) are the parameters.
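As a rough sketch of how the split approach could be turned into something reusable, the class below collects the type, the parameters, and the message into one object. The names TagLineParser, ParsedTag, and parse are made up for illustration and are not from the original answer; the per-parameter pattern assumes names are ASCII letters and values are ASCII letters or digits, as described in the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagLineParser {
    // Everything between "@tag" and the first colon is the header (type plus parameters).
    private static final Pattern LINE = Pattern.compile("@tag\\s+([^:]++):\\s*(.*)");
    // One parameter token: -NAME=VALUE (assumed: letters for the name, letters/digits for the value).
    private static final Pattern PARAM = Pattern.compile("-([A-Za-z]+)=([A-Za-z0-9]+)");

    // Hypothetical holder type, introduced only for this example.
    static final class ParsedTag {
        final String type;
        final Map<String, String> params;
        final String message;

        ParsedTag(String type, Map<String, String> params, String message) {
            this.type = type;
            this.params = params;
            this.message = message;
        }
    }

    // Returns null when the line does not match or a parameter token is malformed.
    static ParsedTag parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        String[] parts = m.group(1).trim().split("\\s+");
        Map<String, String> params = new LinkedHashMap<>(); // keeps parameter order
        for (int i = 1; i < parts.length; i++) {
            Matcher pm = PARAM.matcher(parts[i]);
            if (!pm.matches()) {
                return null;
            }
            params.put(pm.group(1), pm.group(2));
        }
        return new ParsedTag(parts[0], params, m.group(2));
    }

    public static void main(String[] args) {
        ParsedTag t = parse("@tag some.type -size=10 -mode=fast: hello world");
        // prints: some.type {size=10, mode=fast} hello world
        System.out.println(t.type + " " + t.params + " " + t.message);
    }
}
```

Returning null on a malformed line keeps the sketch short; throwing an exception or returning an Optional would work just as well.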

Try this out (you will need to double each backslash to use it in a Java string literal). The type group uses [\w.] so that periods in the type name are matched:

(@tag)\s*([\w.]*)\s*(-\w*=\w*\s*)*:(.*)
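One thing to keep in mind with a pattern like this: in java.util.regex, a capture group that sits under a * quantifier retains only the text of its last iteration, so group 3 ends up holding just the final parameter. A possible workaround, sketched below, is to capture the whole parameter run as one group and then walk it with a second Matcher and find(). The class and method names are hypothetical, and \w is looser than the question's rules (it also accepts digits and underscores in names).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedGroupDemo {
    // The pattern with a quantified capture group, written as a Java literal.
    private static final Pattern REPEATED =
        Pattern.compile("(@tag)\\s*([\\w.]*)\\s*(-\\w*=\\w*\\s*)*:(.*)");

    // Group 3 wraps the entire parameter run; the repeated inner group is non-capturing.
    private static final Pattern LINE =
        Pattern.compile("(@tag)\\s+([\\w.]+)((?:\\s+-\\w+=\\w+)*)\\s*:\\s*(.*)");
    private static final Pattern PARAM = Pattern.compile("-(\\w+)=(\\w+)");

    // Shows what the quantified group holds after a match: only its last iteration.
    static String lastIterationOnly(String line) {
        Matcher m = REPEATED.matcher(line);
        return m.matches() ? m.group(3) : null;
    }

    // Extracts every NAME=VALUE pair by scanning the captured parameter run.
    static List<String> parameters(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = LINE.matcher(line);
        if (m.matches()) {
            Matcher pm = PARAM.matcher(m.group(3));
            while (pm.find()) {
                out.add(pm.group(1) + "=" + pm.group(2));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "@tag a.b -x=1 -y=2: msg";
        System.out.println(lastIterationOnly(line)); // prints: -y=2
        System.out.println(parameters(line));        // prints: [x=1, y=2]
    }
}
```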

By the way, I highly recommend RegexPal to help you build regular expressions. RegexBuddy is even better; it's well worth the $40 if you plan on doing a lot of work with regular expressions in the future.
