
Parsing a string with a regex in Java

I am using Java and need to parse the following line with a regex:

<actions>::=<action><action>|X|<game>|alpha

It should give me the tokens <action>, <action>, X and <game>.

What kind of regex will work?

I was trying something like "<[a-zA-Z]>", but that doesn't handle X or alpha.

You can try something like this:

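String str = "<actions>::=<action><action>|X|<game>|alpha";
// Keep only the right-hand side of the rule (the part after "::=").
str = str.split("::=")[1];
// Match either <...> or a token enclosed in pipes, such as |X|.
Pattern pattern = Pattern.compile("<.*?>|\\|.*?\\|");
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
    // Strip the surrounding pipes so that |X| is printed as X.
    System.out.println(matcher.group().replace("|", ""));
}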
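
If the right-hand side may hold any number of tokens, another option is a single pattern that matches either a bracketed name or a bare word, applied repeatedly with find(). The following is only a minimal sketch under the assumption that every token is either <name> or a run of word characters; the class name RuleTokenizer and the method tokenize are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RuleTokenizer {
    // Either a bracketed name such as <action>, or a bare word such as X or alpha.
    private static final Pattern TOKEN = Pattern.compile("<[^<>]+>|\\w+");

    // Hypothetical helper: returns the tokens found after "::=".
    static List<String> tokenize(String rule) {
        // Keep only the right-hand side; fall back to the whole string if "::=" is absent.
        int sep = rule.indexOf("::=");
        String rhs = sep >= 0 ? rule.substring(sep + 3) : rule;

        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(rhs);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("<actions>::=<action><action>|X|<game>|alpha"));
        // prints [<action>, <action>, X, <game>, alpha]
    }
}
```

Note that this also returns alpha. The question lists only four tokens, so drop the last element if alpha is not wanted.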

You should have something like this:

String input = "<actions>::=<action><action>|X|<game>|alpha";
Matcher matcher = Pattern.compile("(<[^>]+>)(<[^>]+>)\\|([^|]+)\\|(<[^|]+>)").matcher(input);
while (matcher.find()) {
    System.out.println(matcher.group().replaceAll("\\|", ""));
}

You didn't specify whether you want alpha returned; as written, this regex does not return it.

You can include alpha by appending \\|\\w+ to the end of the regex I wrote.

Without that addition, the code prints:

<action><action>X<game>

It is not clear from the question whether the < and > characters are literally present in the input; I'll assume they are.

String pattern = "<actions>::=<(.*?)><(.+?)>\\|(.+)\\|<(.*?)>\\|alpha";

In Java you can use Pattern and Matcher. Here is the basic idea:

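   Pattern p = Pattern.compile(pattern, Pattern.DOTALL | Pattern.MULTILINE);
   Matcher m = p.matcher(text); // text holds the input line
   if (m.find()) { // only read groups after a successful match
      for (int g = 1; g <= m.groupCount(); g++) {
         String token = m.group(g); // use your four groups here
      }
   }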

You can use the following Java regex:

Pattern pattern = Pattern.compile
       ("::=(<[^>]+>)(<[^>]+>)\\|([^|]+)\\|(<[^>]+>)\\|(\\w+)$");
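
To show how the capture groups of this pattern might be read, here is a small self-contained sketch. It assumes the input always has exactly the shape shown in the question; the class name GroupedRuleMatch and the method groups are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupedRuleMatch {
    // The pattern from the answer above: two bracketed names, a bare token,
    // a bracketed name, and a trailing word, separated by pipes.
    private static final Pattern RULE = Pattern.compile(
            "::=(<[^>]+>)(<[^>]+>)\\|([^|]+)\\|(<[^>]+>)\\|(\\w+)$");

    // Hypothetical helper: returns the captured groups, or an empty array if nothing matches.
    static String[] groups(String input) {
        Matcher m = RULE.matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        String[] out = new String[m.groupCount()];
        for (int g = 1; g <= m.groupCount(); g++) {
            out[g - 1] = m.group(g); // group 0 is the whole match, so start at 1
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "<actions>::=<action><action>|X|<game>|alpha";
        System.out.println(String.join(", ", groups(input)));
        // prints <action>, <action>, X, <game>, alpha
    }
}
```

Because the pattern fixes the number and order of tokens, it suits input with a known layout; a rule with more or fewer alternatives would simply not match.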
