Grouping regular expression

Question

Here is my question:

I have a very long string containing many values delimited by different tags. The values include Chinese and English words and digits.

I want to split it up using a specific pattern. For example, I want to find the pattern <f"number">xxxx<f"number">, where xxxx can be Chinese, English, digits, or any other symbol except "<" and ">", since those two symbols identify the tags.

However, the pattern behaves strangely. It does not seem to treat the first two <f"number"> tags as a match; it only stops at the last one:

String a = "<f\"number\">4  <f\"number\"><f$n0>14   <h85><f$n0>4    <f$n0>2 <f$n0>2 7   -<f\"Times-Roman\">7<f\"number\">";
Pattern p = Pattern.compile("<f\"number\">[\\P{sc=Han}*\\p{sc=Han}*[a-z]*[A-Z]*[0-9]*^<>]*<f\"number\">");
Matcher m = p.matcher(a);

while(m.find()){
    System.out.println(m.group());
}

The output is the same as my String a.

Answer 1

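The character class [\\P{sc=Han}*\\p{sc=Han}*[a-z]*[A-Z]*[0-9]*^<>]* matches zero or more of any character, because \\P{sc=Han} and \\p{sc=Han} are complements: every character belongs to one or the other. Inside a character class, the * quantifiers and the ^ (when it is not the first character) are also just literal characters, so they do not repeat or negate anything.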
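To see this for yourself, here is a minimal sketch that applies the character class from the question to a few inputs. The class name CharClassDemo and the helper matchesWhole are made up for this illustration.

```java
import java.util.regex.Pattern;

public class CharClassDemo {
    // The character class from the question, repeated with * so that
    // matches() succeeds only if every character of the input is in the class.
    static final Pattern QUESTION_CLASS = Pattern.compile(
        "[\\P{sc=Han}*\\p{sc=Han}*[a-z]*[A-Z]*[0-9]*^<>]*");

    // Hypothetical helper: true if the whole input consists of characters in the class.
    static boolean matchesWhole(String s) {
        return QUESTION_CLASS.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Tags, digits and spaces are all accepted, including '<' and '>'.
        System.out.println(matchesWhole("<f\"number\">4  <f\"number\">")); // true
        // Han characters (written as Unicode escapes here) are accepted too.
        System.out.println(matchesWhole("\u4e2d\u6587 abc 123 *^"));       // true
    }
}
```

Since the class accepts "<" and ">" as well, the greedy * runs across every tag and only backtracks as far as the last <f"number">, which is why the whole string comes back as one match.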

I guess you want:

Pattern p = Pattern.compile("<f\"number\">[\\p{sc=Han}a-zA-Z0-9]*<f\"number\">");

You may want to add spaces:

Pattern p = Pattern.compile("<f\"number\">[\\p{sc=Han}a-zA-Z0-9\\s]*<f\"number\">");

or:

Pattern p = Pattern.compile("<f\"number\">[^<]*<f\"number\">");
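Here is a rough, self-contained sketch of how that last idea could be used to pull out just the value between the tags. It makes a few assumptions of its own: the class name NumberTagDemo and the method extractValues are invented for the example, it excludes ">" as well as "<" to match the requirement in the question, and it adds a capturing group so that group(1) returns the value without the surrounding tags.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NumberTagDemo {
    // [^<>]* matches any run of characters containing neither '<' nor '>',
    // so the match cannot run across another tag.
    // The parentheses capture only the value between the two tags.
    static final Pattern NUMBER_VALUE =
        Pattern.compile("<f\"number\">([^<>]*)<f\"number\">");

    // Hypothetical helper: collects every value found between two <f"number"> tags.
    static List<String> extractValues(String input) {
        List<String> values = new ArrayList<>();
        Matcher m = NUMBER_VALUE.matcher(input);
        while (m.find()) {
            values.add(m.group(1)); // group(1) is the text inside the parentheses
        }
        return values;
    }

    public static void main(String[] args) {
        String a = "<f\"number\">4  <f\"number\"><f$n0>14   <h85><f$n0>4    <f$n0>2 <f$n0>2 7   -<f\"Times-Roman\">7<f\"number\">";
        // Only the first pair of tags has no other tag in between,
        // so a single value (the 4 with its trailing spaces) is found.
        System.out.println(extractValues(a));
    }
}
```

Note that each match consumes its closing <f"number">, so that tag cannot also serve as the opening tag of the next match. If you need that behaviour, a lookahead such as (?=<f\"number\">) in place of the final tag might be worth trying.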

solution1
2 2017-01-16 13:23:13