I have the following HTML:
<tr><td><font color="#306eff">P: </font>9283-1000<font color="#306eff">
or, with a newline before the second font tag:
<tr><td><font color="#306eff">P: </font>9283-1000
<font color="#306eff">
I went to regexpal.com and entered the following regex:
P: </font>(.*?)<font
And it matches. But when I do it in Java, it doesn't match:
Pattern rP = Pattern.compile(">P: </font>(.*?)<font");
Matcher mP = rP.matcher(data);
if (mP.find()) {
    System.out.println(mP.group(1).trim());
}
I have tried several other regexes on different occasions, and they also fail to match in Java. Any suggestions? Thanks!
Your code works fine for me.
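import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        // Single-line input: no newline between the number and the next <font tag.
        String data = "<tr><td><font color=\"#306eff\">P: </font>9283-1000<font color=\"#306eff\"> ";
        Pattern rP = Pattern.compile(">P: </font>(.*?)<font");
        Matcher mP = rP.matcher(data);
        if (mP.find()) {
            System.out.println(mP.group(1).trim());
        }
    }
}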
This prints: 9283-1000.
I guess the problem may be in how data is fed into the program, because the code itself is OK, as you can see from this output.
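One way to check how the data actually arrives is to print it with the line breaks made visible. The following is only a sketch: the class name InspectInput, the helper showLineBreaks, and the sample string are hypothetical and not part of the original code.

```java
public class InspectInput {
    // Hypothetical helper: replaces CR and LF with the visible text \r and \n
    // so that hidden line breaks show up in console output.
    static String showLineBreaks(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // Assumed sample: the number is followed by a Windows-style line break.
        String data = "P: </font>9283-1000\r\n<font color=\"#306eff\">";
        System.out.println(showLineBreaks(data));
    }
}
```

If a \n or \r\n shows up between the number and the following <font tag, the input corresponds to the second form in the question rather than the first.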
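The dot does not match newline characters by default, so (.*?) cannot cross the line break in the second form of your input.
Use Pattern rP = Pattern.compile(">P: </font>(.*?)<font", Pattern.DOTALL);
Reference: the Javadoc for Pattern.DOTALL.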
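A minimal sketch of the difference, using both input forms from the question. The class name DotAllDemo and the helper extract are made up for illustration; only Pattern, Matcher, and Pattern.DOTALL come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DotAllDemo {
    // Hypothetical helper: returns the trimmed first capture group,
    // or null when the pattern does not match at all.
    static String extract(String data, int flags) {
        Pattern p = Pattern.compile(">P: </font>(.*?)<font", flags);
        Matcher m = p.matcher(data);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String oneLine = "<tr><td><font color=\"#306eff\">P: </font>9283-1000<font color=\"#306eff\">";
        String twoLines = "<tr><td><font color=\"#306eff\">P: </font>9283-1000\n<font color=\"#306eff\">";

        System.out.println(extract(oneLine, 0));               // 9283-1000
        System.out.println(extract(twoLines, 0));              // null: the dot stops at the newline
        System.out.println(extract(twoLines, Pattern.DOTALL)); // 9283-1000
    }
}
```

With DOTALL the captured group includes the trailing newline, which is why the trim() call is still needed.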
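Try this regex instead. It sets the flags inline: i (case-insensitive), m (multiline), and s (dotall, which lets the dot match newlines):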
(?ims).*?>P: </font>(.*?)<font.+
Sample code
public static void main(String[] args) {
    String data = "<tr><td><font color=\"#306eff\">P: </font>9283-1000<font color=\"#306eff\"> ";
    Pattern rP = Pattern.compile("(?ims).*?>P: </font>(.*?)<font.+");
    Matcher mP = rP.matcher(data);
    if (mP.find()) {
        System.out.println(mP.group(1).trim());
    }
}
Output
9283-1000
Try this:
String data = "<tr><td><font color=\"#306eff\">P: </font>9283-1000<font color=\"#306eff\"> ";
Pattern rP = Pattern.compile(">P: </font>(.*?)<font");
Matcher mP = rP.matcher(data);
if (mP.find()) {
    System.out.println(mP.group(1).trim());
}
In Java, the only difference is the escaping: double quotes inside a string literal must be written as \".