I have a JSON feed containing HTML that is used to populate a calendar, and I need to retrieve some of the information from it, for example the title, time, and location. I wanted to use a regex to get the content between
<span class=\"title\">
and
<\/span><br/><b>
and I am trying to use this code
for(int i = 0; i < json.length(); i++)
{
JSONObject object = new JSONObject(json.getJSONObject(i));
System.out.println(object.getNames(object));
Pattern p = Pattern.compile("(?i)(<span class=\"title\">)(.+?)(<\\/span>)");
Matcher m = p.matcher(json.get(0).toString());
m.find();
System.out.println(m.group(0));
But it doesn't seem to do the job. I have tried multiple iterations and researched examples online, but I am not sure whether I am doing something wrong in the regex syntax. Help would be appreciated.
{"hoverContent":"<b>Title: <\/b><span class=\"title\">Accounting Awareness<\/span><br/><b>Time: <\/b><span class=\"time\">5:30 PM - 6:30 PM<br/><b>Location: <\/b><span class=\"location\">1185 Grainger Hall<\/span><br/><b>Description: <\/b><br/><span class=\"description\">Information from Kristen Fuhremann, Director of Professional Programs in Accounting and Q&A from a panel of current and former students who will share their experiences in the accounting program. Panel includes a grad of the IMAcc program currently in law school, a candidate for the IMAcc program who studied abroad, an accounting and finance double major, and an IMAcc student who is also a TA for AIS 100. Casual Attire is appropriate.<br />Contact: Natalie Dickson, <a href=\"mailto:ndickson@wisc.edu\">ndickson@wisc.edu<\/a><\/span><br/>","title":"Accounting Awareness","start":"2013-09-30 17:30:00","allDay":false,"itemId":"2356754a-8178-4afd-b4cf-7f5f5ce89868","end":"2013-09-30 18:30:00"}
null
m.group(0) always returns the entire string that matches the regex. It looks like you want to return a particular group, so you need to use m.group(1) to get the text that matches the first group, m.group(2) for the second group, and so on. In this regex:
"(?i)(<span class=\"title\">)(.+?)(<\\/span>)"
anything in parentheses, except for things that begin with (?, counts as a group, so the portion in (.+?) is the second capture group, and you can try retrieving it with m.group(2). In this case, there's no need to put the <span stuff in parentheses, so you could say
"(?i)<span class=\"title\">(.+?)<\\/span>"
and now use m.group(1) to get at the first (and only) capture group.
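To make the difference between group(0) and group(1) concrete, here is a minimal sketch using only java.util.regex. The class name GroupDemo and the helper extractTitle are hypothetical names for illustration. It assumes the HTML has already been decoded from the JSON (for example with object.getString("hoverContent")), so the string contains plain quotes and slashes rather than the backslash-escaped forms that appear in the serialized feed. It also checks the result of find() first, because calling group() after a failed find() throws IllegalStateException.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Only the text between the tags is wrapped in parentheses,
    // so it becomes capture group 1.
    private static final Pattern TITLE =
            Pattern.compile("(?i)<span class=\"title\">(.+?)</span>");

    // Hypothetical helper: returns the title text, or null if nothing matches.
    public static String extractTitle(String html) {
        Matcher m = TITLE.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Assumed input: the decoded hoverContent value, not the raw JSON text.
        String html = "<b>Title: </b><span class=\"title\">Accounting Awareness</span><br/>";

        Matcher m = TITLE.matcher(html);
        if (m.find()) {
            // group(0) is the whole match, tags included:
            // <span class="title">Accounting Awareness</span>
            System.out.println(m.group(0));
            // group(1) is only the captured text: Accounting Awareness
            System.out.println(m.group(1));
        } else {
            System.out.println("no match");
        }
    }
}
```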
Using a regexp to parse markup is not really a good idea from a design standpoint. I would personally just wrap the content in a fake root tag and parse it using an XML parser. There will be some overhead, but you don't use a regexp to parse JSON, right? Why not do the same for XML?
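A rough sketch of the wrap-and-parse idea, using the JDK's built-in DOM parser. The class FragmentParser, the method spanText, and the wrapper element name root are all hypothetical. It only works when the fragment is well-formed XML. The sample item in the question would not pass as-is, since its time span is never closed and the description contains a bare & in "Q&A", so a strict XML parser would reject it. The example therefore uses a shortened, well-formed fragment.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FragmentParser {
    // Hypothetical helper: returns the text of the first <span> whose class
    // attribute equals cssClass, or null if there is none.
    public static String spanText(String fragment, String cssClass) {
        // A fake root element turns the fragment into a single XML document.
        String wrapped = "<root>" + fragment + "</root>";
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(wrapped)));
            NodeList spans = doc.getElementsByTagName("span");
            for (int i = 0; i < spans.getLength(); i++) {
                Element span = (Element) spans.item(i);
                if (cssClass.equals(span.getAttribute("class"))) {
                    return span.getTextContent();
                }
            }
            return null;
        } catch (Exception e) {
            // Malformed fragments (unclosed tags, bare '&') end up here.
            throw new IllegalArgumentException("fragment is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // Shortened, well-formed stand-in for the decoded hoverContent value.
        String fragment =
                "<b>Title: </b><span class=\"title\">Accounting Awareness</span><br/>"
              + "<b>Location: </b><span class=\"location\">1185 Grainger Hall</span><br/>";
        System.out.println(spanText(fragment, "title"));    // Accounting Awareness
        System.out.println(spanText(fragment, "location")); // 1185 Grainger Hall
    }
}
```

For real-world HTML that is not well-formed, a lenient HTML parser is usually a better fit than a strict XML one.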
Try this regex with DOTALL mode, which also avoids the redundant escaping:
Pattern p = Pattern.compile("(?si)<span class=\"title\">(.+?)</span>");
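As a small illustration of what the s flag changes: by default . does not match line terminators, so a span whose text contains a newline is only matched when DOTALL is on. The class DotallDemo, the helper findsTitle, and the multi-line input are hypothetical. The title in the sample feed itself sits on a single line.

```java
import java.util.regex.Pattern;

public class DotallDemo {
    // Hypothetical helper: applies the given inline flags, e.g. "(?i)" or "(?si)",
    // to the title pattern and reports whether it matches anywhere in html.
    public static boolean findsTitle(String flags, String html) {
        Pattern p = Pattern.compile(flags + "<span class=\"title\">(.+?)</span>");
        return p.matcher(html).find();
    }

    public static void main(String[] args) {
        // Made-up input with a line break inside the span.
        String html = "<span class=\"title\">Accounting\nAwareness</span>";

        // Without DOTALL, '.' stops at the newline, so there is no match.
        System.out.println(findsTitle("(?i)", html));  // false
        // With DOTALL, '.' also matches the newline.
        System.out.println(findsTitle("(?si)", html)); // true
    }
}
```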