I am using Java's Pattern and Matcher classes to extract the text between two tags.
My code looks like this:
final Pattern pattern = Pattern.compile("<([A-Za-z][A-Za-z0-9]*)\\b[^>]*>(.*?)</\\1>");
List<String> topicArray = new ArrayList<String>();
final Matcher matcher = pattern.matcher("<City count='1' relevance='0.304' normalized='Shanghai,China'>Shanghai</City>");
while (matcher.find()) {
    topicArray.add(matcher.group(1));
}
This only gives me City as output instead of Shanghai. What is wrong with it?
Thanks
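In your pattern, group 1 is the tag name (City), which is why matcher.group(1) returns City. The text between the tags is captured by the second group, so matcher.group(2) returns Shanghai. Alternatively, you can try the following simpler pattern: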
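import java.util.regex.Pattern;

public class Main {
    // Group 1 captures the text between an opening and a closing tag.
    private static final Pattern REGEX_PATTERN =
            Pattern.compile("<[^>]*>([^<>]*)<[^>]*>");

    public static void main(String[] args) {
        String input = "<City count='1' relevance='0.304' normalized='Shanghai,China'>Shanghai</City>";
        // Replace each tagged element with its inner text.
        System.out.println(
                REGEX_PATTERN.matcher(input).replaceAll("$1")
        ); // prints Shanghai
    }
}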
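If you would rather keep the original pattern, a minimal sketch of the fix is to read the second capturing group instead of the first. The class name TagTextExtractor and the helper extractTagText are hypothetical names chosen for this illustration; the pattern itself is the one from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagTextExtractor {
    // Pattern from the question: group 1 is the tag name,
    // group 2 is the text between the opening and closing tags.
    private static final Pattern TAG_PATTERN =
            Pattern.compile("<([A-Za-z][A-Za-z0-9]*)\\b[^>]*>(.*?)</\\1>");

    // Hypothetical helper: collects the inner text of every matched element.
    static List<String> extractTagText(String input) {
        List<String> texts = new ArrayList<>();
        Matcher matcher = TAG_PATTERN.matcher(input);
        while (matcher.find()) {
            texts.add(matcher.group(2)); // group(2), not group(1)
        }
        return texts;
    }

    public static void main(String[] args) {
        String input = "<City count='1' relevance='0.304' normalized='Shanghai,China'>Shanghai</City>";
        System.out.println(extractTagText(input)); // prints [Shanghai]
    }
}
```

Capturing groups are numbered by the position of their opening parenthesis, counting from the left, so the tag name in the first pair of parentheses is group 1 and the lazily matched content is group 2.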