Use Java Pattern to extract the text between HTML tags with attributes
I am using Java's Pattern and Matcher to extract the text between two tags.
My code looks like this:
final Pattern pattern = Pattern.compile("<([A-Za-z][A-Za-z0-9]*)\\b[^>]*>(.*?)</\\1>");
List<String> topicArray = new ArrayList<String>();
final Matcher matcher = pattern.matcher("<City count='1' relevance='0.304' normalized='Shanghai,China'>Shanghai</City>");
while (matcher.find()) {
    topicArray.add(matcher.group(1));
}
The system only gives me City as output instead of Shanghai. What's wrong with it?
Thanks
You can try the following:
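import java.util.regex.Pattern;

public class Main {
    // Group 1 captures the text between an opening tag and the tag that follows it.
    private static final Pattern REGEX_PATTERN =
            Pattern.compile("<[^>]*>([^<>]*)<[^>]*>");

    public static void main(String[] args) {
        String input = "<City count='1' relevance='0.304' normalized='Shanghai,China'>Shanghai</City>";
        // Replace each matched tag pair with the text it encloses.
        System.out.println(
                REGEX_PATTERN.matcher(input).replaceAll("$1")
        ); // prints "Shanghai"
    }
}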
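As for why the code in the question prints City: capture groups are numbered by the position of their opening parenthesis, so in that pattern group 1 is the tag name (the part the `\\1` backreference reuses) and group 2 is the text between the tags. Reading `matcher.group(2)` instead of `matcher.group(1)` should be enough. Below is a minimal sketch of that change; the class name `TagTextExtractor`, the constant `TAG_PATTERN`, and the helper `extractInnerText` are made up for illustration and are not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagTextExtractor {
    // Same pattern as in the question:
    //   group 1 = tag name, group 2 = text between the opening and closing tag.
    private static final Pattern TAG_PATTERN =
            Pattern.compile("<([A-Za-z][A-Za-z0-9]*)\\b[^>]*>(.*?)</\\1>");

    // Hypothetical helper: collects the enclosed text of every matched element.
    static List<String> extractInnerText(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = TAG_PATTERN.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(2)); // group(1) would be the tag name, e.g. "City"
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "<City count='1' relevance='0.304' normalized='Shanghai,China'>Shanghai</City>";
        System.out.println(extractInnerText(input)); // prints "[Shanghai]"
    }
}
```

Unlike the `replaceAll` approach, this keeps the `find()` loop from the question, so it returns one entry per element and ignores any text outside the tags. Neither regex handles nested elements with the same name; for real HTML or XML an actual parser is usually the safer choice.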