Regex pattern for finding the string between two characters, stopping at the first occurrence of the second character
I want a regex that finds the string between two delimiters, but only from the start delimiter to the first occurrence of the end delimiter.
I want to extract the story from lines of the following format:
<metadata name="user" story="{some_text_here}" \/>
So I want to extract only:
{some_text_here}
For that I am using the following regex:
<metadata name="user" story="(.*)" \/>
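And the Java code:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StoryReader {
    public static void main(String[] args) throws IOException {
        // Quotes and the backslash must be escaped inside a Java string literal.
        String regexString = "<metadata name=\"user\" story=\"(.*)\" \\/>";
        String filePath = "C:\\Desktop\\temp\\test.txt";
        Pattern p = Pattern.compile(regexString);
        Matcher m;
        try (BufferedReader br = new BufferedReader(new FileReader(filePath))) {
            String line;
            // Print the captured story for every line that matches.
            while ((line = br.readLine()) != null) {
                m = p.matcher(line);
                if (m.find()) {
                    System.out.println(m.group(1));
                }
            }
        }
    }
}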
This regex mostly works fine, but surprisingly, if the line is:
<metadata name="user" story="My name is Nick" extraStory="something" />
running the code prints
My name is Nick" extraStory="something
whereas I only want to get My name is Nick.
I also want to make sure that there is nothing else between story="My name is Nick" and the closing />.
<metadata name="user" story="([^"]*)" \/>
[^"]* matches any run of characters other than ", so the capture group stops at the first closing quote. Because the pattern still requires " /> right after the story value, the string
<metadata name="user" story="My name is Nick" extraStory="something" />
will not be matched.
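To see both behaviours side by side, here is a minimal sketch that applies the pattern above to the two sample lines from the question. The class name StoryExtractor and the helper extractStory are made up for this illustration, and the slash is written without the backslash because / needs no escaping in a Java regex.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question or answer.
class StoryExtractor {
    // [^"]* cannot cross a double quote, so group 1 ends at the first closing quote.
    // The literal " /> that follows rejects lines carrying extra attributes.
    private static final Pattern STORY =
            Pattern.compile("<metadata name=\"user\" story=\"([^\"]*)\" />");

    // Returns the story text, or an empty Optional when the line does not match.
    static Optional<String> extractStory(String line) {
        Matcher m = STORY.matcher(line);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String plain = "<metadata name=\"user\" story=\"My name is Nick\" />";
        String extra = "<metadata name=\"user\" story=\"My name is Nick\" extraStory=\"something\" />";
        System.out.println(extractStory(plain)); // prints Optional[My name is Nick]
        System.out.println(extractStory(extra)); // prints Optional.empty
    }
}
```

Returning an Optional rather than printing inside the loop keeps the "no match" case explicit, which is the case the second requirement in the question cares about.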
The following XPath should solve your problem:
//metadata[@name='user' and @story and count(@*) = 2]/@story
It addresses the story attribute of any metadata node in the document whose name attribute is user, and which also has a story attribute but no others (the attribute count is 2).
(Note: //metadata[@name='user' and count(@*)=2]/@story would be enough, since it is impossible to address the story attribute of a metadata node whose second attribute isn't story.)
In Java code, supposing you are handling an instance of org.w3c.dom.Document and already have an instance of XPath available, the code would be the following:
xPath.evaluate("//metadata[@name='user' and @story and count(@*) = 2]/@story", xmlDoc);
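For readers who do not yet have the Document and XPath instances, the following is a rough, self-contained sketch of how they might be obtained with the JDK's built-in JAXP classes. The class name StoryXPath, the helper storyOf, and the sample XML wrapped in a root element are assumptions made for this illustration, not part of the original answer.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical wrapper class for illustration only.
class StoryXPath {
    static final String EXPR =
            "//metadata[@name='user' and @story and count(@*) = 2]/@story";

    // Parses the XML text and returns the story value,
    // or an empty string when no node satisfies the expression.
    static String storyOf(String xml) {
        try {
            Document xmlDoc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate(EXPR, xmlDoc);
        } catch (Exception e) {
            // Checked parser and XPath exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String plain = "<root><metadata name=\"user\" story=\"My name is Nick\"/></root>";
        String extra = "<root><metadata name=\"user\" story=\"My name is Nick\" extraStory=\"something\"/></root>";
        System.out.println(storyOf(plain));           // prints My name is Nick
        System.out.println(storyOf(extra).isEmpty()); // prints true
    }
}
```

The two-argument evaluate call returns the string value of the first selected node, so a metadata element with extra attributes simply yields an empty string instead of throwing.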
You can try the XPath here or the Java code here.