Regex pattern to find a string between two characters, stopping at the first occurrence of the second character

Question

I want a regex that finds the string between two characters, but only from the start delimiter to the first occurrence of the end delimiter.

I want to extract story from lines of the following format:

<metadata name="user" story="{some_text_here}" \/>

So I want to extract only: {some_text_here}

For that I am using the following regex:

<metadata name="user" story="(.*)" \/>

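And the Java code:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void main(String[] args) throws IOException {
    // Inner quotes and the backslash are escaped for the Java string literal
    String regexString = "<metadata name=\"user\" story=\"(.*)\" \\/>";
    String filePath = "C:\\Desktop\\temp\\test.txt";
    Pattern p = Pattern.compile(regexString);
    Matcher m;
    try (BufferedReader br = new BufferedReader(new FileReader(filePath))) {
        String line;
        // Test each line of the file and print the captured story
        while ((line = br.readLine()) != null) {
            m = p.matcher(line);
            if (m.find()) {
                System.out.println(m.group(1));
            }
        }
    }
}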

This regex mostly works fine, but surprisingly, if the line is:

<metadata name="user" story="My name is Nick" extraStory="something" />

running the code captures My name is Nick" extraStory="something, whereas I only want to get My name is Nick.

I also want to make sure that there is nothing between story="My name is Nick" and the closing />.

Answer 1

<metadata name="user" story="([^"]*)" \/>

[^"]* matches any run of characters other than ", so the captured group cannot extend past the closing quote of story. In this case the string

<metadata name="user" story="My name is Nick" extraStory="something" />

will not be matched, because something other than " /> follows the closing quote of story.
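As a rough illustration of the point above, the sketch below runs the negated-class pattern against both sample lines and, for comparison only, a reluctant (.*?) variant that is not part of this answer. The class name StoryRegexDemo and the helper extract are made up for the example, and the pattern is written with a plain / instead of \/ (the two are equivalent in a Java regex).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StoryRegexDemo {
    // Negated character class: the capture cannot run past the closing quote.
    public static final Pattern STRICT =
            Pattern.compile("<metadata name=\"user\" story=\"([^\"]*)\" />");
    // Reluctant quantifier, shown for comparison only (hypothetical alternative).
    public static final Pattern LAZY =
            Pattern.compile("<metadata name=\"user\" story=\"(.*?)\" />");

    /** Returns the captured story, or null when the line does not match. */
    public static String extract(Pattern p, String line) {
        Matcher m = p.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String simple = "<metadata name=\"user\" story=\"My name is Nick\" />";
        String extra = "<metadata name=\"user\" story=\"My name is Nick\" extraStory=\"something\" />";

        System.out.println(extract(STRICT, simple)); // My name is Nick
        System.out.println(extract(STRICT, extra));  // null
        System.out.println(extract(LAZY, extra));    // My name is Nick" extraStory="something
    }
}
```

Note that the reluctant variant still captures the extra attribute, because it keeps expanding until it finds a quote that is followed by " />". Only the negated class rejects the line outright, which is what the question asks for.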

Answer 2

The following XPath should solve your problem:

//metadata[@name='user' and @story and count(@*) = 2]/@story

It addresses the story attribute of any metadata node in the document whose name attribute is user and which also has a story attribute but no others (the attribute count is 2).

(Note: //metadata[@name='user' and count(@*)=2]/@story would be enough, since a metadata node whose second attribute isn't story has no story attribute to select.)

In Java, assuming you are handling an instance of org.w3c.dom.Document and already have an instance of XPath available, the code would be the following:

xPath.evaluate("//metadata[@name='user' and @story and count(@*) = 2]/@story", xmlDoc);

You can try the XPath here or the Java code here.
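For a fuller picture, here is a minimal self-contained sketch of the same XPath using only the JDK's built-in XML APIs. It assumes the metadata lines sit inside a well-formed XML document; the root element, the class name StoryXPathDemo and the helper stories are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class StoryXPathDemo {
    /** Returns the story values of metadata nodes that have exactly name="user" and story. */
    public static List<String> stories(String xml) throws Exception {
        // Parse the XML text into a DOM document
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        // Ask for a node set so that every matching attribute is returned
        NodeList nodes = (NodeList) xPath.evaluate(
                "//metadata[@name='user' and @story and count(@*) = 2]/@story",
                doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getNodeValue());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input: the <root> wrapper is an assumption of this sketch
        String xml = "<root>"
                + "<metadata name=\"user\" story=\"My name is Nick\" />"
                + "<metadata name=\"user\" story=\"Other\" extraStory=\"something\" />"
                + "</root>";
        System.out.println(stories(xml)); // [My name is Nick]
    }
}
```

The second metadata node is skipped because it carries three attributes, so count(@*) = 2 fails for it.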

Answer 3

Just use Jsoup. It is the right tool for the problem :).

It's this easy:

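String html; // read the HTML file into this string first
Document document = Jsoup.parse(html);
// attr() returns the value from the first matched element that has the attribute
String story = document.select("metadata[name=user]").attr("story");
System.out.println(story);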

Answer 1
1 accepted 2017-01-25 14:20:13

Answer 2
1 2017-01-25 15:12:41

Answer 3
0 2017-01-25 14:31:35