.NET regular expressions - shorter match

Question

I have a question about .NET regular expressions and how they define matches. I am writing:

var regex = new Regex("<tr><td>1</td><td>(.+)</td><td>(.+)</td>");
if (regex.IsMatch(str))
{
    var groups = regex.Match(str).Groups;
    var matches = new List<string>();
    for (int i = 1; i < groups.Count; i++)
        matches.Add(groups[i].Value);

    return matches;
}

What I want is to get the content of the two following tags. Instead it returns:

 [0]: Cell 1</td><td>Cell 2</td>... [1]: Last row of the table

Why does the first group capture </td> and the rest of the string instead of stopping at the first </td>?

Answer 1

Your regular expression includes

(.+)

which is a greedy match. Greedy matches extend as far as they can before matching the next character (< in your case). Try:

(.+?)

This is a non-greedy match, which extends as little as possible before matching the next character.
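
To make the difference concrete, here is a minimal sketch that runs both quantifiers against the same input. The sample string, the class name GreedyVsLazy and the helper FirstGroup are assumptions made up for illustration; they do not come from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazy
{
    // Hypothetical helper: returns the first capture group of `pattern`
    // applied to `input`, or an empty string if there is no match.
    public static string FirstGroup(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        // Assumed sample input, not the asker's real table.
        string input = "<td>Cell 1</td><td>Cell 2</td>";

        // Greedy: (.+) consumes to the end of the string, then backtracks
        // only as far as the LAST </td>.
        Console.WriteLine(FirstGroup("<td>(.+)</td>", input));
        // prints: Cell 1</td><td>Cell 2

        // Lazy: (.+?) grows one character at a time and stops at the
        // FIRST </td>.
        Console.WriteLine(FirstGroup("<td>(.+?)</td>", input));
        // prints: Cell 1
    }
}
```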

Answer 2

You need to specify lazy matching. Instead of + , use +? to say that as few chars as possible should match.
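
Applied to the code in the question, the change might look like the sketch below. The wrapper names RowParser and ExtractCells are hypothetical, and returning an empty list when nothing matches is an assumption, since the original snippet does not show that branch.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RowParser
{
    // Same pattern as in the question, with +? in place of + so that each
    // group stops at the nearest </td>.
    private static readonly Regex RowRegex =
        new Regex("<tr><td>1</td><td>(.+?)</td><td>(.+?)</td>");

    // Returns the contents of the two cells following the "1" cell.
    // Assumption: an empty list is returned when the row is not found.
    public static List<string> ExtractCells(string str)
    {
        var matches = new List<string>();

        // Run the regex once and check Success, rather than calling
        // IsMatch and then Match.
        Match match = RowRegex.Match(str);
        if (match.Success)
        {
            // Group 0 is the whole match, so start at 1.
            for (int i = 1; i < match.Groups.Count; i++)
                matches.Add(match.Groups[i].Value);
        }

        return matches;
    }
}
```

With a row such as <tr><td>1</td><td>Cell 1</td><td>Cell 2</td></tr> followed by further rows, ExtractCells should return "Cell 1" and "Cell 2" instead of running on to the last row of the table.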
