C# Regex capturing everything

I only want the text between the parentheses, but for some reason I get the whole match.

This is the regex I wrote:

<a href='ete(.+)'>det

This is the string:

</td>
<td>
<a href='ete/d1460852470.html'>detailed list #11</a> (20.94KB)
</td>
<td>
392
</td>
<td>
4/17 12:21:10 am
</td>
</tr>
<tr>
<td>
<a href='ete/1460845272.html'>ete #5</a> (6.71KB)
</td>
<td>
<a href='ete/d1460845272.html'>detailed list #5</a> (19.76KB)
</td>
<td>
372
</td>
<td>
4/16 10:21:12 pm
</td>
</tr>
<tr>
<td>
<a href='ete/1460839272.html'>ete #2</a> (6.62KB)
</td>
<td>
<a href='ete/d1460839272.html'>detailed list #2</a> (19.4KB)
</td>
<td>
366
</td>
<td>
4/16 8:41:12 pm
</td>
</tr>
<tr>
<td>
<a href='ete/1460830870.html'>ete #8</a> (6.72KB)
</td>
<td>
<a href='ete/d1460830870.html'>detailed list #8</a> (19.76KB)
</td>

I only want the text between / and '.

That is not what I get right now. Instead I get back what looks like a three-dimensional array.

This is the code that https://myregextester.com/index.php produces:

      String sourcestring = "source string to match with pattern";
      Regex re = new Regex(@"<a href='ete(.+)'>det");
      MatchCollection mc = re.Matches(sourcestring);
      int mIdx = 0;
      foreach (Match m in mc)
      {
          // Print every group of this match: index 0 is the whole match,
          // index 1 is the text captured by the parentheses.
          for (int gIdx = 0; gIdx < m.Groups.Count; gIdx++)
          {
              Console.WriteLine("[{0}][{1}] = {2}", mIdx, re.GetGroupNames()[gIdx], m.Groups[gIdx].Value);
          }
          mIdx++;
      }

Change the regex to:

Regex re = new Regex(@"<a href='ete([^']+)'>det");

and you should get what you are after.

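This tells the engine to capture, inside the group, every character that is not the closing quote, and then to match '>det right after it. The original .+ is greedy, so when several links sit on one line it runs on to the last '>det instead of stopping at the first closing quote.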
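To make the difference concrete, here is a minimal sketch that runs both patterns over the same input. The input string is a hypothetical one-line variant of the markup in the question (two links on a single line), and the `HrefCapture` class and its `Capture` helper are illustrative names, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HrefCapture
{
    // Returns the text captured by group 1 for every match of the pattern.
    public static List<string> Capture(string pattern, string input)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }

    public static void Main()
    {
        // Hypothetical input: two "detailed list" links on ONE line.
        string html = "<a href='ete/d1.html'>detailed list #1</a> " +
                      "<a href='ete/d2.html'>detailed list #2</a>";

        // Greedy .+ yields a single match that swallows both links.
        foreach (string s in Capture(@"<a href='ete(.+)'>det", html))
        {
            Console.WriteLine("greedy:  " + s);
        }

        // [^']+ stops at the first closing quote, so each link matches separately.
        foreach (string s in Capture(@"<a href='ete([^']+)'>det", html))
        {
            Console.WriteLine("negated: " + s);
        }
    }
}
```

With this input the greedy pattern should produce one long capture, while the negated class produces `/d1.html` and `/d2.html`. When every link sits on its own line, `.` does not cross the newline by default, so both patterns capture the same text.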

Your answer is already in one of the match groups: mc[n].Groups[1] gives you just the capture group. mc[n].Groups[0] gives you all the text that matched the regular expression, not just the capture group.
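As a small illustration of the two group indices, the sketch below matches one line taken from the question's markup and prints both groups. The `GroupDemo` class and `FirstMatch` helper are hypothetical names used only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Returns the first match of the link pattern in the input.
    public static Match FirstMatch(string input)
    {
        return Regex.Match(input, @"<a href='ete([^']+)'>det");
    }

    public static void Main()
    {
        Match m = FirstMatch("<a href='ete/d1460852470.html'>detailed list #11</a> (20.94KB)");

        // Group 0: everything the pattern matched.
        Console.WriteLine(m.Groups[0].Value); // <a href='ete/d1460852470.html'>det

        // Group 1: only the text inside the parentheses.
        Console.WriteLine(m.Groups[1].Value); // /d1460852470.html
    }
}
```

Note that group 1 still starts with the slash, because the slash sits inside the parentheses. If only the text between / and ' is wanted, moving the literal slash outside the group, as in `ete/([^']+)`, should drop it.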

If you want to be pedantic, you can switch to a lookbehind and a lookahead, e.g. (?<=<a href='ete).+(?='>det), so that the match itself contains only the inner text.
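A minimal sketch of the lookaround variant, assuming the input is a single line from the question's markup. The `LookaroundDemo` class and `InnerText` helper are hypothetical names. Because the lookbehind and lookahead consume no characters, the whole match is already the inner text and no capture group is needed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Returns the text between "<a href='ete" and "'>det", or an empty string if nothing matches.
    public static string InnerText(string input)
    {
        Match m = Regex.Match(input, @"(?<=<a href='ete).+(?='>det)");
        return m.Success ? m.Value : string.Empty;
    }

    public static void Main()
    {
        string line = "<a href='ete/d1460852470.html'>detailed list #11</a> (20.94KB)";
        Console.WriteLine(InnerText(line)); // /d1460852470.html
    }
}
```

The `.+` here is still greedy, so on input with several links per line it may be safer to combine the lookarounds with `[^']+` in place of `.+`.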
