Regex \n doesn't work

Question

I'm trying to parse text out of two lines of HTML.

Dim PattStats As New Regex("class=""head"">(.+?)</td>"+ 
                           "\n<td>(.+?)</td>")
Dim makor As MatchCollection = PattStats.Matches(page)

For Each MatchMak As Match In makor
    ListView3.Items.Add(MatchMak.Groups(1).Value)
Next

I added the \n to match the next line, but for some reason it doesn't work. Here's the source I'm running the regex against:

<table class="table table-striped table-bordered table-condensed">
  <tbody>
    <tr>
      <td class="head">Health Points:</td>
      <td>445 (+85 / per level)</td>
      <td class="head">Health Regen:</td>
      <td>7.25</td>
    </tr>
    <tr>
      <td class="head">Energy:</td>
      <td>200</td>
      <td class="head">Energy Regen:</td>
      <td>50</td>
    </tr>
    <tr>
      <td class="head">Damage:</td>
      <td>53 (+3.2 / per level)</td>
      <td class="head">Attack Speed:</td>
      <td>0.694 (+3.1 / per level)</td>
    </tr>           
    <tr>
      <td class="head">Attack Range:</td>
      <td>125</td>
      <td class="head">Movement Speed:</td>
      <td>325</td>
    </tr>
    <tr>
      <td class="head">Armor:</td>
      <td>16.5 (+3.5 / per level)</td>
      <td class="head">Magic Resistance:</td>
      <td>30 (+1.25 / per level)</td>
    </tr>       
    <tr>
      <td class="head">Influence Points (IP):</td>
      <td>3150</td>
      <td class="head">Riot Points (RP):</td>
      <td>975</td>
    </tr>
  </tbody>
</table>

I'd like to match the first <td class...> and the following line in one regex :/

Answer 1

Description

This regex will find td tags and return them in groups of two.

<td\b[^>]*>([^<]*)<\/td>[^<]*<td\b[^>]*>([^<]*)<\/td>


Summary

<td\b[^>]*> find the first td tag and consume any attributes
([^<]*) capture the first inner text; this can be greedy because we assume the cell has no nested tags
<\/td> find the close tag
[^<]* move past all the rest of the text (including line breaks and indentation) until the next tag; this assumes there are no additional tags between the first and second td tags
<td\b[^>]*> find the second td tag and consume any attributes
([^<]*) capture the second inner text; this can be greedy because we assume the cell has no nested tags
<\/td> find the close tag
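
To illustrate why the [^<]* gap matters, here is a minimal sketch that contrasts it with the literal \n from the question. The sample string is an assumption modeled on the question's table: the second cell is indented, and Windows line endings are assumed. The module name GapDemo and its members are hypothetical.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module GapDemo
    ' Two cells laid out like the question's source: the second <td> is
    ' indented, so more than a single newline separates the two tags.
    ' (The CRLF line ending is an assumption about the downloaded page.)
    Public ReadOnly Sample As String =
        "      <td class=""head"">Energy:</td>" & vbCrLf &
        "      <td>200</td>"

    ' Pattern from the question: \n must be followed immediately by <td>.
    Public Function MatchesWithNewline(input As String) As Boolean
        Return Regex.IsMatch(input, "class=""head"">(.+?)</td>\n<td>(.+?)</td>")
    End Function

    ' Same pattern with [^<]* in the gap, which skips the line break and the indentation.
    Public Function MatchesWithGap(input As String) As Boolean
        Return Regex.IsMatch(input, "class=""head"">(.+?)</td>[^<]*<td>(.+?)</td>")
    End Function

    Sub Main()
        Console.WriteLine(MatchesWithNewline(Sample)) ' False
        Console.WriteLine(MatchesWithGap(Sample))     ' True
    End Sub
End Module
```

The literal \n fails here because the characters after the line break are spaces, not <td>. With CRLF line endings there is also a \r before the \n.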

Groups

Group 0 will get the entire matched string
Group 1 will have the inner text of the first td
Group 2 will have the inner text of the second td

VB.NET Code Example:
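
The original code for this section did not survive, so what follows is only a sketch of how the regex above could be applied, not the answerer's exact code. The module name StatParser, the function ExtractPairs, and the shortened sample page are assumptions made for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module StatParser
    ' The regex from the Description: two consecutive td cells, inner text captured.
    Private ReadOnly PattStats As New Regex(
        "<td\b[^>]*>([^<]*)<\/td>[^<]*<td\b[^>]*>([^<]*)<\/td>")

    ' Returns one (label, value) pair per match:
    ' Groups(1) is the first cell, Groups(2) is the second cell.
    Public Function ExtractPairs(page As String) As List(Of KeyValuePair(Of String, String))
        Dim pairs As New List(Of KeyValuePair(Of String, String))
        For Each m As Match In PattStats.Matches(page)
            pairs.Add(New KeyValuePair(Of String, String)(m.Groups(1).Value, m.Groups(2).Value))
        Next
        Return pairs
    End Function

    Sub Main()
        ' Shortened stand-in for the question's table (hypothetical sample).
        Dim page As String =
            "<tr>" & vbCrLf &
            "  <td class=""head"">Energy:</td>" & vbCrLf &
            "  <td>200</td>" & vbCrLf &
            "  <td class=""head"">Energy Regen:</td>" & vbCrLf &
            "  <td>50</td>" & vbCrLf &
            "</tr>"

        For Each pair As KeyValuePair(Of String, String) In ExtractPairs(page)
            Console.WriteLine(pair.Key & " " & pair.Value)
        Next
        ' Prints:
        ' Energy: 200
        ' Energy Regen: 50
    End Sub
End Module
```

In the question's form, the same loop could call ListView3.Items.Add(pair.Key) and read pair.Value for the matching cell, instead of writing to the console.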

Disclaimer

Parsing HTML with a regex is really not the best solution, as there are a ton of edge cases that we can't predict. However, in this case, if the input string is always this basic and you're willing to accept the risk of the regex not working 100% of the time, then this solution would probably work for you.

solution1
1 2013-05-29 12:27:37
