
Regex Matching, cascaded tags

Hi, I am trying to extract results from the tags below. I need the first match in the tags, then the fifth match, then the ninth match: that is, the first match and then every fourth match after it. Note: I realize this isn't the best way to parse HTML, but I really only need it for this.

The regex I am using is

<td class="stat">(.*?)<\/td>

The code I am using is

private static ObservableCollection<Top> top = new ObservableCollection<Top>();

public void twit_topusers_DownloadStringCompleted(Object sender, DownloadStringCompletedEventArgs e)
{
    // The downloaded page source
    string str = (string)e.Result;

    Regex r = new Regex("<td class=\"stat\">(.*?)</td>");
    // Find the first match in the string.
    Match m = r.Match(str);

    while (m.Success)
    {
        testMatch = "";
        testMatch += System.Text.RegularExpressions.Regex.Unescape(m.Groups[0].ToString()).Trim();

        top.Add(new Top(testMatch));
        m = m.NextMatch();
    }

    listBox.ItemsSource = top;
}

}

The tags are

<td class="stat">14307149</td>//FIRST
<td class="stat">679761</td>
<td class="stat">3508</td>
<td class="stat">62 months ago</td>
<td class="stat">1430700</td>//FIFTH
<td class="stat">679761</td>
<td class="stat">3508</td>
<td class="stat">72 months ago</td>
<td class="stat">1430600</td>//NINTH
<td class="stat">679761</td>
<td class="stat">3508</td>
<td class="stat">82 months ago</td>

But the results I am getting are

Match 1 14307149

Match 2 679761

Match 3 3508

Match 4 62 months ago

Match 5 1430700

Match 6 679761

Match 7 3508

Match 8 72 months ago

Match 9 14307149

Match 10 679761

Match 11 3508

Match 12 62 months ago

The results I need are

Match 1 14307149

Match 2 1430700

Match 3 1430600

Can you help me with this?

It doesn't look like you're checking the row number at all. If you add a counter and keep a match only when the counter mod 4 is zero, you should be good.

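int counter = 0;
while (m.Success)
{
    // Keep matches 1, 5, 9, ... (zero-based counter 0, 4, 8, ...)
    if (counter % 4 == 0)
    {
        testMatch = "";
        testMatch += System.Text.RegularExpressions.Regex.Unescape(m.Groups[0].ToString()).Trim();

        top.Add(new Top(testMatch));
    }

    // Advance on every iteration, not only on kept matches, or the loop never ends
    m = m.NextMatch();
    counter++;
}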

Note: I am not a WP7 developer, so this code might be slightly off depending on how WP7's APIs work.
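The same counter idea can be written against Regex.Matches, which returns all matches at once so they can be indexed directly. This is only a minimal sketch: the class name StatPicker, the method name EveryNth, and the step parameter are made up for illustration, and it reads Groups[1] (the text between the tags) instead of Groups[0] (the whole match, tags included).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original question or answer.
public static class StatPicker
{
    // Returns the captured text of the 1st match and every `step`-th match after it.
    public static List<string> EveryNth(string html, int step)
    {
        var result = new List<string>();
        MatchCollection matches = Regex.Matches(html, "<td class=\"stat\">(.*?)</td>");

        // Zero-based indexes 0, step, 2*step, ... correspond to matches 1, 1+step, ...
        for (int i = 0; i < matches.Count; i += step)
        {
            // Groups[1] is the text between the tags; Groups[0] would include the tags.
            result.Add(matches[i].Groups[1].Value.Trim());
        }
        return result;
    }
}
```

Calling StatPicker.EveryNth(str, 4) on the twelve sample tags from the question should give 14307149, 1430700 and 1430600, assuming every row really has exactly four stat cells.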

Change the regex as follows to match only numbers:

     <td class="stat">(\d+)<\/td>

If I understand you correctly, you first have to split the string on "months ago" and then apply the above regex to each piece produced by the split.
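One way to read this suggestion is sketched below: split the page on the literal text "months ago", so that each piece starts at a new row, then take the first numeric cell found in each piece. The names RowSplitter and FirstStatPerRow are hypothetical, and the sketch assumes every row ends with a "... months ago" cell, as in the sample tags.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class RowSplitter
{
    // Returns the first purely numeric stat cell from each "months ago"-delimited piece.
    public static List<string> FirstStatPerRow(string html)
    {
        var result = new List<string>();
        var numericCell = new Regex("<td class=\"stat\">(\\d+)</td>");

        // Each piece holds the cells of one row, up to the "N months ago" cell.
        string[] pieces = html.Split(new[] { "months ago" }, StringSplitOptions.None);
        foreach (string piece in pieces)
        {
            Match m = numericCell.Match(piece);
            if (m.Success)
            {
                // Pieces with no numeric cell (such as the tail after the last row) are skipped.
                result.Add(m.Groups[1].Value);
            }
        }
        return result;
    }
}
```

Unlike the counter approach, this does not depend on a fixed number of cells per row, but it breaks if the text "months ago" is missing from a row or appears anywhere else on the page.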
