
XPath and HtmlAgilityPack

I figured it out! I will leave this posted in case some other newbie like me has the same question.

Answer: **("./td[2]/span[@class='smallfont']")**
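
For anyone landing here later, this is a minimal sketch of how that XPath might slot into the loop from the question. It assumes the HtmlAgilityPack NuGet package is referenced; `TimeExtractor` and `ExtractTimes` are hypothetical names made up for illustration, and the null checks are an addition, not part of the original post.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

static class TimeExtractor
{
    // Hypothetical helper: collects the trimmed span text from the second
    // cell of the second row of the table with id "weekdays".
    public static List<string> ExtractTimes(string html)
    {
        var times = new List<string>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when nothing matches, so guard before looping.
        HtmlNodeCollection rows = doc.DocumentNode.SelectNodes("//table[@id='weekdays']/tr[2]");
        if (rows == null)
            return times;

        foreach (HtmlNode row in rows)
        {
            // The leading "./" keeps the path relative to the current row;
            // the final step moves from the cell into its span.
            HtmlNode span = row.SelectSingleNode("./td[2]/span[@class='smallfont']");
            if (span != null)
                times.Add(span.InnerText.Trim());
        }
        return times;
    }
}
```

Selecting the span rather than the cell avoids most of the surrounding whitespace, and the `Trim()` call is only a safety net.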

I am a novice at XPath and HtmlAgilityPack. I am so close, yet so far.

GOAL: to pull out 4:30am

by using the following with HtmlAgilityPack:

    foreach (HtmlNode table in doc.DocumentNode.SelectNodes("//table[@id='weekdays']/tr[2]"))
    {
        string time = table.SelectSingleNode("./td[2]").InnerText;
    }

That gets me down to "\r\n\t\t\r\n\t\t\t4:30am\r\n\t\t\r\n\t", but whenever I try to do anything with the span I get XPath exceptions. What must I add to ("./td[2]") to end up with just 4:30am?

HTML
<td class="alt1 espace" nowrap="nowrap" style="text-align: center;">
<span class="smallfont">4:30am</span>
</td>

I don't know if LINQ is an option, but you could also have done something like this:

        // Requires: using System.Linq; and using HtmlAgilityPack;
        var time = string.Empty;
        var html =
            "<td class=\"alt1 espace\" nowrap=\"nowrap\" style=\"text-align: center;\"><span class=\"smallfont\">4:30am</span></td>";

        var document = new HtmlDocument() { OptionWriteEmptyNodes = true, OptionOutputAsXml = true };

        document.LoadHtml(html);

        // Find the first <span> whose class attribute is exactly "smallfont".
        var timeSpan =
            document.DocumentNode.Descendants("span").FirstOrDefault(
                n => n.Attributes["class"] != null && n.Attributes["class"].Value == "smallfont");

        // FirstOrDefault returns null when no span matches.
        if (timeSpan != null)
            time = timeSpan.InnerHtml;
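
As a side note, the string quoted in the question contains only whitespace around the value, so trimming the cell's text may be enough when the span itself is not needed. A rough sketch, assuming the HtmlAgilityPack NuGet package; `CellText` and `FirstCellText` are hypothetical names used only for illustration:

```csharp
using HtmlAgilityPack;

static class CellText
{
    // Hypothetical helper: returns the trimmed text of the first <td>,
    // or an empty string when the markup contains no <td> at all.
    public static string FirstCellText(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null when nothing matches.
        HtmlNode cell = doc.DocumentNode.SelectSingleNode("//td");

        // Trim() strips the leading and trailing \r, \n and \t characters.
        return cell == null ? string.Empty : cell.InnerText.Trim();
    }
}
```

This only works while the cell holds nothing but the one value; if other text is ever added to the cell, selecting the span explicitly is the safer choice.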
