How to get the tag and its text using HtmlAgilityPack

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
string html =
    "<body> " +
        "<p class=\"hang12\">“What is Lorem Ipsum?” <i>Lorem Ipsum is simply dummy text</i> Lorem Ipsum has been the</p>" +
        "<p class=\"hang12\">when an unknown printer took a galley of type <i>It has survived not only five centuries,</i>.</p>" +
        "<p class=\"hang12\">but also the  <i>remaining essentially </i> </p>" +
        "<p class=\"hang12\">with the release of Letraset sheets containing Lorem Ipsum passages, <i>and more recently with desktop</i>. 1944.</p>" +
    "</body>";

doc.LoadHtml(html);
foreach (var item in doc.DocumentNode.Descendants())
{
    chNodes(item);
}

public void chNodes(HtmlAgilityPack.HtmlNode node)
{
    try
    {
        if (node.HasChildNodes)
        {
            // Recurse until a leaf node is reached.
            foreach (var item in node.ChildNodes)
            {
                chNodes(item);
            }
        }
        else
        {
            // Leaf node: print where it starts in the source.
            Console.WriteLine("************");
            Console.WriteLine(node.Line);
            Console.WriteLine(node.LinePosition);
            Console.WriteLine("************");
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.StackTrace);
        throw; // rethrow without resetting the stack trace
    }
}

My code above gets the position of the first opening tag found, but I can't get the position of the closing tag. How can I solve this? I need those values to highlight the text in the WebBrowser control. Thank you.

You can get the text and markup of each paragraph with the following code. Try this:

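foreach (var item in doc.DocumentNode.SelectNodes("//p[@class='hang12']"))
{
    Console.WriteLine(item.InnerText); // text content with the tags stripped
    Console.WriteLine(item.InnerHtml); // markup between <p> and </p>
}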
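
The snippet above returns the text but not the position of the closing tag, which the question also asks for. One possible approach, offered only as a rough sketch, is to derive that position from the node's start offset and the length of its outer HTML. It assumes that `StreamPosition` is the offset of the opening `<` in the loaded string, that `OuterHtml` matches the source text character for character (the document has not been modified), and that the closing tag is written exactly as `</p>`. The class name `ClosingTagPositions` and the shortened sample markup are made up for illustration.

```csharp
using System;
using HtmlAgilityPack;

class ClosingTagPositions
{
    static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml("<body><p class=\"hang12\">but also the <i>remaining essentially </i></p></body>");

        var nodes = doc.DocumentNode.SelectNodes("//p[@class='hang12']");
        if (nodes == null) return; // SelectNodes returns null when nothing matches

        foreach (HtmlNode p in nodes)
        {
            // Assumption: offset of the '<' of the opening tag in the loaded text.
            int openStart = p.StreamPosition;

            // Assumption: OuterHtml mirrors the source, so this is the offset just past the closing tag.
            int elementEnd = openStart + p.OuterHtml.Length;

            // Assumption: the closing tag is spelled exactly "</name>".
            int closeStart = elementEnd - ("</" + p.Name + ">").Length;

            Console.WriteLine($"<{p.Name}> opens at {openStart}, closing tag starts at {closeStart}");
            Console.WriteLine(p.InnerText);
        }
    }
}
```

These are character offsets into the string passed to `LoadHtml`, not line and column pairs, so they would still need to be mapped onto whatever the WebBrowser control uses for highlighting. It is worth checking the offsets against your own markup before relying on them.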
