
How do I get the text from a nested <p> tag in an external HTML page using Html Agility Pack?

I am trying to get some text from an external site. The text I am trying to get is nested in a paragraph tag. The enclosing div has a class value.

HTML code snippet:

<div class="discription"><p>this is the text I want to grab</p></div>

Current C# code:

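// Requires: using HtmlAgilityPack;
public String getDiscription(string url)
{
    var web = new HtmlWeb();
    var doc = web.Load(url);

    // Selects the outer div, so InnerHtml below still includes the <p> markup.
    var nodes = doc.DocumentNode.SelectNodes("//div[@class='discription']");

    if (nodes != null)
    {
        foreach (var node in nodes)
        {
            string Description = node.InnerHtml;
            return Description; // return the first match
        }
    }

    // Reached when nothing matched; the original version had no return on
    // this path after the loop, so it did not compile.
    return "could not find text";
}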

What I don't understand is the syntax of the XPath //div[@class='discription']. I know it is wrong. What should the XPath be?

Use //div[@class='discription']/p.

Breakdown:

//div                    - All div elements
[@class='discription']   - With a class attribute whose value is discription
/p                       - Select the child p elements
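As a minimal sketch of how that XPath might be used, the example below parses the snippet from the question as an inline string instead of loading a URL, so it runs without network access. The class name DescriptionExample and the helper ExtractDescription are hypothetical names chosen for illustration. It also assumes you want the plain text rather than the markup, so it reads InnerText instead of InnerHtml.

```csharp
using System;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class DescriptionExample
{
    // Hypothetical helper: returns the text of the first <p> that is a direct
    // child of a div whose class attribute is exactly "discription",
    // or null when no such node exists.
    public static string ExtractDescription(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null when the XPath matches nothing.
        var p = doc.DocumentNode.SelectSingleNode("//div[@class='discription']/p");
        if (p == null)
        {
            return null;
        }

        // InnerText drops the tags; DeEntitize turns entities such as &amp; back into characters.
        return HtmlEntity.DeEntitize(p.InnerText).Trim();
    }

    public static void Main()
    {
        const string html =
            "<div class=\"discription\"><p>this is the text I want to grab</p></div>";

        Console.WriteLine(ExtractDescription(html) ?? "could not find text");
    }
}
```

Note that [@class='discription'] compares the whole attribute value, so it would not match a div that carries additional classes. In that case a predicate such as contains(@class, 'discription') may be a better fit, at the cost of also matching longer class names that contain that string.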
