
Get div class by content inside div using C#

I need to identify the class of a div element that contains some known text. For example, I have this HTML page:

<html>
    ...
    <div class='x'>
        <p>this is the text I have.</p>
        <p>Another part of text.</p>
    </div>
    ...
</html>

So I know the text: this is the text I have. Another part of text. I need to identify the class name of the div that contains it. Is there a way to do this using C#?

Try this:

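// Requires the HtmlAgilityPack package, plus "using System.Linq;" and "using HtmlAgilityPack;".
string stringToSearch = "<p>this is the text I have.</p><p>Another part of text.</p>";
HtmlDocument document = new HtmlDocument();
document.LoadHtml(sb.ToString()); // sb: a StringBuilder that holds the page source

// Filter on InnerHtml first, then project to the class attribute.
// The comparison is exact, so any whitespace between the tags must match too.
string classOfDiv = document.DocumentNode.Descendants("div")
    .Where(x => x.InnerHtml == stringToSearch)
    .Select(x => x.GetAttributeValue("class", ""))
    .FirstOrDefault();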

The variable classOfDiv now contains the class name of the desired div.
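One caveat with an exact InnerHtml comparison: the page in the question is indented, so the div's InnerHtml also contains the line breaks and spaces between the <p> tags and will not equal a search string written without them. A minimal sketch of a whitespace-tolerant variant follows. It compares the div's InnerText, with runs of whitespace collapsed, against the plain text from the question. DivClassFinder, Normalize and FindClassByText are hypothetical names chosen for this example, not part of the HTML Agility Pack.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using HtmlAgilityPack;

// Hypothetical helper for illustration; only HtmlDocument, Descendants,
// InnerText and GetAttributeValue come from the HTML Agility Pack.
static class DivClassFinder
{
    // Collapse every run of whitespace to a single space and trim the ends.
    public static string Normalize(string text) =>
        Regex.Replace(text, @"\s+", " ").Trim();

    // Returns the class of the first div (in document order) whose normalized
    // text equals the normalized search text, or null when no div matches.
    // A matching div without a class attribute yields an empty string.
    public static string FindClassByText(string html, string text)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);
        string wanted = Normalize(text);

        return document.DocumentNode
            .Descendants("div")
            .Where(div => Normalize(div.InnerText) == wanted)
            .Select(div => div.GetAttributeValue("class", ""))
            .FirstOrDefault();
    }
}
```

Calling DivClassFinder.FindClassByText(pageSource, "this is the text I have. Another part of text.") would then look for the div from the question. Two limits of this sketch: if the paragraphs are written back to back with no whitespace between them, InnerText has no space at the join and the text will not match; and when divs are nested, an outer div with the same text is returned before the inner one.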

Building on diiN_'s answer: this is a bit verbose, but you should be able to get what you need from it. The code depends on the HTML Agility Pack, which you can get through NuGet.

using System;
using System.Linq;
using System.Text;
using HtmlAgilityPack;

// Build the sample page. Append is enough here because nothing is being formatted.
var sb = new StringBuilder();
sb.Append("<html>");
sb.Append("<div class='x'>");
sb.Append("<p>this is the text I have.</p>");
sb.Append("<p>Another part of text.</p>");
sb.Append("</div>");
sb.Append("</html>");

const string stringToSearch = "<p>this is the text I have.</p><p>Another part of text.</p>";

var document = new HtmlDocument();
document.LoadHtml(sb.ToString());

var divsWithText = document
    .DocumentNode
    .Descendants("div")
    .Where(node => node.Descendants()
                       .Any(des => des.NodeType == HtmlNodeType.Text))
    .ToList();

var divsWithInnerHtmlMatching =
    divsWithText
        .Where(div => div.InnerHtml.Equals(stringToSearch))
        .ToList();

var innerHtmlAndClass =
    divsWithInnerHtmlMatching
        .Select(div => 
            new
            {
                InnerHtml = div.InnerHtml,
                Class = div.Attributes["class"].Value
            });

foreach (var item in innerHtmlAndClass)
{
    Console.WriteLine("class='{0}' innerHtml='{1}'", item.Class, item.InnerHtml);
}
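As an alternative to the LINQ pipeline, the same lookup can probably be expressed as a single XPath query, since the HTML Agility Pack exposes SelectSingleNode on its nodes. This is a sketch under the assumption that matching one of the paragraphs is enough to identify the div. The XPath expression is written for this sample page and is not taken from either answer.

```csharp
using System;
using HtmlAgilityPack;

var document = new HtmlDocument();
document.LoadHtml(
    "<html><div class='x'>" +
    "<p>this is the text I have.</p>" +
    "<p>Another part of text.</p>" +
    "</div></html>");

// Select the first div that has a <p> child whose text contains the known sentence.
// SelectSingleNode returns null when nothing matches, hence the null-conditional call.
var div = document.DocumentNode.SelectSingleNode(
    "//div[p[contains(., 'this is the text I have.')]]");

Console.WriteLine(div?.GetAttributeValue("class", ""));
```

Because the search text is embedded in a single-quoted XPath literal, text that itself contains an apostrophe would need a different quoting approach.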
