Use C# to get content of div with class “in”

How can I get the content of one or more divs with class "in" using C#?

I have the following HTML code:

<!DOCTYPE html>
<html lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
    <meta charset="utf-8" />
    <title></title>
</head>
<body>
    <div id="xxx">
        <div class="in">
            <a href="/a/show/7184569" class="mm">ВАЗ 2121</a> <span class="for">за</span>
            <span class="price">2 700 $</span>
            <br />
            <span class="year">1990 г.</span><br />
            <div style="margin: 3px 0 3px 0">contentxxx</div>
        </div>
    </div>
</body>
</html>

I want to get the content of the div with class="in", so the result should be:

<div class="in">
     <a href="/a/show/7184569" class="mm">ВАЗ 2121</a> <span class="for">за</span>
     <span class="price">2 700 $</span>
     <br />
     <span class="year">1990 г.</span><br />
     <div style="margin: 3px 0 3px 0">contentxxx</div>
</div>
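
using HtmlAgilityPack;

        static void Parse()
        {
            // Parse the markup from a string; HtmlWeb is only needed to download a page.
            HtmlDocument doc = new HtmlDocument();
            doc.LoadHtml(getHTML());

            // SelectNodes returns null when nothing matches the XPath.
            HtmlNodeCollection nodeCol = doc.DocumentNode.SelectNodes("//div[@class=\"in\"]");

            // InnerHtml holds only the children of the div;
            // use OuterHtml to include the <div class="in"> element itself.
            string value = nodeCol[0].InnerHtml;
        }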

        static string getHTML()
        {
            string retVal = "";

            retVal = @"<!DOCTYPE html>"
                     + "<html lang=\"en\" xmlns=\"http://www.w3.org/1999/xhtml\">"
                    + "<head>"
                        + "<meta charset=\"utf-8\" />"
                        + "<title></title>"
                    + "</head>"
                    + "<body>"
                        + "<div id=\"xxx\">"
                            + "<div class=\"in\">"
                                + "<a href=\"/a/show/7184569\" class=\"mm\">ВАЗ 2121</a> <span class=\"for\">за</span>"
                                + "<span class=\"price\">2 700 $</span>"
                                + "<br />"
                                + "<span class=\"year\">1990 г.</span><br />"
                                + "<div style=\"margin: 3px 0 3px 0\">contentxxx</div>"

                            + "</div>"
                        + "</div>"
                    + "</body>"
                    + "</html>";

            return retVal;
        }

Please add the namespace HtmlAgilityPack. Ref: http://htmlagilitypack.codeplex.com/releases/view/90925

You can easily do it using HTML Agility Pack:

using HtmlAgilityPack;

...
var doc = new HtmlDocument();
doc.Load(@"C:\file.htm"); // see the overloads. You can also use the `LoadHtml` method.

var node = doc.DocumentNode.SelectSingleNode("//div[@class='in']");

//This is the text you are looking for...
var result = node.OuterHtml;
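
Two details of the XPath approach are worth a sketch. The test `@class='in'` compares the whole attribute value, so it would not match a div written as `class="in active"`, and `SelectNodes` returns null rather than an empty collection when nothing matches. The following is a minimal sketch, assuming the HtmlAgilityPack package is referenced; the names `InDivFinder` and `GetInDivs` are hypothetical and used only for illustration.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack.
static class InDivFinder
{
    // Token-aware class test: the normalized class list is padded with spaces,
    // so " in " matches class="in" and class="in active" but not class="inner".
    const string InDivXPath =
        "//div[contains(concat(' ', normalize-space(@class), ' '), ' in ')]";

    // Returns the outer HTML of every matching div (an empty list if none match).
    public static List<string> GetInDivs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new List<string>();
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(InDivXPath);
        if (nodes == null)
        {
            return result; // no div carries the "in" class
        }

        foreach (HtmlNode node in nodes)
        {
            result.Add(node.OuterHtml); // includes the <div class="in"> wrapper
        }
        return result;
    }
}
```

Calling `InDivFinder.GetInDivs(html)` with the markup from the question should yield one entry for the single `in` div; with several matching divs, each one gets its own entry.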

Use jQuery to get the content of the div:

<script type="text/javascript">
    var d = $('div.in').html();
</script>

The code above gets the inner HTML of the first div that has the class "in".
