Remove HTML nodes from an HTTP request

Question

I have some HTML code stored in a string variable, returned by an HttpWebRequest:

<html>
  <head>
    <div>Lots of scripts and libraries</div>
  </head>
  <body>
    <div>Some very useful data</div>
  </body>
  <footer>
    <div>Not interesting stuff</div>
  </footer>
</html>

How can I remove all the unnecessary nodes and end up with this:

<body>
  <div>Some very useful data</div>
</body>

Answer 1

The easiest way is to use HtmlAgilityPack to grab just the body tag.

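using HtmlAgilityPack;

// Parse the HTML string returned by the request.
var document = new HtmlDocument();
document.LoadHtml(html);

// Select the <body> element (null if the document has none).
HtmlNode body = document.DocumentNode.SelectSingleNode("//body");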

From there, you can use HtmlAgilityPack to further parse the body node for more detail.
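
As a rough illustration of that further parsing, the sketch below prints the body's markup and then walks its div children. It assumes the HtmlAgilityPack NuGet package is installed. The sample `html` string and the `.//div` XPath are placeholders chosen for this example and are not part of the original answer.

```csharp
using System;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

class Program
{
    static void Main()
    {
        // Placeholder input; in practice this is the string read from the response.
        string html = "<html><head><title>t</title></head>" +
                      "<body><div>Some very useful data</div></body></html>";

        var document = new HtmlDocument();
        document.LoadHtml(html);

        HtmlNode body = document.DocumentNode.SelectSingleNode("//body");
        if (body == null)
        {
            Console.WriteLine("No <body> element found.");
            return;
        }

        // OuterHtml includes the <body> tags themselves; InnerHtml would omit them.
        Console.WriteLine(body.OuterHtml);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection divs = body.SelectNodes(".//div");
        if (divs != null)
        {
            foreach (HtmlNode div in divs)
            {
                Console.WriteLine(div.InnerText.Trim());
            }
        }
    }
}
```

The null checks matter here because both `SelectSingleNode` and `SelectNodes` return null when the XPath matches nothing, which is easy to hit with HTML fetched from an arbitrary page.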

Answer 1: 3, accepted, 2016-06-08 01:16:13
