How do I use HTML Agility Pack to edit an HTML snippet?
I have an HTML snippet that I want to modify using C#:
<div>
This is a specialSearchWord that I want to link to
<img src="anImage.jpg" />
<a href="foo.htm">A hyperlink</a>
Some more text and that specialSearchWord again.
</div>
I want to transform it into this:
<div>
This is a <a class="special" href="http://mysite.com/search/specialSearchWord">specialSearchWord</a> that I want to link to
<img src="anImage.jpg" />
<a href="foo.htm">A hyperlink</a>
Some more text and that <a class="special" href="http://mysite.com/search/specialSearchWord">specialSearchWord</a> again.
</div>
I'm going to use HTML Agility Pack based on the many recommendations here, but I don't know where to start.
There are two options: you can edit the InnerHtml property directly (or Text on text nodes), or modify the DOM tree using e.g. AppendChild, PrependChild, etc. You can use the HtmlDocument.DocumentNode.OuterHtml property or the HtmlDocument.Save method (personally I prefer the second option).
As for parsing, I select the text nodes inside your div that contain the search term, and then just use the string.Replace method to replace it:
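// Requires: using HtmlAgilityPack; html holds the snippet as a string.
var doc = new HtmlDocument();
doc.LoadHtml(html);
// Select the text nodes directly under the root div that contain the search word.
var textNodes = doc.DocumentNode.SelectNodes("/div/text()[contains(.,'specialSearchWord')]");
// SelectNodes returns null (not an empty collection) when nothing matches.
if (textNodes != null)
{
    foreach (HtmlTextNode node in textNodes)
        node.Text = node.Text.Replace("specialSearchWord", "<a class='special' href='http://mysite.com/search/specialSearchWord'>specialSearchWord</a>");
}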
To save the result to a string:
// Requires: using System.IO;
string result = null;
using (StringWriter writer = new StringWriter())
{
    doc.Save(writer);
    result = writer.ToString();
}
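The other option mentioned above, modifying the DOM tree instead of assigning markup to Text, could look roughly like the following. This is only a sketch: the class name SearchWordLinker, the method LinkWord, and the baseUrl parameter are made up for illustration, and it assumes the HtmlAgilityPack package is referenced. It splits each matching text node and inserts real a elements, so each link becomes a node in the tree rather than a string pasted into a text node.

```csharp
using System;
using HtmlAgilityPack;

public static class SearchWordLinker
{
    // Hypothetical helper: wraps every occurrence of `word` found in text nodes
    // in <a class="special" href="{baseUrl}{word}">, skipping text whose direct
    // parent is already an <a> element.
    public static string LinkWord(string html, string word, string baseUrl)
    {
        if (string.IsNullOrEmpty(word)) return html;

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when nothing matches.
        var textNodes = doc.DocumentNode.SelectNodes("//text()");
        if (textNodes == null) return html;

        foreach (HtmlNode candidate in textNodes)
        {
            var node = candidate as HtmlTextNode;
            if (node == null || node.ParentNode.Name == "a") continue;

            string text = node.Text;
            if (!text.Contains(word)) continue;

            HtmlNode parent = node.ParentNode;
            int start = 0;
            int idx;
            while ((idx = text.IndexOf(word, start, StringComparison.Ordinal)) >= 0)
            {
                // Text before the match stays a plain text node.
                if (idx > start)
                    parent.InsertBefore(doc.CreateTextNode(text.Substring(start, idx - start)), node);

                // Build the link as an element instead of a markup string.
                HtmlNode link = doc.CreateElement("a");
                link.SetAttributeValue("class", "special");
                link.SetAttributeValue("href", baseUrl + Uri.EscapeDataString(word));
                link.AppendChild(doc.CreateTextNode(word));
                parent.InsertBefore(link, node);

                start = idx + word.Length;
            }

            // Text after the last match, then drop the original node.
            if (start < text.Length)
                parent.InsertBefore(doc.CreateTextNode(text.Substring(start)), node);
            parent.RemoveChild(node);
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

Calling SearchWordLinker.LinkWord(html, "specialSearchWord", "http://mysite.com/search/") on the snippet from the question should link both occurrences and leave the existing foo.htm link untouched.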
Answers:
Note that your XPath expression may need to be more complex to find the div that you want.
// Requires: using System; using System.Text.RegularExpressions; using HtmlAgilityPack;
HtmlDocument doc = new HtmlDocument();
doc.Load(yourHtmlFile);
// Adjust this XPath so it selects the div you actually want.
HtmlNode divNode = doc.DocumentNode.SelectSingleNode("//div[2]");
string newDiv = Regex.Replace(divNode.InnerHtml, @"specialSearchWord",
    "<a class='special' href='http://etc'>specialSearchWord</a>");
divNode.InnerHtml = newDiv;
Console.WriteLine(doc.DocumentNode.OuterHtml);
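If the search term is not a fixed literal, it may be worth escaping it before building the pattern. The sketch below is one way to do that using only System.Text.RegularExpressions; the helper name RegexLinker.LinkWord and the baseUrl parameter are hypothetical. Like the snippet above, it works on raw InnerHtml, so it would also rewrite matches inside attribute values or existing links.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexLinker
{
    // Hypothetical helper: replaces whole-word occurrences of `word` in a markup
    // string with a link. Assumes `word` starts and ends with a word character,
    // otherwise the \b anchors will not match as intended.
    public static string LinkWord(string innerHtml, string word, string baseUrl)
    {
        // Regex.Escape keeps characters such as '.' or '+' from acting as operators.
        string pattern = @"\b" + Regex.Escape(word) + @"\b";

        string link = "<a class='special' href='" + baseUrl + Uri.EscapeDataString(word)
                      + "'>" + word + "</a>";

        // A MatchEvaluator avoids '$' in the link being read as a substitution.
        return Regex.Replace(innerHtml, pattern, _ => link);
    }
}
```

The result can then be assigned back with divNode.InnerHtml = RegexLinker.LinkWord(divNode.InnerHtml, "specialSearchWord", "http://mysite.com/search/").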