C#: Regular expression to replace all <font> tags in HTML by <span>
I would like to replace all <font> tags in an HTML file with <span style="..."> and retain attributes such as font color and font size.
Here are the test cases:
<font color='#000000'>Case 1</font><br /> <font size=6>Case 2</font><br /> <font color="red" size="12">Case 3</font>
Here is the expected result:
<span style="color:#000000">Case 1</span><br /> <span style="font-size:6rem">Case 2</span><br /> <span style="color:red; font-size:12rem">Case 3</span>
With the C# code below, cases 1 and 2 are replaced successfully because each has only one style attribute. However, the second attribute in case 3 is lost, because the pattern captures only the first attribute and skips the rest of the tag. Is it possible to improve the C# code below so that both "color" and "size" are kept?
string pattern = "<font (color|size)=(?:\"|'|)([a-z0-9#\\-]+)(?:\"|'|).*?>(.*?)<\\/font>";
Regex regex = new Regex(pattern, RegexOptions.Singleline);
output = regex.Replace(output, delegate (Match m) {
    string attr = m.Groups[1].Value.Trim();
    string value = m.Groups[2].Value.Trim();
    string text = m.Groups[3].Value.Trim();
    if (attr.Equals("size")) {
        attr = "font-size";
        value += "px";
    }
    return string.Format("<span style=\"{0}:{1};\">{2}</span>", attr, value, text);
});
Thank you very much!
As @Steve B commented: don't use regex. HTML has so many ways to write tags that you'll end up with a monstrous regex. My advice is to use HtmlAgilityPack, which allows you to parse and manipulate HTML. This library is a golden nugget when dealing with HTML manipulation. And it's free and open source.
Here is how you can do this using HtmlAgilityPack:
// Requires: using System.Linq; using HtmlAgilityPack;
public string ReplaceFontBySpan()
{
    HtmlDocument doc = new HtmlDocument();
    string htmlContent = @"<font color='#000000'>Case 1</font><br />
        <font size=6>Case 2</font><br />
        <font color='red' size='12'>Case 3</font>";
    doc.LoadHtml(htmlContent);

    // SelectNodes returns null (not an empty collection) when nothing matches.
    var fontNodes = doc.DocumentNode.SelectNodes("//font");
    if (fontNodes == null)
    {
        return doc.DocumentNode.OuterHtml;
    }

    foreach (HtmlNode node in fontNodes)
    {
        // Turn every attribute into a CSS declaration; "size" becomes "font-size" in rem.
        var declarations = node.Attributes.Select(a =>
            a.Name.Equals("size")
                ? "font-size:" + a.Value + "rem"
                : a.Name + ":" + a.Value);

        // Build the replacement <span> with the same inner content.
        HtmlNode span = doc.CreateElement("span");
        span.Attributes.Add("style", string.Join("; ", declarations));
        span.InnerHtml = node.InnerHtml;
        node.ParentNode.ReplaceChild(span, node);
    }
    return doc.DocumentNode.OuterHtml;
}
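If adding a dependency is not an option, the regex from the question can be made to keep both attributes by splitting the work into two passes: one pattern captures the whole attribute list of each <font> tag, and a second pattern walks through that list. The following is only a minimal sketch that assumes flat (non-nested) <font> tags carrying nothing but color and size attributes, with no ">" inside attribute values. The names FontTagConverter and ConvertFontTags are made up for illustration. The parser-based approach above remains the more robust choice for real-world HTML.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FontTagConverter
{
    // Matches one <font ...>...</font> element.
    // Group 1 = raw attribute list, group 2 = inner content.
    private static readonly Regex FontTag = new Regex(
        @"<font\b([^>]*)>(.*?)</font>",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Matches one color/size attribute whose value is double-quoted,
    // single-quoted, or unquoted (groups 2, 3, and 4 respectively).
    private static readonly Regex FontAttribute = new Regex(
        @"\b(color|size)\s*=\s*(?:""([^""]*)""|'([^']*)'|([^\s""'>]+))",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: replaces every matched <font> tag with a styled <span>.
    public static string ConvertFontTags(string html)
    {
        return FontTag.Replace(html, tag =>
        {
            var declarations = new List<string>();
            foreach (Match attr in FontAttribute.Matches(tag.Groups[1].Value))
            {
                string name = attr.Groups[1].Value.ToLowerInvariant();
                // Exactly one of the three value groups takes part in a match.
                string value = attr.Groups[2].Success ? attr.Groups[2].Value
                             : attr.Groups[3].Success ? attr.Groups[3].Value
                             : attr.Groups[4].Value;
                declarations.Add(name == "size"
                    ? "font-size:" + value + "rem"
                    : "color:" + value);
            }
            return "<span style=\"" + string.Join("; ", declarations) + "\">"
                 + tag.Groups[2].Value + "</span>";
        });
    }
}
```

Calling FontTagConverter.ConvertFontTags on the three test cases from the question yields the expected result listed there, including "color:red; font-size:12rem" for case 3. Anything outside the stated assumptions (nested tags, other attributes such as face, malformed markup) is not handled, which is exactly where a regex starts to grow out of control.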