I would like to replace all <font> tags in an HTML file with <span style="..."> and retain attributes such as font color and font size.
Here are the test cases:
<font color='#000000'>Case 1</font><br />
<font size=6>Case 2</font><br />
<font color="red" size="12">Case 3</font>
Here is the expected result:
<span style="color:#000000">Case 1</span><br />
<span style="font-size:6rem">Case 2</span><br />
<span style="color:red; font-size:12rem">Case 3</span>
With the C# code below, cases 1 and 2 are replaced successfully because they have only one style attribute. However, the second attribute in case 3 is lost. Is it possible to improve the C# code below so that it keeps both "color" and "size"?
string pattern = "<font (color|size)=(?:\"|'|)([a-z0-9#\\-]+)(?:\"|'|).*?>(.*?)<\\/font>";
Regex regex = new Regex(pattern, RegexOptions.Singleline);
output = regex.Replace(output, delegate (Match m) {
    string attr = m.Groups[1].Value.Trim();
    string value = m.Groups[2].Value.Trim();
    string text = m.Groups[3].Value.Trim();
    if (attr.Equals("size")) {
        attr = "font-size";
        value += "px";
    }
    return string.Format("<span style=\"{0}:{1};\">{2}</span>", attr, value, text);
});
Thank you very much!
As @Steve B commented: "Don't use regex. HTML has so many ways to write tags that you'll end up with a monstrous regex. My advice is to use HtmlAgilityPack, which allows you to parse and manipulate HTML. This library is a golden nugget when dealing with HTML manipulation, and it's free and open source."
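To answer the question as asked: for simple input, a regex can keep both attributes if the tag is matched first and its attributes are scanned in a second pass. The following is only a minimal sketch, assuming flat (non-nested) <font> tags and only the color and size attributes; the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FontTagRegexSketch
{
    // First pass: capture the raw attribute string and the inner content of each <font> tag.
    private static readonly Regex FontTag = new Regex(
        @"<font\b([^>]*)>(.*?)</font>",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Second pass: find every color/size attribute, whether double-quoted, single-quoted, or unquoted.
    private static readonly Regex FontAttribute = new Regex(
        @"\b(color|size)\s*=\s*(?:""([^""]*)""|'([^']*)'|([^\s""'>]+))",
        RegexOptions.IgnoreCase);

    public static string ReplaceFontTags(string html)
    {
        return FontTag.Replace(html, tag =>
        {
            var declarations = new List<string>();
            foreach (Match attr in FontAttribute.Matches(tag.Groups[1].Value))
            {
                string name = attr.Groups[1].Value.ToLowerInvariant();
                // Only one of groups 2-4 takes part in a match; the others are empty strings.
                string value = attr.Groups[2].Value + attr.Groups[3].Value + attr.Groups[4].Value;
                declarations.Add(name == "size"
                    ? "font-size:" + value + "rem"
                    : "color:" + value);
            }
            return "<span style=\"" + string.Join("; ", declarations) + "\">"
                + tag.Groups[2].Value + "</span>";
        });
    }
}
```

This handles the three test cases, but it breaks on nested <font> tags, on a ">" inside an attribute value, and on attribute names such as data-size, which is exactly the kind of growth the comment above warns about.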
Here is how you can do this with HtmlAgilityPack:
using System.Collections.Generic;
using HtmlAgilityPack;

public string ReplaceFontBySpan()
{
    HtmlDocument doc = new HtmlDocument();
    string htmlContent = @"<font color='#000000'>Case 1</font><br />
<font size=6>Case 2</font><br />
<font color='red' size='12'>Case 3</font>";
    doc.LoadHtml(htmlContent);

    // SelectNodes returns null when no <font> element is found.
    HtmlNodeCollection fontNodes = doc.DocumentNode.SelectNodes("//font");
    if (fontNodes == null)
    {
        return doc.DocumentNode.OuterHtml;
    }

    foreach (HtmlNode node in fontNodes)
    {
        // Turn each attribute into a CSS declaration; "size" becomes "font-size" in rem.
        var declarations = new List<string>();
        foreach (HtmlAttribute item in node.Attributes)
        {
            if (item.Name.Equals("size"))
            {
                declarations.Add("font-size:" + item.Value + "rem");
            }
            else
            {
                declarations.Add(item.Name + ":" + item.Value);
            }
        }

        // Build the replacement <span> and swap it in for the <font> node.
        HtmlNode span = doc.CreateElement("span");
        span.Attributes.Add("style", string.Join("; ", declarations));
        span.InnerHtml = node.InnerHtml;
        node.ParentNode.ReplaceChild(span, node);
    }
    return doc.DocumentNode.OuterHtml;
}
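The legacy <font> tag also accepts a face attribute, which corresponds to font-family in CSS, and a tag may carry attributes that are not styles at all (id, class, and so on). A minimal sketch of an explicit mapping that could be used when building the style string is shown below; FontStyleMapper and BuildStyle are hypothetical names, and the rem unit simply follows the expected result in the question.

```csharp
using System;
using System.Collections.Generic;

public static class FontStyleMapper
{
    // Hypothetical mapping table: legacy <font> attribute -> CSS property.
    private static readonly Dictionary<string, string> CssProperty =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "color", "color" },
            { "size", "font-size" },
            { "face", "font-family" },
        };

    // Builds the value of a style attribute from (name, value) attribute pairs.
    public static string BuildStyle(IEnumerable<KeyValuePair<string, string>> attributes)
    {
        var declarations = new List<string>();
        foreach (var attribute in attributes)
        {
            string property;
            if (!CssProperty.TryGetValue(attribute.Key, out property))
            {
                continue; // not a presentational <font> attribute, so it is skipped
            }
            string value = property == "font-size"
                ? attribute.Value + "rem"
                : attribute.Value;
            declarations.Add(property + ":" + value);
        }
        return string.Join("; ", declarations);
    }
}
```

With HtmlAgilityPack, the pairs could be produced with System.Linq, for example node.Attributes.Select(a => new KeyValuePair<string, string>(a.Name, a.Value)), and the result passed to span.Attributes.Add("style", ...).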