
C#: Regular expression to replace all <font> tags in HTML by <span>

I would like to replace all <font> tags in an HTML file with <span style="..."> and retain attributes such as font color and font size.

Here are the test cases:

 <font color='#000000'>Case 1</font><br />
 <font size=6>Case 2</font><br />
 <font color="red" size="12">Case 3</font>

Here is the expected result:

 <span style="color:#000000">Case 1</span><br />
 <span style="font-size:6rem">Case 2</span><br />
 <span style="color:red; font-size:12rem">Case 3</span>

With the C# code below, cases 1 and 2 are replaced successfully because each has only one attribute. However, the second attribute in case 3 is lost. Is it possible to improve the C# code below so that both "color" and "size" are kept?

        string pattern = "<font (color|size)=(?:\"|'|)([a-z0-9#\\-]+)(?:\"|'|).*?>(.*?)<\\/font>";
        Regex regex = new Regex(pattern, RegexOptions.Singleline);

        output = regex.Replace(output, delegate (Match m) {
            string attr  = m.Groups[1].Value.Trim(); 
            string value = m.Groups[2].Value.Trim();
            string text  = m.Groups[3].Value.Trim();

            if (attr.Equals("size")) {
                attr = "font-size";
                value += "px";
            }

            return string.Format("<span style=\"{0}:{1};\">{2}</span>", attr, value, text);
        });

Thank you very much!

As commented by @Steve B: Don't use regex. HTML has so many ways to write tags that you'll end up with a monstrous regex. My advice is to use HtmlAgilityPack, which allows you to parse and manipulate HTML. This library is a golden nugget when dealing with HTML manipulation. And it's free and open source.
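To make the fragility concrete, here is a small sketch that runs the pattern from the question, unchanged, against a few equivalent ways of writing the same tag. The class name `FontPatternProbe`, its members, and the sample strings are illustrative assumptions and not part of the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FontPatternProbe
{
    // The pattern from the question, copied as-is.
    private static readonly Regex QuestionPattern = new Regex(
        "<font (color|size)=(?:\"|'|)([a-z0-9#\\-]+)(?:\"|'|).*?>(.*?)<\\/font>",
        RegexOptions.Singleline);

    // Hypothetical samples: each is a common way to write a red <font> element.
    public static readonly string[] Samples =
    {
        "<font color=\"red\">a</font>",                // the form the pattern expects
        "<FONT color=\"red\">b</FONT>",                // upper-case tag name
        "<font  color=\"red\">c</font>",               // two spaces after the tag name
        "<font face=\"Arial\" color=\"red\">d</font>", // another attribute comes first
        "<font color = \"red\">e</font>",              // spaces around '='
    };

    // True when the question's pattern finds a match anywhere in the input.
    public static bool IsMatched(string html)
    {
        return QuestionPattern.IsMatch(html);
    }

    // Prints one line per sample: the match result followed by the sample itself.
    public static void PrintReport()
    {
        foreach (string sample in Samples)
        {
            Console.WriteLine(IsMatched(sample) + "\t" + sample);
        }
    }
}
```

Calling `FontPatternProbe.PrintReport()` from a console program should report a match only for the first sample. Every variant could be patched into the pattern one by one, which is how the regex grows out of control.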

Here is how you can do this using HtmlAgilityPack:

using System.Linq;
using HtmlAgilityPack;

public string ReplaceFontBySpan()
{
    HtmlDocument doc = new HtmlDocument();

    string htmlContent = @"<font color='#000000'>Case 1</font><br />
<font size=6>Case 2</font><br />
<font color='red' size='12'>Case 3</font>";

    doc.LoadHtml(htmlContent);

    // SelectNodes returns null rather than an empty collection when no <font> element exists.
    HtmlNodeCollection fontNodes = doc.DocumentNode.SelectNodes("//font");
    if (fontNodes == null)
    {
        return doc.DocumentNode.OuterHtml;
    }

    foreach (HtmlNode node in fontNodes)
    {
        // Turn every attribute into a CSS declaration; "size" becomes "font-size" with a rem unit.
        var attributeValueList = node.Attributes
            .Select(x => x.Name.Equals("size") ? "font-size:" + x.Value + "rem" : x.Name + ":" + x.Value)
            .ToList();

        string attributeName = "style";
        string attributeValue = string.Join(";", attributeValueList);

        // Build the replacement <span>, carrying over the inner markup unchanged.
        HtmlNode span = doc.CreateElement("span");
        span.Attributes.Add(attributeName, attributeValue);
        span.InnerHtml = node.InnerHtml;

        node.ParentNode.ReplaceChild(span, node);
    }

    return doc.DocumentNode.OuterHtml;
}

Output: [screenshot of the resulting HTML; image not available]
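If adding a parser dependency is not an option, the question's regex can be restructured into two passes: one pattern finds each <font> element and captures its whole attribute string, and a second pattern pulls every color/size attribute out of that string. The following is only a sketch under stated assumptions: the names `FontTagConverter` and `ReplaceFontTags` are hypothetical, the `rem` unit simply follows the expected result in the question, and the code assumes no nested <font> elements and no `>` character inside attribute values. It is subject to the same fragility that the advice above warns about.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FontTagConverter
{
    // Pass 1: one <font ...>...</font> element.
    // Group 1 = raw attribute text, group 2 = inner HTML.
    private static readonly Regex FontElement = new Regex(
        @"<font\b([^>]*)>(.*?)</font\s*>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Pass 2: each color/size attribute with a double-quoted (group 2),
    // single-quoted (group 3), or unquoted (group 4) value.
    // The lookbehind keeps names such as "bgcolor" or "data-size" from matching.
    private static readonly Regex FontAttribute = new Regex(
        @"(?<![\w-])(color|size)\s*=\s*(?:""([^""]*)""|'([^']*)'|([^\s""'>]+))",
        RegexOptions.IgnoreCase);

    public static string ReplaceFontTags(string html)
    {
        return FontElement.Replace(html, m =>
        {
            var declarations = new List<string>();

            foreach (Match a in FontAttribute.Matches(m.Groups[1].Value))
            {
                string name = a.Groups[1].Value.ToLowerInvariant();
                // Only one of groups 2-4 takes part in a match; the others are empty.
                string value = a.Groups[2].Value + a.Groups[3].Value + a.Groups[4].Value;

                // Assumption: sizes map to rem, as in the question's expected result.
                declarations.Add(name == "size"
                    ? "font-size:" + value + "rem"
                    : "color:" + value);
            }

            string inner = m.Groups[2].Value;
            if (declarations.Count == 0)
            {
                // No color or size attribute found: emit a bare <span>.
                return "<span>" + inner + "</span>";
            }

            return "<span style=\"" + string.Join("; ", declarations) + "\">" + inner + "</span>";
        });
    }
}
```

For example, `FontTagConverter.ReplaceFontTags("<font color=\"red\" size=\"12\">Case 3</font>")` keeps both attributes in the generated style. Attributes other than color and size (for instance face) are dropped by this sketch, and a single-quoted value that itself contains a double quote would produce a broken style attribute.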
