How to replace content between HTML tags without replacing the tags themselves

Suppose I have a string like this:

<code>Blah blah Blah
enter code here</code>
<code class="lol">enter code here
fghfgh</code>

I want to use JavaScript to replace everything between the <code> tags, using a callback function that HTML-encodes it.

This is what I have currently:

function code_parsing(data) {
    // Don't escape & because we need that... in case we deliberately write them in
    // p1 is capture group 1: the text between the opening and closing tags
    var escape_html = function(data, p1, p2, p3, p4) {
        return p1.replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/"/g, "&quot;").replace(/'/g, "&#039;");
    };

    data = data.replace(/<code[^>]*>([\s\S]*?)<\/code>/gm, escape_html);
    // \[start\](.*?)\[end\]
    return data;
}

Unfortunately, this function removes the <code> tags and replaces each match with just its content. I would like to keep the <code> tags, along with any number of attributes. If I just hardcode the <code> tag back in, I will lose the attributes.

I know regex isn't the best tool, but there won't be any nested elements in it.

You shouldn't use regular expressions to parse HTML.

That said, you need to capture the parts you want to preserve (here, the opening and closing tags) in parenthesized groups, and have your replacer concatenate them back around the part you manipulate.

data.replace(/(<code[^>]*>)([\s\S]*?)(<\/code>)/g,
             function (_, startTag, body, endTag) {
               return startTag + escapeHtml(body) + endTag;
             })
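The snippet above calls escapeHtml without defining it. As a minimal self-contained sketch, assuming escapeHtml does the same replacements as the question's escape_html (and, like it, deliberately leaves & alone), the whole thing might look like this. The name encodeCodeBodies is hypothetical and only used for illustration.

```javascript
// Hypothetical helper: mirrors the replacements in the question's escape_html.
// "&" is intentionally not escaped, as in the question.
function escapeHtml(text) {
  return text
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#039;");
}

// Hypothetical wrapper name; the regex and replacer are the ones from the answer.
function encodeCodeBodies(data) {
  return data.replace(
    /(<code[^>]*>)([\s\S]*?)(<\/code>)/g,
    function (_, startTag, body, endTag) {
      // Keep both tags verbatim (attributes included); escape only the body.
      return startTag + escapeHtml(body) + endTag;
    }
  );
}

const sample = '<code class="lol">if (a < b) { say("hi"); }</code>';
console.log(encodeCodeBodies(sample));
// <code class="lol">if (a &lt; b) { say(&quot;hi&quot;); }</code>
```

Because the opening tag is captured as a whole, its attributes survive without the replacer needing to know anything about them.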

To understand why you shouldn't use regular expressions to parse HTML, consider what this does to inputs like these:

<code title="Shows how to tell whether x > y">if (x &gt; y) { ... }</code>

<code lang="js">node.style.color = "<code lang="css">#ff0000</code>"</code>

<code>foo</CODE >

<textarea><code>My HTML code goes here</code></textarea>

<code>foo  <!-- commented out </code> --></code>
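To make two of these cases concrete, here is a small sketch that runs the capture-group regex over the first and third inputs. The helper escapeHtml and the name encodeWithRegex are assumptions for illustration; escapeHtml is assumed to do the same replacements as the question's escape_html.

```javascript
// Hypothetical helper, assumed equivalent to the question's escape_html.
function escapeHtml(text) {
  return text
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#039;");
}

// Hypothetical name; applies the capture-group regex from the answer.
function encodeWithRegex(data) {
  return data.replace(
    /(<code[^>]*>)([\s\S]*?)(<\/code>)/g,
    (_, startTag, body, endTag) => startTag + escapeHtml(body) + endTag
  );
}

// Case 1: a ">" inside an attribute value. [^>]* stops at that ">",
// so the rest of the attribute is treated as body text and escaped.
const withGt =
  '<code title="Shows how to tell whether x > y">if (x &gt; y) { ... }</code>';
console.log(encodeWithRegex(withGt));
// <code title="Shows how to tell whether x > y&quot;&gt;if (x &gt; y) { ... }</code>

// Case 2: an HTML parser accepts "</CODE >" as a closing tag, but the
// case-sensitive pattern does not, so nothing is matched or escaped.
const mixedCase = "<code>foo</CODE >";
console.log(encodeWithRegex(mixedCase) === mixedCase); // true
```

In the first case the title attribute loses its closing quote, so the output is no longer the markup that was intended.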

Simple solution: in your escape_html function, after the escaping is done on the string but before you return it, prepend and append your tags to the string and return the full thing.

Sometimes the simplest answer is the best :)
