
Replace special characters within a block using regex

I need to replace all < and > characters inside a [code] block. I do NOT want to select and replace the whole content of the block; I only want to match the < and > inside it, temporarily replace them with other characters, do my other replacements, and then turn them back into < and > inside [code].

The solution that I use:

str = str.replace(/<(?=[^\[]*\[\/code\])/gi, "&_lt_;");
str = str.replace(/>(?=[^\[]*\[\/code\])/gi, "&_gt_;");

// DO OTHER REPLACEMENT/CUSTOMIZATION HERE

str = str.replace(/&_lt_;/gi, "<");
str = str.replace(/&_gt_;/gi, ">");

The only problem is that if the content between [code] and [/code] contains the character [, the replacement does not work for anything before that character in the block. How can I fix this?

Example that works:

<b>
[code]
<form action="nd.php" method="post">
<b>
<strong>
[/code]
<b>

Example that does not work:

<b>
[code]
<form action="nd.php" method="post">
<b>
$_POST[
<strong>
[/code]
<b>

EDIT: Please only provide a simple regex replace solution. I cannot use a callback function for this issue.

The accepted answer for the linked question doesn't work for me on the "example that works". However, the other answer does, and it also works for the "example that does not work" (there was a typo, though).

Try the following regex:

/(\[code\][\s\S]*?\[\/code\])|<[\s\S]*?>/g

In the replace() function, you would use:

.replace(/(\[code\][\s\S]*?\[\/code\])|<[\s\S]*?>/g, '$1'); 
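To see what this single replace does, here is a minimal sketch run against the question's failing example. The helper name stripTagsOutsideCode and the input variable are made up for illustration. Note that this strips tags outside the blocks rather than swapping in temporary placeholders.

```javascript
// Sketch only: `stripTagsOutsideCode` is a hypothetical helper name.
function stripTagsOutsideCode(str) {
    // Alternative 1 captures a whole [code]...[/code] block into $1,
    // so the block is replaced by itself.
    // Alternative 2 matches an HTML tag and captures nothing,
    // so '$1' expands to an empty string and the tag disappears.
    return str.replace(/(\[code\][\s\S]*?\[\/code\])|<[\s\S]*?>/g, '$1');
}

// The "example that does not work" from the question, including the stray '['.
var input = '<b>\n[code]\n<form action="nd.php" method="post">\n<b>\n$_POST[\n<strong>\n[/code]\n<b>';

console.log(stripTagsOutsideCode(input));
```

The stray [ after $_POST is harmless here because the block is consumed as a unit by [\s\S]*? instead of being probed with [^\[]*.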

EDIT
If I understand correctly, your end goal is to keep all of the content within [code][/code] the same, but be able to do replacements on all HTML tags that are outside of these blocks (which may or may not mean fully stripping the characters)?

If this is the case, there is no need for a long list of regexes. The above regex can be used (with a slight modification) and it covers many cases. Combine the regex/replace with a callback function to handle your extra replacements:

var replaceCallback = function(match) {
    // if the match's first characters are '[code]', we have a '[code][/code]' block
    if (match.substring(0, 6) == '[code]') {
        // do any special replacements on this block; by default, return it untouched
        return match;
    }
    // the match you now have is an HTML tag; it can be `<tag>` or `</tag>`
    // do any special replacements; by default, return an empty string
    return '';
}

str = str.replace(/(\[code\][\s\S]*?\[\/code\])|(<[\s\S]*?>)/g, replaceCallback);

The one regex modification was to add a group around the HTML-tag section (the second part of the regex). This makes the tag available to the callback as its own captured group; the full match is passed as the first argument either way.

UPDATE ([code] isn't literal)
Per a comment, I've realized that the tag [code] isn't literal: you want to cover all BBCode-style tags. This is just as easy as the above example (even easier in the callback). Instead of the word code in the regex, you can use [a-z]+ to cover all alphabetical characters. Then, inside the callback you can just check the very first character; if it's a [, you're in a code block. Otherwise you have an HTML tag that's outside a code block:

var replaceCallback = function(match) {
    // if the match's first character is '[', we have a '[code][/code]' block
    if (match.substring(0, 1) == '[') {
        // do any special replacements on this block; by default, return it untouched
        return match;
    }
    // the match you now have is an HTML tag; it can be `<tag>` or `</tag>`
    // do any special replacements; by default, return an empty string
    return '';
}

str = str.replace(/(\[[a-z]+\][\s\S]*?\[\/[a-z]+\])|(<[\s\S]*?>)/gi, replaceCallback);

Also note that I added an i to the regex's options to ignore case (otherwise you'll need [a-zA-Z] to handle capital letters).
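One thing to watch with the generic [a-z]+ pattern is that the closing tag is not required to have the same name as the opening tag, so a block can end early at an unrelated closing tag. A possible refinement, offered as a sketch rather than as part of the original answer, is to capture the tag name and reuse it with a backreference. The function name and sample string below are hypothetical.

```javascript
// Sketch only: `stripTagsOutsideBBCode` is a hypothetical helper name.
function stripTagsOutsideBBCode(str) {
    // Group 2 captures the tag name; \2 forces the closing tag to use the same name.
    // With the i flag the backreference comparison is case-insensitive as well.
    var pattern = /(\[([a-z]+)\][\s\S]*?\[\/\2\])|(<[\s\S]*?>)/gi;
    return str.replace(pattern, function (match) {
        // A match starting with '[' is a whole BBCode block: keep it untouched.
        // Anything else is an HTML tag outside a block: strip it.
        return match.charAt(0) === '[' ? match : '';
    });
}

// The inner [/b] no longer terminates the [quote] block.
var sample = '<i>[quote]<b>[/b] still quoted</b>[/quote]</i>';
console.log(stripTagsOutsideBBCode(sample));
```

This still assumes blocks with the same name are not nested inside each other.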

Here's my edited answer. Sorry again.

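str = str.replace(/(\[code\])([\s\S]*?)(\[\/code\])/g, function(match, open, body, close) {
    // [\s\S] is used instead of . so the block body can span several lines
    return open + body.replace(/</g,'&lt;').replace(/>/g,'&gt;') + close;
});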
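Since the question rules out callbacks, one callback-free option is to keep the original lookahead idea but replace [^\[]* with a pattern that only refuses to cross a new [code] opener. The following is a minimal sketch under that assumption, not a tested drop-in fix. The function names are hypothetical, the placeholders &_lt_; and &_gt_; come from the question, and the tag-stripping step merely stands in for whatever "other replacements" are needed.

```javascript
// Sketch only: helper names are hypothetical.
// A '<' or '>' is treated as "inside a block" when a [/code] follows it
// with no new [code] opener in between. A lone '[' no longer breaks the scan.
var INSIDE_BLOCK = '(?=(?:(?!\\[code\\])[\\s\\S])*?\\[\\/code\\])';

function protectCodeBrackets(str) {
    return str
        .replace(new RegExp('<' + INSIDE_BLOCK, 'gi'), '&_lt_;')
        .replace(new RegExp('>' + INSIDE_BLOCK, 'gi'), '&_gt_;');
}

function restoreCodeBrackets(str) {
    return str.replace(/&_lt_;/g, '<').replace(/&_gt_;/g, '>');
}

var failing = '<b>\n[code]\n<form action="nd.php" method="post">\n<b>\n$_POST[\n<strong>\n[/code]\n<b>';

var guarded = protectCodeBrackets(failing);
// Stand-in for the "other replacements": strip the tags that are still real tags.
var processed = guarded.replace(/<[^>]*>/g, '');
console.log(restoreCodeBrackets(processed));
```

This assumes [code] blocks are properly closed and not nested. The lookahead rescans the text for every bracket, so it may be slow on very large inputs.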
