Using regular expression to replace special characters outside of html tags

I'm trying to find and replace some special characters with HTML entities, i.e. '&' converts to '&amp;' and '>' converts to '&gt;'. This is for an email builder tool, and some older clients need these characters replaced with HTML entities.

The user passes in a string, and I use JavaScript to loop through an array of objects. Each object names a character to find and the HTML entity to replace it with.

You can see the regex code I'm using here:

https://regex101.com/r/WZh5tA/2

    escapeCharacter: function(string){
      var replaceChar = [
        {reg : '&', replace: '&amp;'},
        {reg : '"', replace: '&quot;'},
        {reg : '£', replace: '&pound;'},
        {reg : '€', replace: '&euro;'},
        {reg : 'é', replace: '&eacute;'},
        {reg : '–', replace: '&ndash;'},
        {reg : '®', replace: '&reg;'},
        {reg : '™', replace: '&trade;'},
        {reg : '‘', replace: '&lsquo;'},
        {reg : '’', replace: '&rsquo;'},
        {reg : '“', replace: '&ldquo;'},
        {reg : '”', replace: '&rdquo;'},
        {reg : '#', replace: '&#35;'},
        {reg : '©', replace: '&copy;'},
        {reg : '@', replace: '&#64;'},
        {reg : '$', replace: '&#36;'},
        {reg : '\\(', replace: '&#40;'},
        {reg : '\\)', replace: '&#41;'},
        {reg : '<', replace: '&lt;'},
        {reg : '>', replace: '&gt;'},
        {reg : '…', replace: '&hellip;'},
        {reg : '-', replace: '&#45;'},
        {reg : "'", replace: '&#39;'},
        {reg : '\\*', replace: '&#42;'},
        {reg : ',', replace: '&sbquo;'}
      ];
      var s = string;
      replaceChar.forEach(function(obj){
        var regEx = new RegExp(obj.reg+"(?!([^<]+)?>)", "g");
        s = s.replace(regEx, obj.replace);
      });

      return s;
    }

The problem occurs when the user passes a string with HTML tags (which they should be allowed to do). For example, the string could be:

'This is an example of some <b>bold</b> text'

My find and replace tool works its magic, but I think I'm missing something because I get this output:

'This is an example of some <b>bold</b&gt; text'

You may use

s = s.replace(
      new RegExp("(<[^<>]*>)|" + obj.reg.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&'), "g"), 
          function ($0, $1) { return $1 ? $0 : obj.replace } 
);

Notes:

  • You need to escape obj.reg before using it in a regular expression, hence .replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&') is required. Because of that escaping, the reg values should hold the literal characters, e.g. '(' rather than '\\('.
  • The (<[^<>]*>)| alternative is tried first: it matches <...> substrings and captures them into Group 1 before the required characters are tried. The callback passed as the replacement argument checks whether the first group matched. If it did, the whole match is returned as is; otherwise, the replacement occurs.
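
Putting the snippet and the notes together, a minimal sketch of the whole function might look like the following. The replaceChar table is shortened for illustration and holds literal characters, and the escapeRegExp helper name is an assumption introduced here, not part of the original code.

```javascript
// Shortened, illustrative table; reg holds literal characters (no manual escaping).
var replaceChar = [
  { reg: '&', replace: '&amp;' },   // keep '&' first so later entities are not re-escaped
  { reg: '<', replace: '&lt;' },
  { reg: '>', replace: '&gt;' },
  { reg: '(', replace: '&#40;' },
  { reg: '*', replace: '&#42;' }
];

// Hypothetical helper: escape regex metacharacters in a literal string.
function escapeRegExp(text) {
  return text.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

function escapeCharacter(string) {
  var s = string;
  replaceChar.forEach(function (obj) {
    // Alternative 1 captures a whole <...> tag; alternative 2 is the character itself.
    var regEx = new RegExp('(<[^<>]*>)|' + escapeRegExp(obj.reg), 'g');
    s = s.replace(regEx, function (match, tag) {
      // If Group 1 matched, this is a tag: return it unchanged.
      return tag ? match : obj.replace;
    });
  });
  return s;
}

console.log(escapeCharacter('a > b & <b>c*</b>'));
```

With this approach the tags in 'This is an example of some <b>bold</b> text' are left untouched. One caveat of the pattern: a stray '<' that is followed later by a '>' (as in 'a < b > c') is treated as a tag and skipped.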
