JavaScript: remove HTML tags with a regex, but keep their content and keep <a> tags

How can I remove all tags in a string except <a>, while keeping the text inside the removed tags?

For example, <em>Bold</em><a>Go here</a> should become Bold<a>Go here</a>.

You can remove every substring that looks like <...>, other than <a> or </a>, with:

<(?!\/?a>)[^>]*>

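Do not forget to add the case-insensitive i flag so that <A> is left alone as well, and the g flag so that every tag is replaced rather than just the first. If you do not need to keep the closing </a>, you can use <(?!a>)[^>]*> instead. Note that the lookahead only spares the bare tags <a> and </a>; an anchor that carries attributes, such as <a href="...">, would still be removed.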
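A minimal sketch of how the pattern can be applied with String.prototype.replace. The function names are hypothetical, and the second pattern is an assumption that is not part of the answer: it widens the lookahead to a[\s>] so that anchors with attributes survive too. Both variants assume no > appears inside an attribute value.

```javascript
// Pattern from the answer, with the g and i flags: removes every tag
// except the bare <a> and </a> (in any letter case).
function stripTagsExceptBareA(input) {
  return input.replace(/<(?!\/?a>)[^>]*>/gi, '');
}

// Assumed variant (not from the answer): the lookahead accepts "a" followed
// by whitespace or ">", so <a href="..."> is kept while <abbr> is still removed.
function stripTagsExceptA(input) {
  return input.replace(/<(?!\/?a[\s>])[^>]*>/gi, '');
}

const bare = stripTagsExceptBareA('<em>Bold</em><a>Go here</a>');
console.log(bare); // Bold<a>Go here</a>

const withAttrs = stripTagsExceptA('<p>Hi <a href="x">link</a> <abbr>etc</abbr></p>');
console.log(withAttrs); // Hi <a href="x">link</a> etc
```

The text inside removed tags is untouched because only the <...> delimiters themselves are matched and replaced with an empty string.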

Try this:

function strip_tags(input, allowed) {
  // Normalize `allowed` to a lowercase string of bare tags, e.g. '<a><b><c>'.
  allowed = (((allowed || '') + '')
    .toLowerCase()
    .match(/<[a-z][a-z0-9]*>/g) || [])
    .join('');
  // `tags` matches any opening or closing tag and captures its name in group 1.
  var tags = /<\/?([a-z][a-z0-9]*)\b[^>]*>/gi,
    commentsAndPhpTags = /<!--[\s\S]*?-->|<\?(?:php)?[\s\S]*?\?>/gi;
  // Drop comments and PHP blocks first, then keep only tags whose name is allowed.
  return input.replace(commentsAndPhpTags, '')
    .replace(tags, function($0, $1) {
      return allowed.indexOf('<' + $1.toLowerCase() + '>') > -1 ? $0 : '';
    });
}

var html = 'some html code';
html = strip_tags(html, '<a>');

source: http://phpjs.org/functions/strip_tags/
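To see what the allowlist approach does on a concrete input, here is a condensed sketch of the same idea. It is not the phpjs code: stripTagsAllowlist is a hypothetical name, it takes an array of tag names rather than a '<a><b>' string, and it skips the PHP-tag handling.

```javascript
// Sketch of the allowlist idea: match every tag, capture its name,
// and keep the tag only if the name is in the allowed set.
function stripTagsAllowlist(input, allowedNames) {
  const allowed = new Set(allowedNames.map((name) => name.toLowerCase()));
  const comments = /<!--[\s\S]*?-->/g;
  const tags = /<\/?([a-z][a-z0-9]*)\b[^>]*>/gi;
  return input
    .replace(comments, '')
    .replace(tags, (tag, name) => (allowed.has(name.toLowerCase()) ? tag : ''));
}

const sample = '<p>See <a href="/docs">the docs</a><!-- note --> and <em>more</em>.</p>';
const cleaned = stripTagsAllowlist(sample, ['a']);
console.log(cleaned); // See <a href="/docs">the docs</a> and more.
```

Because the tag name is captured as a whole and compared against the allowlist, anchors with attributes are kept and look-alike names such as <abbr> are still stripped, which the single-lookahead regex does not handle on its own.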
