
Javascript RegEx replace all characters not within HTML tags

Looking for a bit of help, my regex is a bit rusty...

I'm trying to replace every character that is not inside an HTML tag with a single replacement character, in JavaScript.

For example, using a dash "-" as the replacement character,

<div class="test">Lorem Ipsum <br/> Dolor Sit Amet</div>

Would be replaced by:

<div class="test">------------<br/>---------------</div>

So I'm looking for

str.replace(/YourMagicalRegEx/g, '-');

Please help. I know how to match the text outside HTML tags with a regex, and the text inside HTML tags, but replacing each individual character outside the tags seems quite tricky!

Additional Challenge: Must be IE7 and up compatible.

Using jQuery:

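var html = '<div class="test">Lorem Ipsum <br/> Dolor Sit Amet</div>';
// Wrap the markup so that find('*') also reaches its top-level elements
var node = $("<div>" + html + "</div>");
node.find('*').contents().each(function() {
    // nodeType 3 is a text node
    if (this.nodeType == 3) {
        // join() puts '-' between slots, so length + 1 slots give length dashes
        this.nodeValue = Array(this.nodeValue.length + 1).join('-');
    }
});
console.log(node.html());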

(I don't have IE7 at hand, let me know if this works).
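
One detail worth checking in the `Array(...).join('-')` idiom used above: `join` places the separator between elements, so producing n dashes needs an array of n + 1 empty slots. A minimal sketch, using a hypothetical helper named `repeatDash` (not part of jQuery or of the original answer):

```javascript
// Hypothetical helper: build a string of `count` dashes without
// String.prototype.repeat, which old browsers such as IE7 do not have.
function repeatDash(count) {
    // join() puts the separator BETWEEN slots,
    // so count + 1 empty slots give exactly count dashes.
    return new Array(count + 1).join('-');
}

var text = 'Lorem Ipsum ';
var masked = repeatDash(text.length);
console.log(masked.length === text.length); // true
```

With `new Array(count)` instead, the result would be one dash short for every text node.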

If you prefer regular expressions, it goes like this:

html = html.replace(/<[^<>]+>|./g, function($0) {
    // charAt(0) rather than $0[0]: IE7 does not support index access on strings
    return $0.charAt(0) == '<' ? $0 : '-';
});

Basically, we replace tags with themselves and out-of-tags characters with dashes.
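
To make the behaviour concrete, here is a small self-contained sketch of the same regex applied to the question's sample markup. The function name `maskOutsideTags` is an assumption for illustration; it is not from the original answer.

```javascript
// Hypothetical wrapper around the replace call shown above.
function maskOutsideTags(html) {
    // Alternative 1 matches a whole tag; alternative 2 matches any other
    // single character (except line terminators, which `.` does not match).
    return html.replace(/<[^<>]+>|./g, function (match) {
        // Tags are returned unchanged; everything else becomes a dash.
        return match.charAt(0) === '<' ? match : '-';
    });
}

var input = '<div class="test">Lorem Ipsum <br/> Dolor Sit Amet</div>';
console.log(maskOutsideTags(input));
// <div class="test">------------<br/>---------------</div>
```

Keep in mind that this treats the markup as a flat string: a `>` inside an attribute value (for example `title="a > b"`) ends the tag match early, and an entity such as `&amp;` is turned into five dashes rather than one. The DOM-based approaches avoid both problems.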

Instead of using a regex-only approach, you can find all text nodes within the document and replace their content with hyphens.

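Using the TreeWalker API (available in IE9 and later; IE7 and IE8 do not support it):

// The last two arguments are optional in modern browsers, but IE throws if they are omitted
var tree = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT, null, false);

while (tree.nextNode()) {
    var textNode = tree.currentNode;
    textNode.nodeValue = textNode.nodeValue.replace(/./g, '-');
}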
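
Note that `.` does not match line terminators, so any line breaks inside a text node survive the replacement. If newlines should be masked as well, a character class such as `[\s\S]` matches every character. A small sketch on plain strings (no DOM needed):

```javascript
var text = 'Lorem\nIpsum';

// `.` skips the newline, so the line break is preserved.
var keepBreaks = text.replace(/./g, '-');

// [\s\S] matches any character, including line terminators.
var maskAll = text.replace(/[\s\S]/g, '-');

console.log(JSON.stringify(keepBreaks)); // "-----\n-----"
console.log(maskAll.length);             // 11
```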

A recursive solution:

function findTextNodes(node, fn) {
  for (node = node.firstChild; node; node = node.nextSibling) {
    // Numeric values are used because IE7 and IE8 do not define
    // Node.TEXT_NODE (3) or Node.ELEMENT_NODE (1)
    if (node.nodeType === 3) fn(node);
    else if (node.nodeType === 1 && node.nodeName !== 'SCRIPT') findTextNodes(node, fn);
  }
}

findTextNodes(document.body, function (node) {
  node.nodeValue = node.nodeValue.replace(/./g, '-');
});

The predicate node.nodeName !== 'SCRIPT' is required to prevent the function from replacing any script content within the body.
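
The effect of that predicate can be checked without a browser. The sketch below runs the same traversal over plain objects that imitate DOM nodes; `makeText` and `makeElement` are hypothetical stand-ins written for this example, and numeric `nodeType` values (1 for elements, 3 for text) are used because no `Node` global exists outside a browser.

```javascript
// Same traversal as above, with numeric nodeType values.
function findTextNodes(node, fn) {
    for (node = node.firstChild; node; node = node.nextSibling) {
        if (node.nodeType === 3) fn(node);
        else if (node.nodeType === 1 && node.nodeName !== 'SCRIPT') findTextNodes(node, fn);
    }
}

// Hypothetical stand-ins for DOM nodes, just enough for the traversal.
function makeText(value) {
    return { nodeType: 3, nodeName: '#text', nodeValue: value, firstChild: null, nextSibling: null };
}
function makeElement(name, children) {
    var el = { nodeType: 1, nodeName: name, firstChild: null, nextSibling: null };
    for (var i = 0; i < children.length; i++) {
        if (i === 0) el.firstChild = children[i];
        else children[i - 1].nextSibling = children[i];
    }
    return el;
}

var visible = makeText('Hello');
var code = makeText('alert(1);');
var body = makeElement('BODY', [
    makeElement('P', [visible]),
    makeElement('SCRIPT', [code])
]);

findTextNodes(body, function (node) {
    node.nodeValue = node.nodeValue.replace(/./g, '-');
});

console.log(visible.nodeValue); // -----
console.log(code.nodeValue);    // alert(1);
```

The text inside the paragraph is masked, while the script's text node is never visited. If inline styles should be left alone too, the same check could be extended to `'STYLE'`.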

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address.
