How can I change all instances of a string within an HTML document via JS without disturbing its markup?

Question

I want to write a script that will cause a page to "decay", changing characters at random. Let's say I've got some HTML that looks like this:

<div class="eebee">
    Lorem Ipsum <a href="http://example.com">Anchor here</a>
</div>

And I want to replace every instance of "e" with "∑" so it'll be

<div class="eebee">
    Lor∑m Ipsum <a href="http://example.com">Anchor h∑r∑</a>
</div>

but, obviously, I don't want it to be

<div class="∑∑b∑∑">
    Lor∑m Ipsum <a hr∑f="http://∑xample.com">Anchor h∑r∑</a>
</div>

How can I accomplish this? With a DOM parser? Or just some regex that searches for content between ">" and "<"?

EDIT: Based on Oriol's solution below, here it is wrapped in a function that accepts any find-and-replace strings:

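function decay(find_string, replace_string) {
    // Escape regex metacharacters so find_string is matched literally.
    var re = new RegExp(find_string.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"), "g");
    var treeWalker = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT);
    while (treeWalker.nextNode()) {
        var node = treeWalker.currentNode;
        // A replacer function inserts replace_string as-is, even if it contains "$".
        node.nodeValue = node.nodeValue.replace(re, function () { return replace_string; });
    }
}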
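
For the random "decay" effect itself, as opposed to replacing every match, something along these lines might work. This is only a sketch: the names decayText, rate, glyph and random are made up for illustration and are not part of Oriol's answer. The random source is passed in as a parameter so the behaviour can be checked deterministically.

```javascript
// Hypothetical helper: replaces each non-whitespace character with `glyph`
// with probability `rate` (0 = never, 1 = always).
// `random` defaults to Math.random and can be swapped out for testing.
function decayText(text, rate, glyph, random) {
  random = random || Math.random;
  // Array.from splits the string by code point, so surrogate pairs stay intact.
  return Array.from(text, function (ch) {
    // Leave whitespace alone so the shape of the text survives.
    if (/\s/.test(ch)) return ch;
    return random() < rate ? glyph : ch;
  }).join("");
}

// Hypothetical wrapper for use in a browser: decays every text node under <body>.
function decayPage(rate, glyph) {
  var walker = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT);
  while (walker.nextNode()) {
    var node = walker.currentNode;
    node.nodeValue = decayText(node.nodeValue, rate, glyph);
  }
}
```

Calling decayPage(0.05, "∑") repeatedly, for example from setInterval, would then erode the visible text a little more each time while leaving tags and attributes alone.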

Answer 1

You can write a simple recursive function which iterates all text nodes:

function iterateTextNodes(root, callback) {
    for(var i=0; i<root.childNodes.length; ++i) {
        var child = root.childNodes[i];
        if(child.nodeType === 1) { // element node
            iterateTextNodes(child, callback); // recursive call
        } else if(child.nodeType === 3) { // text node
            callback(child); // pass it to callback
        }
    }
}
iterateTextNodes(document.body, function(node) {
    node.nodeValue = node.nodeValue.replace(/e/g, '∑');
});

 <div class="eebee">Lorem Ipsum <a href="http://example.com">Anchor here</a></div>

Or, if you prefer a built-in approach, you can use a tree walker:

var treeWalker = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT);
while(treeWalker.nextNode()) {
    var node = treeWalker.currentNode;
    node.nodeValue = node.nodeValue.replace(/e/g, '∑');
}

 <div class="eebee">Lorem Ipsum <a href="http://example.com">Anchor here</a></div>
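
One thing to watch for: both approaches visit every text node under document.body, and that includes the contents of any script or style elements placed inside the body. Rewriting the text of a style element changes the CSS itself. A possible variant of the recursive function that skips those subtrees is sketched below; the names iterateReadableTextNodes and SKIPPED_TAGS are made up for this example.

```javascript
// Hypothetical list of elements whose text is code rather than page copy.
var SKIPPED_TAGS = { SCRIPT: true, STYLE: true };

// Like a plain recursive text-node walk, but does not descend into SKIPPED_TAGS.
function iterateReadableTextNodes(root, callback) {
  for (var i = 0; i < root.childNodes.length; ++i) {
    var child = root.childNodes[i];
    if (child.nodeType === 1) { // element node
      // nodeName is upper-case for HTML elements in an HTML document.
      if (!SKIPPED_TAGS[child.nodeName]) {
        iterateReadableTextNodes(child, callback);
      }
    } else if (child.nodeType === 3) { // text node
      callback(child);
    }
  }
}

// Hypothetical usage in a browser, mirroring the "e" to "∑" example.
function replaceReadableText(root) {
  iterateReadableTextNodes(root, function (node) {
    node.nodeValue = node.nodeValue.replace(/e/g, "∑");
  });
}
```

In a page you would call replaceReadableText(document.body). With the tree walker, a similar effect should be achievable by passing a node filter as the third argument to createTreeWalker.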

Some notes:

Don't use an HTML parser. If you parse an HTML string and then replace the old DOM tree with the new one, you will lose all the internal state of the elements (event listeners, checkedness, ...).
Above all, never use a regex to parse HTML. You can't parse (X)HTML with regex.
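
To illustrate the regex point, here is a small sketch of the "content between > and <" idea from the question, run on a plain string. The function name naiveReplace and the sample markup are made up for this example. A ">" inside a quoted attribute value is valid HTML, and it is enough to make the regex treat attributes as text.

```javascript
// Naive approach: treat everything between ">" and "<" as text content.
function naiveReplace(html) {
  return html.replace(/>([^<]*)</g, function (match, text) {
    return ">" + text.replace(/e/g, "∑") + "<";
  });
}

// The ">" inside the title attribute is where the regex starts matching.
var html = '<a title="x > e" href="#">here</a>';
var result = naiveReplace(html);
console.log(result); // <a title="x > ∑" hr∑f="#">h∑r∑</a>
```

The href attribute name has been corrupted, which is exactly the kind of damage the question wanted to avoid. Walking the text nodes does not have this problem, because the browser has already separated text from markup.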

solution1
3 ACCEPTED 2016-05-02 02:25:38
