JavaScript - Global replace string between tags

Question

Could someone please help me with the JavaScript regexp code to replace all <br /> tags found within <pre> elements with a newline ("\n") character? For example, a string passed to the function containing the following:

<pre class="exampleclass">1<br />2<br />3</pre>

Should be returned as (newlines not shown, though I hope you get the idea):

<pre class="exampleclass">1(newline)2(newline)3</pre>

Another example:

<div>foo<br />bar<pre>1<br />2</pre></div>

Returned as:

<div>foo<br />bar<pre>1(newline)2</pre></div>

Note that the class and the element content are dynamic, along with other content in the string (other divs etc.). On the other hand, the <br /> tag does not change, so there's no need to cater for <br> or other variants.

NB: I'm working with strings, not HTML elements, just in case the way I have presented the question causes any confusion.

Answer 1

You could use

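str.match(/<pre(?:.*?)>(?:[\s\S]*?)<\/pre>/g);

And then, for each match:

replaced = match.replace(/<br \/>/g, '\n');
str = str.replace(match, replaced);

So probably something like this:

// [\s\S] is used for the body so that a <pre> block spanning several lines still matches
var matches = str.match(/<pre(?:.*?)>(?:[\s\S]*?)<\/pre>/g) || [], // match() returns null when nothing is found
    len = matches.length,
    i;

for (i = 0; i < len; i++) {
    // replace() returns a new string, so the result has to be assigned back
    str = str.replace(matches[i], matches[i].replace(/<br \/>/g, '\n'));
}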

EDIT: changed to match <pre class=""> as well.
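
The match-then-replace loop above can also be written as a single pass by handing replace() a callback. This is only a sketch under the question's assumptions (the tag is always written as <br /> and <pre> blocks are not nested); brToNewlineInPre is a made-up name, not something from the original answer. A side benefit of the callback form is that the replacement text is not scanned for special "$" patterns.

```javascript
// Sketch: convert <br /> to "\n" only inside <pre>...</pre> blocks.
// brToNewlineInPre is a hypothetical helper name used for illustration.
function brToNewlineInPre(str) {
    // [^>]* keeps the opening-tag match inside a single tag;
    // [\s\S]*? lets the body span several lines and stops at the first </pre>.
    var preBlock = /<pre\b[^>]*>[\s\S]*?<\/pre>/g;
    return str.replace(preBlock, function (block) {
        // Only the text of the matched <pre> block is touched here.
        return block.replace(/<br \/>/g, '\n');
    });
}

console.log(brToNewlineInPre('<div>foo<br />bar<pre>1<br />2</pre></div>'));
```

The <br /> outside the <pre> block is left alone because it never appears inside a matched block.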

Answer 2

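Had it been a document, then:

var allPre = document.getElementsByTagName('pre');
for (var i = 0, n = allPre.length; i < n; i++) {
    // innerHTML usually serializes the tag as <br>, so accept <br>, <br/> and <br /> in any case
    allPre[i].innerHTML = allPre[i].innerHTML.replace(/<br\s*\/?>/gi, "\n");
}

The case-insensitive flag is there since <br /> could come back as <BR> in some innerHTML implementations.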

Have a look here too: Replace patterns that are inside delimiters using a regular expression call

Answer 3

You could use the DOM to do this and avoid trying to parse HTML with regex. However, this will leave you at the mercy of the browser's implementation of innerHTML. For example, IE will return tag names in upper case and will not necessarily close all tags.

See it in action: http://jsfiddle.net/timdown/KYRSU/

var preBrsToNewLine = (function() {
    function convert(node, insidePre) {
        if (insidePre && node.nodeType == 1 && node.nodeName == "BR") {
            node.parentNode.replaceChild(document.createTextNode("\n"), node);
        } else {
            insidePre = insidePre || (node.nodeType == 1 && node.nodeName == "PRE");
            for (var i = 0, children = node.childNodes, len = children.length; i < len; ++i) {
                convert(children[i], insidePre);
            }
        }
    }

    return function(str) {
        var div = document.createElement("div");
        div.innerHTML = str;
        convert(div, false);
        return div.innerHTML;
    }
})();

var str = "<div>foo<br />bar<pre>1<br />2</pre></div>";
window.alert(preBrsToNewLine(str));

Answer 4

I (and others) think it's a bad idea to use regular expressions to parse HTML (or XML). You probably want to use a recursive state machine. Will something like this resolve the issue? There's a lot of room to optimize, but I think it illustrates the idea.

function replace(input, pre) {
    var output = [];
    var tag = null;
    var tag_re = /<(\w+)[^>]*?(\/)?>/; // This is a bit simplistic and will have problems with > in attribute values
    while (tag_re.exec(input)) {
        output.push(RegExp.leftContext);
        input = RegExp.rightContext;
        tag = RegExp.$1;
        if (pre && tag == 'br') {
            output.push('\n');
        } else {
            output.push(RegExp.lastMatch);
        }

        if (!RegExp.$2) {
            // not a self closing tag
            output.push(replace(input, tag=='pre'));
            return output.join('');
        }
    }
    output.push(input);
    return output.join('');
}
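
One thing to watch with the recursive version above: its tag pattern never matches closing tags, so the "inside pre" flag is not cleared at </pre>, and a <br /> that directly follows a closed <pre> block is also converted. A possible alternative, sketched below, scans the tags once and keeps a counter of open <pre> elements instead of recursing. The names replaceBrInPre, preDepth and tagRe are illustrative only, and the pattern carries the same limitation as the original (a ">" inside an attribute value will confuse it).

```javascript
// Sketch of a single-pass scanner; all names here are hypothetical.
function replaceBrInPre(input) {
    // Groups: 1 = "/" of a closing tag, 2 = tag name, 3 = "/" of a self-closing tag.
    var tagRe = /<(\/)?(\w+)[^>]*?(\/)?>/g;
    var output = [];
    var preDepth = 0; // number of <pre> elements currently open
    var last = 0;     // index just past the previously handled tag
    var m;

    while ((m = tagRe.exec(input)) !== null) {
        output.push(input.slice(last, m.index)); // text before this tag
        last = tagRe.lastIndex;

        var isClosing = Boolean(m[1]);
        var name = m[2].toLowerCase();
        var selfClosing = Boolean(m[3]);

        if (name === 'br' && preDepth > 0) {
            output.push('\n'); // drop the tag, emit a newline instead
            continue;
        }
        if (name === 'pre' && !selfClosing) {
            // Track entering and leaving <pre>; never go below zero on stray </pre>.
            preDepth = isClosing ? Math.max(0, preDepth - 1) : preDepth + 1;
        }
        output.push(m[0]); // every other tag is copied through unchanged
    }
    output.push(input.slice(last)); // trailing text after the last tag
    return output.join('');
}

console.log(replaceBrInPre('<pre>1<br />2</pre>foo<br />bar'));
```

Because the counter is only reset by a matching </pre>, other elements nested inside the <pre> block (a <span>, say) do not switch the replacement off.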

Answer summary

4 answers

Answer 1: 4, ACCEPTED, 2010-12-16 12:42:30
Answer 2: 0, 2010-12-16 12:42:29
Answer 3: 0, 2010-12-16 14:40:08
Answer 4: 0, 2010-12-17 03:53:22
