JavaScript regex: remove text between HTML tags

Question

I want to remove the text between any HTML tags.

Example:

<div>
   <h1>Title</h1>
</div>

My result variable should be:

<div>
    <h1></h1>
</div>

Answer 1

If, as your question suggests, you want to remove all text from between any HTML tags… only the real DOM is going to cut it.

function removeAllTextNodes(node) {
    if (node.nodeType === 3) {
        node.parentNode.removeChild(node);
    } else if (node.childNodes) {
        for (var i = node.childNodes.length; i--;) {
            removeAllTextNodes(node.childNodes[i]);
        }
    }
}

This, unlike textContent and innerHTML, will keep all existing element structure in place and remove only text. Note that whitespace-only text nodes, such as the indentation between tags, are removed as well.
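The backward loop in removeAllTextNodes matters: childNodes shrinks as nodes are removed, so a forward loop would skip siblings. Below is a minimal sketch that runs the same function outside a browser. makeText, makeElement and serialize are hypothetical stand-ins written for this illustration, not DOM APIs, and they mimic only the few properties the function touches.

```javascript
// Hypothetical stand-in for a DOM text node (nodeType 3).
function makeText(text) {
    return { nodeType: 3, nodeValue: text, parentNode: null };
}

// Hypothetical stand-in for a DOM element (nodeType 1).
function makeElement(tagName, children) {
    var el = {
        nodeType: 1,
        tagName: tagName,
        childNodes: [],
        parentNode: null,
        removeChild: function (child) {
            // Like a live NodeList, the list shrinks as soon as a child is removed.
            this.childNodes.splice(this.childNodes.indexOf(child), 1);
            child.parentNode = null;
            return child;
        }
    };
    children.forEach(function (child) {
        child.parentNode = el;
        el.childNodes.push(child);
    });
    return el;
}

// Same algorithm as in the answer: visit children from last to first,
// so removing a node never shifts the index of one not yet visited.
function removeAllTextNodes(node) {
    if (node.nodeType === 3) {
        node.parentNode.removeChild(node);
    } else if (node.childNodes) {
        for (var i = node.childNodes.length; i--;) {
            removeAllTextNodes(node.childNodes[i]);
        }
    }
}

// Turn the stand-in tree back into markup so the result can be inspected.
function serialize(node) {
    if (node.nodeType === 3) return node.nodeValue;
    return "<" + node.tagName + ">" +
        node.childNodes.map(serialize).join("") +
        "</" + node.tagName + ">";
}

var tree = makeElement("div", [
    makeText("a"),
    makeElement("h1", [makeText("Title")]),
    makeText("b"),
    makeText("c")
]);
removeAllTextNodes(tree);
var result = serialize(tree); // "<div><h1></h1></div>"
```

In a real page the function works on actual DOM nodes, so none of the stand-ins are needed there.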

If you really have a string, are using client-side JavaScript in a browser, and the string represents part of a document's content (not an entire document, i.e. you won't find any DTD, <html>, <head>, or <body> elements within), then you can parse it just by putting it into an element:

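// Wrapped in a function so that the return statement is valid.
function removeTextFromHtml(htmlString) {
    var container = document.createElement("div");
    container.innerHTML = htmlString;
    removeAllTextNodes(container);
    return container.innerHTML;
}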

Otherwise, you'll probably want an HTML parser for JavaScript. Regular expressions, as it's been noted, aren't great at parsing HTML.
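To make the point about regular expressions concrete, here is a small sketch that runs a typical tag-stripping pattern against two inputs. The sample strings and variable names are made up for illustration. They show two common failure modes: a greedy match that swallows text between separate elements, and nesting that no simple pattern can follow.

```javascript
var html = "<b>one</b> keep <b>two</b>";

// Greedy: [\s\S]* runs to the LAST </b>, so " keep " is deleted too.
var greedy = html.replace(/<b>[\s\S]*<\/b>/g, "");      // ""

// Lazy: stops at the first </b>, which is right for flat markup...
var lazy = html.replace(/<b>[\s\S]*?<\/b>/g, "");       // " keep "

// ...but it pairs the outer <div> with the inner </div> once tags nest,
// leaving a dangling fragment behind.
var nested = "<div>a<div>b</div>c</div>"
    .replace(/<div>[\s\S]*?<\/div>/g, "");              // "c</div>"
```

A DOM-based approach avoids both problems because the parser has already worked out where each element starts and ends.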

Answer 2

VANILLA JS TO THE RESCUE

var x = document.getElementsByTagName("h1");
for (var i=0; i<x.length; i++) {
    x[i].innerHTML = "";
}

Just insert any tag you'd like and voilà: no need for regex or a 90 KB library.

Answer 3

JavaScript can already do this with built-in functions, in a way that is conceptually superior to regex:

<div>
   <h1 id="foo">Title</h1>
</div>
<script>
   document.getElementById("foo").textContent = ""
</script>

Answer 4

You would probably want to do something like this:

var elements = document.getElementsByTagName('*');
for (var i = 0; i < elements.length; i++) {
    var element = elements[i];
    // Only clear leaf elements: setting textContent on a parent would also delete its child elements.
    if (element.children.length === 0) {
        element.textContent = '';
    }
}

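This finds all elements, loops through them, and clears the text content of every element that has no child elements. Text that sits next to child elements (for example the "Hello" in <p>Hello <b>world</b></p>) is left in place.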

You can also make this reusable like so:

var removeAllText = function() {
    var elements = document.getElementsByTagName('*');
    for (var i = 0; i < elements.length; i++) {
        var element = elements[i];
        // Only clear leaf elements, so that child elements are never deleted.
        if (element.children.length === 0) {
            element.textContent = '';
        }
    }
};

Then, whenever you want, you can do this:

removeAllText();

Answer 5

Don't use regex. Use something like loadXMLDoc() to parse the DOM and print the tags, instead of trying to remove the values from within the tags.

Answer 6

Tested this in my JS and it works for me:

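// Call replace on your own string. The lazy match and the "$1$2" replacement keep the tags and drop only the text.
htmlString.replace(/(<yourtag>)[\s\S]*?(<\/yourtag>)/g, "$1$2");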
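The pattern above targets one hard-coded tag name. If a regex really is the only option, a more general sketch is to blank out whatever sits between a ">" and the next "<", keeping whitespace-only runs so the indentation from the question survives. stripTextBetweenTags is a hypothetical name. The sketch assumes simple, well-formed markup and will misbehave on a ">" inside attribute values, on comments, and on script or style content.

```javascript
// Sketch only: assumes simple markup without ">" in attributes, comments or scripts.
function stripTextBetweenTags(html) {
    return html.replace(/>([^<]*)</g, function (match, text) {
        // Keep pure whitespace so the layout is preserved; drop any other text.
        return ">" + (/^\s*$/.test(text) ? text : "") + "<";
    });
}

var input = "<div>\n   <h1>Title</h1>\n</div>";
var output = stripTextBetweenTags(input);
// output is "<div>\n   <h1></h1>\n</div>"

var mixed = stripTextBetweenTags("<p>Hello <b>x</b> world</p>");
// mixed is "<p><b></b></p>"
```

Text before the first tag or after the last one is not touched, because the pattern only matches between a closing ">" and an opening "<".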

solution1
6 2014-01-06 18:20:53

solution2
4 2014-01-06 17:53:03

solution3
3 2014-01-06 17:43:14

solution4
2 2014-01-06 17:56:50

solution5
0 2014-01-06 17:40:54

solution6
0 2021-01-12 11:36:32
