
Remove HTML tags and entities from string coming from server

In an app I receive some HTML text: since the app can't display (interpret) HTML, I need to remove any HTML tag and entity from the string I receive from the server.

I tried the following, which removes HTML tags but not entities (e.g. &nbsp;):

stringFromServer.replace(/(<([^>]+)>)/ig,"");

Any help is appreciated.

Disclaimer: I need a pure JavaScript solution (no jQuery, Underscore, etc.).

[UPDATE] I'm reading all your answers now and I forgot to mention that I'm using JavaScript BUT the environment is not a web page, so I have no DOM.

You can try something like this:

var placeholder = document.createElement('div');
placeholder.innerHTML = stringFromServer;

var theText = placeholder.innerText;

.innerText only grabs text content from the element.

However, since it appears you don't have access to any DOM manipulation at all, you're probably going to have to use some kind of HTML parser, like these:
https://www.npmjs.org/package/htmlparser
http://ejohn.org/blog/pure-javascript-html-parser/
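
To show why a parser-style approach is more robust than the regex from the question: that regex stops at the first > it sees, even inside a quoted attribute value. Below is a minimal sketch of a character scanner that tracks quotes and needs no DOM. The name stripTags is hypothetical, and this is not a full HTML parser: it assumes every < starts a tag (so a bare < in text swallows what follows) and it does not treat comments or script blocks specially.

```javascript
// Hypothetical helper: removes tags by scanning characters (no DOM, no regex).
// Assumption: every "<" opens a tag and every tag is closed by an unquoted ">".
function stripTags(html) {
  var out = "";
  var inTag = false; // true while we are between "<" and its closing ">"
  var quote = null;  // quote character of the attribute value we are inside, if any

  for (var i = 0; i < html.length; i++) {
    var ch = html.charAt(i);
    if (!inTag) {
      if (ch === "<") {
        inTag = true;    // tag starts, stop copying
      } else {
        out += ch;       // plain text, copy it
      }
    } else if (quote) {
      if (ch === quote) {
        quote = null;    // attribute value ends
      }
    } else if (ch === '"' || ch === "'") {
      quote = ch;        // attribute value starts, ignore ">" until it ends
    } else if (ch === ">") {
      inTag = false;     // tag ends, resume copying
    }
  }
  return out;
}
```

For example, stripTags('<a title="x > y">link</a> text') gives "link text", whereas the regex from the question would leave part of the attribute behind. Entities are left untouched by this function and have to be handled separately.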

A solution without using regexes or phantom divs can be found on Mozilla's MDN.

I put the code in a JSFiddle here:

var sMyString = "<a id=\"a\"><b id=\"b\">hey!<\/b><\/a>";
var oParser = new DOMParser();
var oDOM = oParser.parseFromString(sMyString, "text/xml");
// show the text content of the root element, or an error message
// (note: "text/xml" requires well-formed markup, so real-world HTML may fail to parse)
alert(oDOM.documentElement.nodeName == "parsererror" ?
       "error while parsing" : oDOM.documentElement.textContent);

Alternatively, parse the HTML snippet in a new document and do your DOM manipulations on that (if you'd rather keep it separate from the current document):

var tmpDoc=document.implementation.createHTMLDocument("");
tmpDoc.body.innerHTML="<a href='#'>some text</a><p style=''> more text</p>";
tmpDoc.body.textContent;

tmpDoc.body.textContent evaluates to:

some text more text
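
Since you have no DOM, another option is to extend your regex so that it also removes entities:

stringFromServer.replace(/(<([^>]+)>|&[^;]+;)/ig, "")

Be aware that &[^;]+; matches everything from an & up to the next ; so a bare & in the text followed by a later ; will take the text in between with it.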
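
The regex above deletes entities outright, so "Tom &amp; Jerry" loses its ampersand. If you would rather decode them, here is a minimal sketch that still needs no DOM. The names stripHtml and NAMED_ENTITIES are hypothetical, the entity table is deliberately tiny (extend it for your data), unknown named entities are simply removed, and String.fromCodePoint is assumed to be available in your environment.

```javascript
// Hypothetical lookup table: only a handful of named entities are covered.
var NAMED_ENTITIES = {
  amp: "&",
  lt: "<",
  gt: ">",
  quot: '"',
  apos: "'",
  nbsp: "\u00a0" // non-breaking space; map it to " " if you prefer a plain space
};

// Hypothetical helper: strips tags, then decodes entities, without a DOM.
function stripHtml(input) {
  return input
    // 1. Remove tags first, so that a decoded "&lt;b&gt;" stays as literal text.
    .replace(/<[^>]+>/g, "")
    // 2. Decode numeric (&#33; / &#x21;) and known named (&amp;) entities.
    .replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z][a-z0-9]*);/gi, function (match, body) {
      if (body.charAt(0) === "#") {
        var isHex = body.charAt(1).toLowerCase() === "x";
        var code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
        // Drop values outside the Unicode range instead of throwing.
        return code <= 0x10ffff ? String.fromCodePoint(code) : "";
      }
      // Unknown named entities are dropped, like the regex above does.
      return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
        ? NAMED_ENTITIES[body]
        : "";
    });
}
```

Usage would be something like var theText = stripHtml(stringFromServer);. The tag-stripping step has the same limits as the original regex, so treat this as a starting point rather than a sanitizer.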
