
Removing HTML from a string

I have a table (a Wijmo Grid). The column Log takes some text.

The user is allowed to write HTML in the text, because the same text is also sent by mail, where the HTML makes it look well styled.

Let's say the text is:

var text = "Hello friend <br> How are you? <h1> from me </h1>";

Is there a method, such as JSON.stringify() or something like HTML.encode(), that I can or should use to get:

var textWithoutHtml = magic(text); // "Hello friend How are you? from me"

One of the problems is that if the text includes "<br>", it breaks onto a new line inside the table row, and the top half of the second line is visible in the row, which doesn't look good.

var text = "Hello friend <br> How are you? <h1> from me </h1>";
var newText = text.replace(/(<([^>]+)>)/ig, "");

fiddle: http://jsfiddle.net/EfRs6/
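Note that the replace above leaves the spaces that surrounded each tag, so the result contains double spaces and a trailing space. A minimal sketch that also normalizes whitespace follows; stripTags is a hypothetical helper name, not a library function, and the regex assumes simple markup (it will not cope with a ">" inside an attribute value, nor with comments or script content).

```javascript
// Hypothetical helper: strip tags with a regex, then tidy the whitespace.
// Assumes simple, well-behaved markup; it is not a full HTML parser.
function stripTags(html) {
  return String(html)
    .replace(/<[^>]+>/g, " ") // replace each tag with a space so words do not fuse
    .replace(/\s+/g, " ")     // collapse runs of whitespace into one space
    .trim();                  // drop leading and trailing whitespace
}

var text = "Hello friend <br> How are you? <h1> from me </h1>";
var textWithoutHtml = stripTags(text);
console.log(textWithoutHtml); // "Hello friend How are you? from me"
```

Replacing each tag with a space rather than an empty string keeps words apart in input like "one<br>two".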

You can try this:

string s = Regex.Replace("Hello friend <br> How are you? <h1> from me </h1>", @"<[^>]+>|&nbsp;", "").Trim();

You can also check the HTML Agility Pack

This is an agile HTML parser that builds a read/write DOM and supports plain XPATH or XSLT (you actually don't HAVE to understand XPATH nor XSLT to use it, don't worry...). It is a .NET code library that allows you to parse "out of the web" HTML files. The parser is very tolerant of "real world" malformed HTML. The object model is very similar to the one System.Xml offers, but for HTML documents (or streams).

/<[^>]+>|&nbsp;/
1st Alternative: <[^>]+>
    < matches the character < literally
    [^>]+ matches a single character not present in the list below
        Quantifier +: between one and unlimited times, as many times as possible, giving back as needed [greedy]
        > a single character in the list: > literally (case sensitive)
    > matches the character > literally
2nd Alternative: &nbsp;
    &nbsp; matches the characters &nbsp; literally (case sensitive)
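To see the two alternatives from the breakdown above in action, here is a small JavaScript sketch that lists what the pattern matches. The sample string is made up for illustration, and the g flag is added so every match is returned.

```javascript
// Same pattern as in the breakdown, with the global flag to find all matches.
var tagOrNbsp = /<[^>]+>|&nbsp;/g;

// Illustrative input: tags, a non-breaking-space entity,
// a bare "<" that is not a tag, and another entity.
var sample = "Hello&nbsp;friend <br> How are you? <h1> from me </h1> 1 < 2 &amp; done";

var matches = sample.match(tagOrNbsp);
console.log(matches); // ["&nbsp;", "<br>", "<h1>", "</h1>"]

// Removing the matches leaves other entities and the bare "<" untouched.
var stripped = sample.replace(tagOrNbsp, "");
console.log(stripped);
```

The bare "<" is left alone because no ">" follows it, and "&amp;" is left alone because only "&nbsp;" is listed. Also note that deleting "&nbsp;" outright joins "Hello" and "friend" into one word; replacing it with a space may be the better choice.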

As far as I understand your question, you can encode the values like this in C#:

string encodedValue = HttpUtility.HtmlEncode(txtInput.Text);

Note: here txtInput is the ID of the TextBox on your page.
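If the encoding has to happen in the browser instead, a rough JavaScript counterpart could look like the sketch below. escapeHtml is a hypothetical helper, not a built-in, and it only covers the five characters that matter most in HTML; it is not guaranteed to produce exactly the same output as HttpUtility.HtmlEncode.

```javascript
// Hypothetical helper: escape the characters that are special in HTML,
// so the markup is shown as text instead of being rendered.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")  // must run first, or later entities get double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

var encoded = escapeHtml("Hello friend <br> How are you? <h1> from me </h1>");
console.log(encoded);
// "Hello friend &lt;br&gt; How are you? &lt;h1&gt; from me &lt;/h1&gt;"
```

Unlike stripping, this keeps the tags visible as literal text in the grid cell, so it only fits if showing the raw markup is acceptable.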


 