i have a string with an html code. i want to remove all html tags. so all characters between < and >.
This is my code snipped:
WebClient wClient = new WebClient();
SourceCode = wClient.DownloadString( txtSourceURL.Text );
txtSourceCode.Text = SourceCode;
//remove here all between "<" and ">"
txtSourceCodeFormatted.Text = SourceCode;
hope somebody can help me
Try this:
txtSourceCodeFormatted.Text = Regex.Replace(SourceCode, "<.*?>", string.Empty);
But, as others have mentioned, handle with care .
According to Ravi's answer , you can use
string noHTML = Regex.Replace(inputHTML, @"<[^>]+>| ", "").Trim();
or
string noHTMLNormalised = Regex.Replace(noHTML, @"\s{2,}", " ");
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.