
Remove specific HTML tags and non-ASCII characters

How can I remove <table>, <tr>, and <td> HTML tags, plus non-ASCII characters, from a string using C#?

I want to leave the other tags in the string alone.

A simple Google search: http://en.csharp-online.net/Strip_all_HTML_tags

Depending on why you want to do this, I'd recommend against trying. There are many pitfalls, even with Regex.
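If stripping is still required, here is a minimal sketch of the Regex approach, with the caveat above in mind. The class name `HtmlCleaner` and method name `Clean` are made up for illustration. It assumes reasonably well-formed markup: an attribute value containing `>`, or tag-like text inside comments or scripts, will not be handled correctly, which is exactly the kind of pitfall mentioned.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the original thread.
public static class HtmlCleaner
{
    // Opening or closing <table>, <tr>, <td> tags, with optional attributes.
    // \b keeps tags such as <track> or <tdata> from matching.
    private static readonly Regex TableTags = new Regex(
        @"</?(?:table|tr|td)\b[^>]*>",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    // Any UTF-16 code unit outside the ASCII range U+0000..U+007F.
    private static readonly Regex NonAscii = new Regex(@"[^\u0000-\u007F]");

    public static string Clean(string input)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));

        // Remove only the three table-related tags; their inner content stays.
        string withoutTags = TableTags.Replace(input, string.Empty);

        // Then drop every non-ASCII character.
        return NonAscii.Replace(withoutTags, string.Empty);
    }
}
```

For example, `HtmlCleaner.Clean("<table><tr><td>Caf\u00E9 <b>bold</b></td></tr></table>")` returns `Caf <b>bold</b>`: the three table tags and the `é` are removed, while `<b>` is left alone.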

Personally I'd recommend encoding the input, rather than trying to strip stuff out of it.
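As a rough sketch of the encoding approach, the .NET method `System.Net.WebUtility.HtmlEncode` can be applied to the whole string. The wrapper class `InputEncoder` and method `Encode` are made-up names for illustration, and this assumes the goal is to display the input safely rather than to keep any of its tags working as markup.

```csharp
using System;
using System.Net;

// Hypothetical wrapper, not from the original thread.
public static class InputEncoder
{
    public static string Encode(string input)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));

        // Replaces markup characters such as <, >, & and " with HTML entities,
        // so the browser renders the text instead of interpreting it as tags.
        return WebUtility.HtmlEncode(input);
    }
}
```

Calling `InputEncoder.Encode("<td>5 < 6</td>")` yields a string with no literal `<` or `>` left in it. The trade-off is that every tag is neutralized, including the ones the question wants to keep.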
