Problems with HTML content in generated PDF

I am generating a PDF from HTML, but instead of the markup being interpreted as normal text, my PDF pages are filled with HTML tags such as <p>, <li>, etc.

You'll need to strip all the tags and unescape the HTML entities before writing the text to the PDF.

PHP example:

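// Strip every tag; preg_replace() takes (pattern, replacement, subject)
$text = preg_replace('/<[^>]*>/', '', $html);
// Turn entities such as &quot; back into plain characters
$text = html_entity_decode($text);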

VB.NET example:

Imports System.Text.RegularExpressions

Dim text As String = Regex.Replace(html, "<[^>]*>", "")
text = System.Net.WebUtility.HtmlDecode(text)

Java example:

text = html.replaceAll("<[^>]*>", "");

For the HTML entity decoding you'll find a good answer in the question "How to unescape HTML character entities in Java?". Otherwise, if you know every entity that can occur in your input (&nbsp;, &quot;, ...), you could just replace each of them yourself.
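The "just replace them" route could look roughly like the sketch below. The class name HtmlToText, the method toPlainText and the five-entry entity table are made up for illustration and are not part of any library; the table assumes these are the only entities in your input, and it maps &nbsp; to an ordinary space rather than to U+00A0.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class HtmlToText {
    // Hypothetical, deliberately small entity table; extend it for your own input.
    // LinkedHashMap keeps insertion order, which matters for &amp; (see below).
    private static final Map<String, String> ENTITIES = new LinkedHashMap<>();
    static {
        ENTITIES.put("&nbsp;", " ");   // assumption: a plain space is good enough
        ENTITIES.put("&quot;", "\"");
        ENTITIES.put("&lt;", "<");
        ENTITIES.put("&gt;", ">");
        // &amp; goes last so that "&amp;lt;" ends up as "&lt;" and not "<"
        ENTITIES.put("&amp;", "&");
    }

    static String toPlainText(String html) {
        // Same tag-stripping regex as in the one-liner above
        String text = html.replaceAll("<[^>]*>", "");
        // Literal (non-regex) replacement of each known entity, in table order
        for (Map.Entry<String, String> e : ENTITIES.entrySet()) {
            text = text.replace(e.getKey(), e.getValue());
        }
        return text;
    }

    public static void main(String[] args) {
        String html = "<p>Tom &amp; Jerry</p><li>&quot;one&quot;&nbsp;item</li>";
        System.out.println(toPlainText(html));
    }
}
```

Note that stripping tags this way also removes the paragraph and list boundaries, so the text of adjacent elements runs together. Any entity missing from the table is left in the output untouched, which is why a real decoder is the safer choice when the input is not under your control.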
