Content from HTML missing in PDF created by ITextRenderer

I am trying to create a PDF from an HTML file that contains Chinese characters, and I have run into a strange problem: the line of the HTML that contains the Chinese characters is not shown completely in the generated PDF.

Below is my HTML:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1?DTD/transitional.dtd">
<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>some title.</title>

<style type="text/css">
     .name
   {
         font-family: "Arial Unicode MS";
         color:red;
         margin-left: 5px;
         margin-right: 5px
     }
</style>
</head>
<body>
 <b class="name">

LLTRN,DEBIT,,,6841,FXW,,CNY,PAY,C,,,,DD,,ord par nm,,,,,,,CN,百威英博雪津(三明)啤酒有限公司,,,,,,,CN,20140617,,CNY,647438.24,OUR,,,,,,,,SHANGHAI,CN,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

    <br>

RDF,FTX,TEXT
<br>
</b>
<br>
</body></html>

Below is my ITextRenderer code:

// Clean up the input HTML into XHTML with JTidy.
StringWriter writer = new StringWriter();
Tidy tidy = new Tidy();
tidy.setTidyMark(false);
tidy.setDocType("omit");
tidy.setXHTML(true);
tidy.setInputEncoding("utf-8");
tidy.setOutputEncoding("utf-8");
tidy.parse(new StringReader(inputFileString), writer);
writer.close();
String pdfContent = writer.toString();

// Create the iText renderer that generates the PDF from the XHTML document.
ITextRenderer renderer = new ITextRenderer();

// Register a font that contains Chinese glyphs.
ITextFontResolver resolver = renderer.getFontResolver();
resolver.addFont("C:\\Windows\\Fonts\\arialuni.ttf", BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);

// Lay out the document and write the PDF to the output stream.
renderer.setDocumentFromString(pdfContent);
renderer.layout();
renderer.createPDF(os);

Because I register the font through the font resolver, the Chinese characters are displayed, but the PDF is missing content. The end of that line (the "AI" of "SHANGHAI" and the following ",CN,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,") is not visible. It looks something like this:

[Screenshot: html2pdf, content missing]

I have tried a lot to work out what is wrong but could not find a solution. Can anybody help me resolve this issue? Thanks in advance!

The issue is that Flying Saucer does not handle line wrapping in Chinese text. It only inserts line breaks at whitespace. In your case, that means it cannot insert a line break after "nm,,,,", and the rest of the text does not fit on the line.

It is a known bug in Flying Saucer (see here), but it is unlikely to be fixed soon.

The only workaround is to insert a whitespace character somewhere in your string after the Chinese characters. That makes all the text visible.
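The whitespace workaround can be applied in code before the HTML is handed to the renderer. The following is only a minimal sketch: the class `CjkBreakHelper` and its method are hypothetical names, not part of Flying Saucer or iText, and it assumes that adding a visible space after each run of Chinese characters is acceptable for your data.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flying Saucer or iText.
final class CjkBreakHelper {
    // A run of Han (Chinese) characters that is not already followed by whitespace.
    // The possessive quantifier (++) keeps the regex from backtracking into the
    // middle of a run when the run is already followed by whitespace.
    private static final Pattern HAN_RUN = Pattern.compile("(\\p{IsHan}++)(?!\\s)");

    private CjkBreakHelper() {}

    /** Appends a single space after every run of Han characters. */
    static String addBreakAfterHan(String text) {
        return HAN_RUN.matcher(text).replaceAll("$1 ");
    }
}
```

With the code from the question, this would be called on the string passed to `tidy.parse(...)` or on `pdfContent`. Note that the added spaces become part of the visible text, which may matter for comma-separated data like the line in the question.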

You need to add the font type or font file in your application.

You can find code in this question: "itextSharp - html to pdf some turkish characters are missing".

That question is the same as yours.

If this helps, please give points.

I tried adding the CSS rules below to the body rule and it worked perfectly.

word-wrap: break-word; word-break: break-all;
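If the HTML is generated elsewhere and cannot be edited by hand, the two declarations could be injected just before rendering. This is a rough sketch under assumptions: `WrapStyleInjector` is a hypothetical helper, it expects the document to contain a closing `</head>` tag, and whether your Flying Saucer version honors these properties is not confirmed here beyond the report above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flying Saucer or iText.
final class WrapStyleInjector {
    // The two declarations from the answer above, applied to the whole body.
    static final String WRAP_STYLE =
        "<style type=\"text/css\">body { word-wrap: break-word; word-break: break-all; }</style>";

    private static final Pattern HEAD_CLOSE =
        Pattern.compile("</head>", Pattern.CASE_INSENSITIVE);

    private WrapStyleInjector() {}

    /** Inserts WRAP_STYLE before the first closing head tag; returns the input unchanged if there is none. */
    static String inject(String html) {
        Matcher m = HEAD_CLOSE.matcher(html);
        if (!m.find()) {
            return html;
        }
        return html.substring(0, m.start()) + WRAP_STYLE + html.substring(m.start());
    }
}
```

With the code from the question, the call would look like `renderer.setDocumentFromString(WrapStyleInjector.inject(pdfContent));`.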

"Adding whitespace" works sometimes (I tried adding spaces after symbols like 。 or 、), but when there are no such symbols the text still overflows.
