Create PDF Table from HTML String with UTF-8 encoding

I want to create a PDF table from an HTML string. I can create the table, but instead of the text I get question marks. Here is my code:

public class ExportReportsToPdf implements StreamSource {
private static final long serialVersionUID = 1L;

private ByteArrayOutputStream byteArrayOutputStream;

public static final String FILE_LOC = "C:/Users/KiKo/CasesWorkspace/case/Export.pdf";

private static final String CSS = ""
        + "table {text-align:center; margin-top:20px; border-collapse:collapse; border-spacing:0; border-width:1px;}"
        + "th {font-size:14px; font-weight:normal; padding:10px; border-style:solid; overflow:hidden; word-break:normal;}"
        + "td {padding:10px; border-style:solid; overflow:hidden; word-break:normal;}"
        + "table-header {font-weight:bold; background-color:#EAEAEA; color:#000000;}";

public void createReportPdf(String tableHtml, Integer type) throws IOException, DocumentException {

    // step 1
    Document document = new Document(PageSize.A4, 20, 20, 50, 20);

    // step 2
    PdfWriter.getInstance(document, new FileOutputStream(FILE_LOC));

    // step 3
    byteArrayOutputStream = new ByteArrayOutputStream();
    PdfWriter writer = PdfWriter.getInstance(document, byteArrayOutputStream);
    if (type != null) {
        writer.setPageEvent(new Watermark());
    }

    // step 4
    document.open();

    // step 5
    document.add(getTable(tableHtml));

    // step 6
    document.close();
}

private PdfPTable getTable(String tableHtml) throws IOException {

    // CSS
    CSSResolver cssResolver = new StyleAttrCSSResolver();
    CssFile cssFile = XMLWorkerHelper.getCSS(new ByteArrayInputStream(CSS.getBytes()));
    cssResolver.addCss(cssFile);

    // HTML
    HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
    htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());

    // Pipelines
    ElementList elements = new ElementList();
    ElementHandlerPipeline pdf = new ElementHandlerPipeline(elements, null);
    HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
    CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);

    // XML Worker
    XMLWorker worker = new XMLWorker(css, true);
    XMLParser parser = new XMLParser(worker);

    InputStream inputStream = new ByteArrayInputStream(tableHtml.getBytes());
    parser.parse(inputStream);

    return (PdfPTable) elements.get(0);
}

private static class Watermark extends PdfPageEventHelper {

    @Override
    public void onEndPage(PdfWriter writer, Document document) {
        try {
            URL url = Thread.currentThread().getContextClassLoader().getResource("/images/memotemp.jpg");
            Image background = Image.getInstance(url);
            float width = document.getPageSize().getWidth();
            float height = document.getPageSize().getHeight();
            writer.getDirectContentUnder().addImage(background, width, 0, 0, height, 0, 0);
        } catch (DocumentException | IOException e) {
            e.printStackTrace();
        }
    }
}

@Override
public InputStream getStream() {
    return new ByteArrayInputStream(byteArrayOutputStream.toByteArray());
}

}

This code works, and I get this: [image: not good]

I've tried specifying UTF-8,

InputStream inputStream = new ByteArrayInputStream(tableHtml.getBytes("UTF-8"));

but then I get this: [image: not good (UTF-8)]

I want to get something like this:

[image: good]

I think the problem is with the encoding, but I don't know how to fix it. Any suggestions?

To get bytes from a (Unicode) String in a particular encoding, specify that encoding explicitly; otherwise the platform default encoding is used.

tableHtml.getBytes(StandardCharsets.UTF_8)

In your case, however, "Windows-1251" seems a better match, as the PDF does not seem to use UTF-8.
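As a rough illustration of why the charset argument matters, here is a minimal sketch using only the JDK. The class name EncodingDemo, the helper roundTrip and the Cyrillic sample text are assumptions made for this example; none of them come from the question. It shows that String.getBytes(Charset) silently substitutes '?' for every character the target charset cannot represent, which is one common source of question marks in output.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {

    // Encodes the text with the given charset, then decodes the bytes
    // back with the same charset. Characters the charset cannot
    // represent are replaced during encoding and come back as '?'.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Hypothetical sample: six Cyrillic letters, written as Unicode
        // escapes so the source file encoding does not matter.
        String sample = "\u041f\u0440\u0438\u0432\u0435\u0442";

        // UTF-8 can represent every Unicode character: text survives.
        System.out.println(sample.equals(roundTrip(sample, StandardCharsets.UTF_8)));

        // US-ASCII and ISO-8859-1 have no Cyrillic letters: each one
        // is replaced by '?'.
        System.out.println(roundTrip(sample, StandardCharsets.US_ASCII));
        System.out.println(roundTrip(sample, StandardCharsets.ISO_8859_1));
    }
}
```

If the bytes are encoded correctly and the PDF still shows question marks or blanks, the encoding step is probably not the only problem, and the font used for rendering is worth checking as well.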

Maybe the original tableHtml String was read with the wrong encoding. That is worth checking if it came from a file or a database.
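One way to check the reading side is to decode the same bytes with different charsets and compare the results. The sketch below uses only the JDK; the class name ReadWithCharset, the helper readAll and the sample HTML fragment are hypothetical, and a ByteArrayInputStream stands in for the file or database stream the string might really come from.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ReadWithCharset {

    // Reads the whole stream as text, decoding its bytes with the
    // charset passed in rather than the platform default.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical table cell containing Cyrillic text.
        String original = "<td>\u041f\u0440\u0438\u0432\u0435\u0442</td>";
        byte[] stored = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the charset the bytes were written in restores the text.
        String right = readAll(new ByteArrayInputStream(stored), StandardCharsets.UTF_8);
        // Decoding with a different charset yields mojibake instead.
        String wrong = readAll(new ByteArrayInputStream(stored), StandardCharsets.ISO_8859_1);

        System.out.println(right.equals(original)); // prints "true"
        System.out.println(wrong.equals(original)); // prints "false"
    }
}
```

If the string is already damaged at this stage, no later getBytes call can repair it, so the charset has to be fixed where the data is first read.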

You need to tell iText what encoding to use by creating an instance of the BaseFont class. Then, where you call document.add(getTable(tableHtml)); you can apply that font. See the example at http://itextpdf.com/examples/iia.php?id=199 .

I can't tell how you create the table, but the class PdfPTable has a method addCell(PdfPCell), and one constructor of PdfPCell takes a Phrase. The Phrase can be constructed with a String and a Font. The Font class takes a BaseFont as a constructor argument.

If you look around the Javadoc for iText you will see various classes take a Font as a constructor argument.

