
Convert a document to PDF using "Microsoft Print to PDF" and Java

I am currently testing the conversion of RTF/DOC documents to PDF on a Microsoft Windows host. I have a working piece of code that uses the Microsoft Word API, but because of the license costs I would like to get rid of it.

I had the idea that it might be possible to convert RTFs to PDF by just "sending" them to the Microsoft Print To PDF printer.

The problem is that I can access the printer and I do get an output file, but the file is corrupted.

If I rename the generated file from .pdf back to .rtf and open it in Microsoft Word, the content looks something like this (just an excerpt of the whole content):

\rtf1\adeflang1025\ansi\ansicpg1252\uc1\adeff1\deff0\stshfdbch0\stshfloch0\stshfhich0\stshfbi0\deflang1031\deflangfe1031\themelang1031\themelangfe0\themelangcs0{\fonttbl{\f0\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman{\*\falt Courier New};}
{\f1\fbidi \fswiss\fcharset0\fprq2{\*\panose 020b0604020202020204}Arial{\*\falt Arial};}{\f2\fbidi \fmodern\fcharset0\fprq1{\*\panose 02070309020205020404}Courier New{\*\falt ?l?r ???fc};}
{\f3\fbidi \froman\fcharset2\fprq2{\*\panose 05050102010706020507}Symbol{\*\falt Symbol};}{\f10\fbidi \fnil\fcharset2\fprq2{\*\panose 05000000000000000000}Wingdings;}{\f34\fbidi \froman\fcharset0\fprq2{\*\panose 02040503050406030204}Cambria Math;}
{\f38\fbidi \fswiss\fcharset0\fprq2{\*\panose 020b0604030504040204}Tahoma;}{\f39\fbidi \fswiss\fcharset0\fprq2{\*\panose 00000000000000000000}Arial Black;}
{\flomajor\f31500\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman{\*\falt Courier New};}{\fdbmajor\f31501\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman{\*\falt Courier New};}
{\fhimajor\f31502\fbidi \froman\fcharset0\fprq2{\*\panose 02040503050406030204}Cambria;}{\fbimajor\f31503\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman{\*\falt Courier New};}
{\flominor\f31504\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman{\*\falt Courier New};}{\fdbminor\f31505\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman{\*\falt Courier New};}
{\fhiminor\f31506\fbidi \fswiss\fcharset0\fprq2{\*\panose 020f0502020204030204}Calibri;}{\fbiminor\f31507\fbidi \froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman{\*\falt Courier New};}
{\f40\fbidi \froman\fcharset238\fprq2 Times New Roman CE{\*\falt Courier New};}{\f41\fbidi \froman\fcharset204\fprq2 Times New Roman Cyr{\*\falt Courier New};}{\f43\fbidi \froman\fcharset161\fprq2 Times New Roman Greek{\*\falt Courier New};}
{\f44\fbidi \froman\fcharset162\fprq2 Times New Roman Tur{\*\falt Courier New};}{\f45\fbidi \froman\fcharset177\fprq2 Times New Roman (Hebrew){\*\falt Courier New};}{\f46\fbidi \froman\fcharset178\fprq2 Times New Roman (Arabic){\*\falt Courier New};}
{\f47\fbidi \froman\fcharset186\fprq2 Times New Roman Baltic{\*\falt Courier New};}{\f48\fbidi \froman\fcharset163\fprq2 Times New Roman (Vietnamese){\*\falt Courier New};}{\f50\fbidi \fswiss\fcharset238\fprq2 Arial CE{\*\falt Arial};}
{\f51\fbidi \fswiss\fcharset204\fprq2 Arial Cyr{\*\falt Arial};}{\f53\fbidi \fswiss\fcharset161\fprq2 Arial Greek{\*\falt Arial};}{\f54\fbidi \fswiss\fcharset162\fprq2 Arial Tur{\*\falt Arial};}
{\f55\fbidi \fswiss\fcharset177\fprq2 Arial (Hebrew){\*\falt Arial};}{\f56\fbidi \fswiss\fcharset178\fprq2 Arial (Arabic){\*\falt Arial};}{\f57\fbidi \fswiss\fcharset186\fprq2 Arial Baltic{\*\falt Arial};}
{\f58\fbidi \fswiss\fcharset163\fprq2 Arial (Vietnamese){\*\falt Arial};}{\f60\fbidi \fmodern\fcharset238\fprq1 Courier New CE{\*\falt ?l?r ???fc};}{\f61\fbidi \fmodern\fcharset204\fprq1 Courier New Cyr{\*\falt ?l?r ???fc};}
{\f63\fbidi \fmodern\fcharset161\fprq1 Courier New Greek{\*\falt ?l?r ???fc};}{\f64\fbidi \fmodern\fcharset162\fprq1 Courier New Tur{\*\falt ?l?r ???fc};}{\f65\fbidi \fmodern\fcharset177\fprq1 Courier New (Hebrew){\*\falt ?l?r ???fc};}
{\f66\fbidi \fmodern\fcharset178\fprq1 Courier New (Arabic){\*\falt ?l?r ???fc};}{\f67\fbidi \fmodern\fcharset186\fprq1 Courier New Baltic{\*\falt ?l?r ???fc};}{\f68\fbidi \fmodern\fcharset163\fprq1 Courier New (Vietnamese){\*\falt ?l?r ???fc};}
{\f380\fbidi \froman\fcharset238\fprq2 Cambria Math CE;}{\f381\fbidi \froman\fcharset204\fprq2 Cambria Math Cyr;}{\f383\fbidi \froman\fcharset161\fprq2 Cambria Math Greek;}{\f384\fbidi \froman\fcharset162\fprq2 Cambria Math Tur;}
{\f387\fbidi \froman\fcharset186\fprq2 Cambria Math Baltic;}{\f388\fbidi \froman\fcharset163\fprq2 Cambria Math (Vietnamese);}{\f420\fbidi \fswiss\fcharset238\fprq2 Tahoma CE;}{\f421\fbidi \fswiss\fcharset204\fprq2 Tahoma Cyr;}
{\f423\fbidi \fswiss\fcharset161\fprq2 Tahoma Greek;}{\f424\fbidi \fswiss\fcharset162\fprq2 Tahoma Tur;}{\f425\fbidi \fswiss\fcharset177\fprq2 Tahoma (Hebrew);}{\f426\fbidi \fswiss\fcharset178\fprq2 Tahoma (Arabic);}
{\f427\fbidi \fswiss\fcharset186\fprq2 Tahoma Baltic;}{\f428\fbidi \fswiss\fcharset163\fprq2 Tahoma (Vietnamese);}{\f429\fbidi \fswiss\fcharset222\fprq2 Tahoma (Thai);}{\f430\fbidi \fswiss\fcharset238\fprq2 Arial Black CE;}
{\f431\fbidi \fswiss\fcharset204\fprq2 Arial Black Cyr;}{\f433\fbidi \fswiss\fcharset161\fprq2 Arial Black Greek;}{\f434\fbidi \fswiss\fcharset162\fprq2 Arial Black Tur;}{\f437\fbidi \fswiss\fcharset186\fprq2 Arial Black Baltic;}
{\flomajor\f31508\fbidi \froman\fcharset238\fprq2 Times New Roman CE{\*\falt Courier New};}{\flomajor\f31509\fbidi \froman\fcharset204\fprq2 Times New Roman Cyr{\*\falt Courier New};}
{\flomajor\f31511\fbidi \froman\fcharset161\fprq2 Times New Roman Greek{\*\falt Courier New};}{\flomajor\f31512\fbidi \froman\fcharset162\fprq2 Times New Roman Tur{\*\falt Courier New};}
{\flomajor\f31513\fbidi \froman\fcharset177\fprq2 Times New Roman (Hebrew){\*\falt Courier New};}{\flomajor\f31514\fbidi \froman\fcharset178\fprq2 Times New Roman (Arabic){\*\falt Courier New};}
{\flomajor\f31515\fbidi \froman\fcharset186\fprq2 Times New Roman Baltic{\*\falt Courier New};}{\flomajor\f31516\fbidi \froman\fcharset163\fprq2 Times New Roman (Vietnamese){\*\falt Courier New};}
{\fdbmajor\f31518\fbidi \froman\fcharset238\fprq2 Times New Roman CE{\*\falt Courier New};}{\fdbmajor\f31519\fbidi \froman\fcharset204\fprq2 Times New Roman Cyr{\*\falt Courier New};}
{\fdbmajor\f31521\fbidi \froman\fcharset161\fprq2 Times New Roman Greek{\*\falt Courier New};}{\fdbmajor\f31522\fbidi \froman\fcharset162\fprq2 Times New Roman Tur{\*\falt Courier New};}
{\fdbmajor\f31523\fbidi \froman\fcharset177\fprq2 Times New Roman (Hebrew){\*\falt Courier New};}{\fdbmajor\f31524\fbidi \froman\fcharset178\fprq2 Times New Roman (Arabic){\*\falt Courier New};}
{\fdbmajor\f31525\fbidi \froman\fcharset186\fprq2 Times New Roman Baltic{\*\falt Courier New};}{\fdbmajor\f31526\fbidi \froman\fcharset163\fprq2 Times New Roman (Vietnamese){\*\falt Courier New};}
{\fhimajor\f31528\fbidi \froman\fcharset238\fprq2 Cambria CE;}{\fhimajor\f31529\fbidi \froman\fcharset204\fprq2 Cambria Cyr;}{\fhimajor\f31531\fbidi \froman\fcharset161\fprq2 Cambria Greek;}{\fhimajor\f31532\fbidi \froman\fcharset162\fprq2 Cambria Tur;}
{\fhimajor\f31535\fbidi \froman\fcharset186\fprq2 Cambria Baltic;}{\fhimajor\f31536\fbidi \froman\fcharset163\fprq2 Cambria (Vietnamese);}{\fbimajor\f31538\fbidi \froman\fcharset238\fprq2 Times New Roman CE{\*\falt Courier New};}

I assume that I am not reading the file correctly, or maybe not writing it correctly; I am not sure. Maybe an attribute is missing. I suspect it is only a small detail that is wrong.
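Since the output opens as RTF after renaming, the bytes in the file appear to have been written unchanged rather than rendered. That would be consistent with DocFlavor.INPUT_STREAM.AUTOSENSE handing the stream to the printer as-is. A quick way to confirm whether a generated file is a PDF at all is to check for the "%PDF-" signature at its start. The following is a minimal sketch of that check; the class and method names (PdfHeaderCheck, looksLikePdf) are made up for illustration, and it assumes Java 11 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class PdfHeaderCheck {

    // Hypothetical helper: returns true if the file starts with the PDF signature "%PDF-".
    // It only inspects the first five bytes and does not validate the rest of the file.
    static boolean looksLikePdf(Path file) throws IOException {
        byte[] magic = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = in.readNBytes(magic.length); // requires Java 11+
            return Arrays.equals(head, magic);
        }
    }

    public static void main(String[] args) throws IOException {
        // The default file name is taken from the question; pass another path as the first argument.
        Path out = Path.of(args.length > 0 ? args[0] : "CreditNoteEnglish.pdf");
        System.out.println(out + " looks like a PDF: " + looksLikePdf(out));
    }
}
```

If this reports false for the printed file, the printer received data it did not convert, and the problem is in how the document is submitted rather than in how the file is read or written.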

I have the following code:

import javax.print.Doc;
import javax.print.DocFlavor;
import javax.print.DocPrintJob;
import javax.print.PrintService;
import javax.print.PrintServiceLookup;
import javax.print.SimpleDoc;
import javax.print.attribute.HashPrintRequestAttributeSet;
import javax.print.attribute.PrintRequestAttributeSet;
import javax.print.attribute.standard.Copies;
import javax.print.attribute.standard.Destination;
import javax.print.event.PrintJobAdapter;
import javax.print.event.PrintJobEvent;
import java.io.File;
import java.io.FileInputStream;
import java.net.URISyntaxException;
import java.util.Arrays;


public class Program {


    public static final String myPath = "C:/Project Files/Template";
    public static final String myFile =  "CreditNoteEnglish.rtf";
    public static final String myFile2 =  "CreditNoteEnglish.pdf";
    public static void  main (String[] args)
    {
        try {
            convertToPDF_PerPrint(myPath, myFile);
        } catch (URISyntaxException e) {
            e.printStackTrace();
        }

    }

    private static void convertToPDF_PerPrint( String Verzeichnis,  String pFileName) throws URISyntaxException {
        final String defaultPrinterName = "Microsoft Print To PDF";
        DocFlavor docType = DocFlavor.INPUT_STREAM.AUTOSENSE;
        PrintRequestAttributeSet printerSettings = new HashPrintRequestAttributeSet();
        PrintService PDFPrinter = null;
        File myFile = new File(Verzeichnis + "/" + pFileName);
        File outFile = new File (Verzeichnis + "/" + myFile2);
       // printerSettings.add(MediaSizeName.ISO_A4);
        printerSettings.add(new Destination(outFile.toURI()));
        printerSettings.add(new Copies(1));
        PrintService[] printServices = PrintServiceLookup.lookupPrintServices(docType,printerSettings);
        try
        {
            if(printServices.length == 0)
            {
                throw new Exception("No printers found for given attributes");
            }
            System.out.println ( "Available printers: " + Arrays.asList ( printServices ) );

            for(PrintService availableService : printServices)
            {
                if(availableService.getName().contains("PDF"))
                {
                    PDFPrinter = availableService;
                    break;
                }
            }
            if (PDFPrinter == null)
            {
                throw new IllegalStateException("Can not find PDF printer.");
            }

            FileInputStream fileAsStream = new FileInputStream(myFile);

            System.out.println ( Verzeichnis + "\\" + pFileName );
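            // Debug output. Note that read() consumes the first byte of the stream,
            // so the print job below receives the file without its leading byte.
            System.out.println ( fileAsStream.read() );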
            DocPrintJob myPrintJob =  PDFPrinter.createPrintJob();
            Doc myConvertableFile = new SimpleDoc(fileAsStream, DocFlavor.INPUT_STREAM.AUTOSENSE,null);
            PrintJobWatcher watcher = new PrintJobWatcher(myPrintJob);
            myPrintJob.print(myConvertableFile, printerSettings);
            watcher.waitForDone();
            fileAsStream.close();
        }
        catch(Exception e)
        {
            System.out.println(e);
        }
    }
}

class PrintJobWatcher {

    boolean done = false;

    PrintJobWatcher(DocPrintJob job) {
        job.addPrintJobListener(new PrintJobAdapter() {
            public void printJobCanceled(PrintJobEvent pje) {
                allDone();
            }

            public void printJobCompleted(PrintJobEvent pje) {
                allDone();
            }

            public void printJobFailed(PrintJobEvent pje) {
                allDone();
            }

            public void printJobNoMoreEvents(PrintJobEvent pje) {
                allDone();
            }

            void allDone() {
                synchronized (PrintJobWatcher.this) {
                    done = true;
                    System.out.println("Printing done ...");
                    PrintJobWatcher.this.notify();
                }
            }
        });
    }

    public synchronized void waitForDone() {
        try {
            while (!done) {
                wait();
            }
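        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe the interruption.
            Thread.currentThread().interrupt();
        }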
    }
}

Does anyone have an idea why the Microsoft Print to PDF printer does not generate a proper PDF when using the code above?

Any hints will be highly appreciated.

Thank you very much.

By far the simplest way to convert MS RTF/Doc/DocX/Odt to PDF with Microsoft Print to PDF, without libraries, is via the already licensed shell. There are limitations: the RTF must be WordPad compatible, that is, it must open without triggering a "some features are not supported" message. Note that tables must have a border width for their lines to be printed visibly, and transparency in images can give odd results, so ideally keep images simple. Background images are rarely supported. Keep the RTF stupidly simple, as if it were written in a plain-text line editor or a batch file. The page will use the current Microsoft Print to PDF default (here it is still using the previous A4 Landscape setting) unless you pre-adjust it with PrintUI.
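From Java, the shell route could be driven with ProcessBuilder. The sketch below is only an illustration under several assumptions: that WordPad is installed and reachable as write.exe, that its /pt switch takes the file, printer name, driver name and port in that order, and that Microsoft Print to PDF accepts the output file name in the port position. The class and method names (WordPadPrintToPdf, buildCommand, convert) are made up for the example, and the printer name should be checked against the one shown in the Windows printer list.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

public class WordPadPrintToPdf {

    // Assumed printer name; verify the exact spelling on the target machine.
    static final String PRINTER = "Microsoft Print to PDF";

    // Builds: write.exe /pt <input> <printer> <driver> <port>
    // Assumption: for this printer the "port" argument can carry the output file name.
    static List<String> buildCommand(Path input, Path output) {
        return List.of(
                "write.exe", "/pt",
                input.toAbsolutePath().toString(),
                PRINTER, PRINTER,
                output.toAbsolutePath().toString());
    }

    // Starts WordPad and waits for it to exit; returns the process exit code.
    static int convert(Path input, Path output) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(input, output))
                .inheritIO()
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Paths taken from the question.
        Path dir = Path.of("C:/Project Files/Template");
        int exit = convert(dir.resolve("CreditNoteEnglish.rtf"),
                           dir.resolve("CreditNoteEnglish.pdf"));
        System.out.println("WordPad exited with code " + exit);
    }
}
```

WordPad is not present on every Windows version, so it is worth confirming that write.exe exists on the host before relying on this, and checking the resulting file rather than trusting the exit code alone.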

