简体   繁体   中英

Docx4j: Delete temporary images when converting to PDF via FOP?

I have a docx file that I want to convert to PDF. I'm converting it via XSL-FO with docx4j-export-fo on linux. Every time I convert a document with images, some images are saved in /tmp folder. I found out that this is because of AbstractConversionImageHandler.java, which will always store the images when using XSL-FO.

I tried setting the 'ImageDirPath' FoSetting, but it is ignored for header/footer images when calculating header/footer area size. It only works for images in the document body.

This setting is ignored in 'FopAreeTreeHelper', which uses FOP with some default settings to calculate header/footer area. So if there is an image, it will be saved into the default "/tmp" directory.

this is my code that converts doc into a pdf:


        private static final String TEMP_IMAGE_DIR_PATH = "/tmp/images";

        public static void convert(WordprocessingMLPackage wordMLPackage, OutputStream output) throws Exception {

        Mapper fontMapper = new BestMatchingMapper();
        wordMLPackage.setFontMapper(fontMapper);

        FOSettings foSettings = new FOSettings(wordMLPackage);
        foSettings.setApacheFopMime("application/pdf");
        foSettings.setImageDirPath(TEMP_IMAGE_DIR_PATH);
        foSettings.setFoDumpFile(null);

        FopFactoryBuilder fopFactoryBuilder = FORendererApacheFOP.getFopFactoryBuilder(foSettings) ;
        FopFactory fopFactory = fopFactoryBuilder.build();

        FOUserAgent foUserAgent = FORendererApacheFOP.getFOUserAgent(foSettings, fopFactory);

        Docx4J.toFO(foSettings, output, Docx4J.FLAG_EXPORT_PREFER_XSL);

        // Clean up, so any ObfuscatedFontPart temp files can be deleted
        if (wordMLPackage.getMainDocumentPart().getFontTablePart()!=null) {
            wordMLPackage.getMainDocumentPart().getFontTablePart().deleteEmbeddedFontTempFiles();
        }
        foSettings = null;
        wordMLPackage = null;

        FileUtils.deleteDirectory(new File(TEMP_IMAGE_DIR_PATH));
    }

Only images from the document body are saved into the 'TEMP_IMAGE_DIR_PATH', which I then delete. But header image is saved in "/tmp" folder. But I don't want to delete ALL images from "/tmp".

Is there any way to have Docx4j or FOP delete these images after the conversion? Or set a default directory?

As per JasonPlutext's comment, this is a bug in the docx4j library. It has been reported as an issue.

The workaround I used is to not put images in header and footer.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM