
Zxing Format Exception in Scanning PDF and converting to Buffered Image to Decode QR

I'm having trouble getting consistently successful QR decoding after converting a PDF to an image. I keep getting:

"Exception in thread "main" com.google.zxing.FormatException."

My first conversion attempt used PDFBox:

public static BufferedImage convertPDFtoBufferedImageType2(String PDFPath) throws IOException{

    PDDocument document = null;
    try {

        document = PDDocument.load(PDFPath);
        PDPage firstPage = (PDPage) document.getDocumentCatalog().getAllPages().get(0);
        return firstPage.convertToImage();

    } catch (IOException ex) {
        Logger.getLogger(PDF_Utility.class.getName()).log(Level.SEVERE, null, ex);
        return null;
    } finally {
        if(document != null)
            document.close();
    }
}

My second attempt used ghost4j:

public static BufferedImage convertPDFtoBufferedImage(String PDFPath) throws IOException, RendererException, DocumentException{

    System.setProperty("jna.library.path", "C:\\Program Files\\gs\\gs9.16\\bin\\");

    PDFDocument document = new PDFDocument();
    document.load(new File(PDFPath));
    SimpleRenderer renderer = new SimpleRenderer();
    renderer.setResolution(300);
    List<Image> imgs = renderer.render(document);
    Image im = imgs.get(0);

    BufferedImage bi = new BufferedImage
        (im.getWidth(null),im.getHeight(null),BufferedImage.TYPE_INT_RGB);
    Graphics bg = bi.getGraphics();
    bg.drawImage(im, 0, 0, null);
    bg.dispose();
    return bi;
}

My QR decoder is:

public static String readQRCode(BufferedImage image, String charset, Map hintMap) 
                throws FileNotFoundException, IOException, NotFoundException, ChecksumException, FormatException {
        Result qrCodeResult = null;
        BinaryBitmap binaryBitmap = new BinaryBitmap(
                        new HybridBinarizer(new BufferedImageLuminanceSource(image)));
        try{
            qrCodeResult = new com.google.zxing.qrcode.QRCodeReader().decode(binaryBitmap,hintMap);
        }catch(NotFoundException | FormatException e){ //attempt without hints
            qrCodeResult = new com.google.zxing.qrcode.QRCodeReader().decode(binaryBitmap);
        }
        return qrCodeResult.getText();
}

The reason I call decode twice is that sometimes the "try harder" hint

hintMap.put(DecodeHintType.TRY_HARDER, Boolean.TRUE);

actually didn't catch the QR code, but the default did. Anyway, these code snippets do catch most of the QR codes in a pile of scanned documents, but there are times when they do not catch one at all. I even attempted to write the image out and then re-read it:

ImageIO.write((RenderedImage) im, "png", new File("/path/to/my/img.png"));
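The same write-then-re-read check can be done in memory, which avoids the temporary file and makes it easy to compare the image before and after encoding. This is only a sketch: `PngRoundTrip` and `roundTrip` are hypothetical names, not part of PDFBox, ghost4j, or ZXing, and it assumes the rendered image is already a `BufferedImage`.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

/** Hypothetical helper: encodes an image as PNG in memory and decodes it again. */
final class PngRoundTrip {

    private PngRoundTrip() {}

    static BufferedImage roundTrip(BufferedImage source) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            // ImageIO.write returns false when no suitable writer is found.
            if (!ImageIO.write(source, "png", out)) {
                throw new IOException("no PNG writer available for this image type");
            }
            // ImageIO.read returns null when no reader can decode the stream.
            BufferedImage result = ImageIO.read(new ByteArrayInputStream(out.toByteArray()));
            if (result == null) {
                throw new IOException("PNG data could not be read back");
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

PNG is lossless, so the pixel values should survive the round trip, although the returned image may use a different `BufferedImage` type than the one passed in.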

Interestingly, http://zxing.org/w/decode.jspx does decode that output image, but my code couldn't. I also tried different charsets: CHAR_SET = "UTF-8"; and CHAR_SET = "ISO-8859-1";

Getting a FormatException means the code was found, but it "did not conform to the barcode's format rules. This could have been due to a mis-detection."

Apologies for the messy code, but these attempts succeed on the majority of scans, roughly 9 out of 10. Interestingly, sometimes another scanned copy of the same document worked. Any help/advice/crazy voodoo combination is appreciated! Thanks!

EDIT: I got a sample (after whiting out the surrounding contents; the real image has contents). The Zxing website was able to catch this QR code too, with and without the contents. (My program already ignored the other 1D barcodes at this same format and those with contents.)

(image: QR code)

@Tilman Hausherr pointed out that PDFBox's default rendering resolution is low, so I changed it to 300 dpi as he suggested. Overall, it worked for my case but definitely slowed things down. I will need to tweak my algorithm to run a fast pass first and this slower one as a backup.

return firstPage.convertToImage(BufferedImage.TYPE_4BYTE_ABGR, 300);
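A minimal sketch of that fast-then-slow idea: try the cheap, low-resolution render first and only pay for the slower renders when decoding fails. `DpiFallback` and `decodeWithFallback` are hypothetical names, and the renderer and decoder are passed in as plain functions so the sketch does not depend on any particular PDFBox or ZXing call.

```java
import java.awt.image.BufferedImage;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;
import java.util.function.IntFunction;

/** Hypothetical helper: tries resolutions in order and stops at the first successful decode. */
final class DpiFallback {

    private DpiFallback() {}

    /**
     * @param dpis     resolutions to try, in order (for example a low one, then 300)
     * @param renderer renders the page at the given DPI; may return null on failure
     * @param decoder  returns the decoded text, or empty if nothing could be decoded
     */
    static Optional<String> decodeWithFallback(
            List<Integer> dpis,
            IntFunction<BufferedImage> renderer,
            Function<BufferedImage, Optional<String>> decoder) {
        for (int dpi : dpis) {
            BufferedImage page = renderer.apply(dpi);
            if (page == null) {
                continue; // rendering failed at this resolution, try the next one
            }
            Optional<String> text = decoder.apply(page);
            if (text.isPresent()) {
                return text; // stop early so the slow renders are skipped when possible
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-ins for illustration only: the "renderer" makes a blank image whose
        // width equals the DPI, and the "decoder" only succeeds on wide images.
        IntFunction<BufferedImage> renderer =
                dpi -> new BufferedImage(dpi, dpi, BufferedImage.TYPE_BYTE_GRAY);
        Function<BufferedImage, Optional<String>> decoder =
                img -> img.getWidth() >= 300
                        ? Optional.of("decoded at width " + img.getWidth())
                        : Optional.<String>empty();
        System.out.println(decodeWithFallback(Arrays.asList(96, 300), renderer, decoder));
    }
}
```

In my case the renderer would wrap `firstPage.convertToImage(BufferedImage.TYPE_4BYTE_ABGR, dpi)` and the decoder would wrap `readQRCode`, turning its exceptions into an empty result. The 96 in the demo is just an arbitrary low value, not a PDFBox default.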

EDIT: This increased the success rate of catching barcodes, but did not successfully catch all of them. Increasing the dpi does not help.
