How to extract font color using PDFBox in Java?

I need to extract the font color of each character. I found the piece of code below on a forum, but executing it throws this error:

Apr 19, 2013 6:23:45 PM org.apache.pdfbox.util.operator.pagedrawer.FillNonZeroRule process
WARNING: java.lang.ClassCastException: org.apache.pdfbox.util.PDFStreamEngine cannot be cast to org.apache.pdfbox.pdfviewer.PageDrawer
java.lang.ClassCastException: org.apache.pdfbox.util.PDFStreamEngine cannot be cast to org.apache.pdfbox.pdfviewer.PageDrawer



PDDocument doc = null;
try {
    doc = PDDocument.load("C:/Path/To/Pdf/Sample.pdf");
    PDFStreamEngine engine = new PDFStreamEngine(ResourceLoader.loadProperties("org/apache/pdfbox/resources/PageDrawer.properties"));
    PDPage page = (PDPage)doc.getDocumentCatalog().getAllPages().get(0);
    engine.processStream(page, page.findResources(), page.getContents().getStream());
    PDGraphicsState graphicState = engine.getGraphicsState();
    System.out.println(graphicState.getStrokingColor().getColorSpace().getName());
    float colorSpaceValues[] = graphicState.getStrokingColor().getColorSpaceValue();
    for (float c : colorSpaceValues) {
        System.out.println(c * 255);
    }
}
finally {
    if (doc != null) {
        doc.close();
    }
}

Can anyone help me out? Thanks.

Have a look at org.apache.pdfbox.pdfviewer.PageDrawer, which contains:

protected void processTextPosition( TextPosition text )
{
    try
    {
        PDGraphicsState graphicsState = getGraphicsState();
        Composite composite;
        Paint paint;
        switch(graphicsState.getTextState().getRenderingMode()) 
        {
            case PDTextState.RENDERING_MODE_FILL_TEXT:
                composite = graphicsState.getNonStrokeJavaComposite();
                paint = graphicsState.getNonStrokingColor().getJavaColor();
                if (paint == null)
                {
                    paint = graphicsState.getNonStrokingColor().getPaint(pageSize.height);
                }
                break;
            case PDTextState.RENDERING_MODE_STROKE_TEXT:
                composite = graphicsState.getStrokeJavaComposite();
                paint = graphicsState.getStrokingColor().getJavaColor();
                if (paint == null)
                {
                    paint = graphicsState.getStrokingColor().getPaint(pageSize.height);
                }
                break;
            case PDTextState.RENDERING_MODE_NEITHER_FILL_NOR_STROKE_TEXT:
                //basic support for text rendering mode "invisible"
                Color nsc = graphicsState.getStrokingColor().getJavaColor();
                float[] components = {Color.black.getRed(),Color.black.getGreen(),Color.black.getBlue()};
                paint = new Color(nsc.getColorSpace(),components,0f);
                composite = graphicsState.getStrokeJavaComposite();
                break;
            default:
                // TODO : need to implement....
                LOG.debug("Unsupported RenderingMode "
                        + this.getGraphicsState().getTextState().getRenderingMode()
                        + " in PageDrawer.processTextPosition()."
                        + " Using RenderingMode "
                        + PDTextState.RENDERING_MODE_FILL_TEXT
                        + " instead");
                composite = graphicsState.getNonStrokeJavaComposite();
                paint = graphicsState.getNonStrokingColor().getJavaColor();
        }
        graphics.setComposite(composite);
        graphics.setPaint(paint);

        PDFont font = text.getFont();
        Matrix textPos = text.getTextPos().copy();
        float x = textPos.getXPosition();
        // the 0,0-reference has to be moved from the lower left (PDF) to the upper left (AWT-graphics)
        float y = pageSize.height - textPos.getYPosition();
        // Set translation to 0,0. We only need the scaling and shearing
        textPos.setValue(2, 0, 0);
        textPos.setValue(2, 1, 0);
        // because of the moved 0,0-reference, we have to shear in the opposite direction
        textPos.setValue(0, 1, (-1)*textPos.getValue(0, 1));
        textPos.setValue(1, 0, (-1)*textPos.getValue(1, 0));
        AffineTransform at = textPos.createAffineTransform();
        PDMatrix fontMatrix = font.getFontMatrix();
        at.scale(fontMatrix.getValue(0, 0) * 1000f, fontMatrix.getValue(1, 1) * 1000f);
        //TODO setClip() is a massive performance hot spot. Investigate optimization possibilities
        graphics.setClip(graphicsState.getCurrentClippingPath());
        // the fontSize is no longer needed as it is already part of the transformation
        // we should remove it from the parameter list in the long run
        font.drawString( text.getCharacter(), text.getCodePoints(), graphics, 1, at, x, y );
    }
    catch( IOException io )
    {
        io.printStackTrace();
    }
}

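This method shows how to extract colours and other attributes: text drawn in fill mode takes its colour from the non-stroking colour of the graphics state, while stroked text takes it from the stroking colour.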
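
To make the colour selection in that switch statement easier to see, here is a minimal stand-alone sketch that does not use PDFBox at all. The class TextColorSketch, the nested GraphicsStateModel and the helper names are hypothetical stand-ins invented for illustration; only the rendering-mode numbers (0 = fill, 1 = stroke, 3 = invisible) come from the PDF specification. It assumes RGB components in the range 0..1, which is what the question's `c * 255` conversion implies.

```java
import java.util.Arrays;

/**
 * Stand-alone model of the colour choice made in PageDrawer.processTextPosition.
 * Nothing here is a PDFBox class; all names are hypothetical stand-ins.
 */
class TextColorSketch {

    // Text rendering modes as numbered in the PDF specification.
    static final int FILL = 0;
    static final int STROKE = 1;
    static final int INVISIBLE = 3;

    /** Hypothetical stand-in for the colour-related part of a graphics state. */
    static final class GraphicsStateModel {
        final int renderingMode;
        final float[] strokingRgb;    // components in 0..1
        final float[] nonStrokingRgb; // components in 0..1

        GraphicsStateModel(int renderingMode, float[] strokingRgb, float[] nonStrokingRgb) {
            this.renderingMode = renderingMode;
            this.strokingRgb = strokingRgb;
            this.nonStrokingRgb = nonStrokingRgb;
        }
    }

    /** Returns the visible text colour as 0..255 integers, or null for invisible text. */
    static int[] visibleTextColor(GraphicsStateModel gs) {
        switch (gs.renderingMode) {
            case STROKE:
                return toRgb255(gs.strokingRgb);
            case INVISIBLE:
                return null; // this sketch reports "no colour" for invisible text
            case FILL:
            default:
                // Other modes fall back to the fill colour, like the default branch above.
                return toRgb255(gs.nonStrokingRgb);
        }
    }

    /** Clamps each component to 0..1 and scales it to 0..255. */
    static int[] toRgb255(float[] components) {
        int[] out = new int[components.length];
        for (int i = 0; i < components.length; i++) {
            float clamped = Math.max(0f, Math.min(1f, components[i]));
            out[i] = Math.round(clamped * 255f);
        }
        return out;
    }

    public static void main(String[] args) {
        // Red filled text with a black stroking colour that is never used for painting.
        GraphicsStateModel gs = new GraphicsStateModel(
                FILL, new float[] {0f, 0f, 0f}, new float[] {1f, 0f, 0f});
        System.out.println(Arrays.toString(visibleTextColor(gs))); // prints [255, 0, 0]
    }
}
```

Note that the code in the question reads getStrokingColor(), so for ordinary filled text it would likely report the stroking colour rather than the colour the glyphs are actually painted with.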
