
How to check that all used fonts are embedded in PDF with Java iText?

How can I check, with Java and iText, that all fonts used in a PDF file are embedded in the file? I have some existing PDF documents, and I'd like to validate that they use only embedded fonts.

This would require checking that no PDF standard fonts are used and that all other used fonts are embedded in the file.
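As a rough illustration of the first half of that check, the sketch below tests a BaseFont name against the 14 standard PDF font names. The class `Standard14Fonts` and its method are hypothetical helpers, not part of iText. A font with one of these names may still be embedded, so a match only flags a candidate that should be confirmed against the font descriptor.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper (not an iText class): recognises the 14 standard PDF font names. */
final class Standard14Fonts {

    private static final Set<String> NAMES = new HashSet<>(Arrays.asList(
            "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
            "Helvetica", "Helvetica-Bold", "Helvetica-Oblique", "Helvetica-BoldOblique",
            "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
            "Symbol", "ZapfDingbats"));

    private Standard14Fonts() { }

    /**
     * @param baseFont a BaseFont value, with or without the leading '/'
     *                 that PdfName.toString() produces
     * @return true if the name is exactly one of the 14 standard font names
     */
    static boolean isStandard14(String baseFont) {
        if (baseFont == null)
            return false;
        String name = baseFont.startsWith("/") ? baseFont.substring(1) : baseFont;
        return NAMES.contains(name);
    }

    public static void main(String[] args) {
        System.out.println(isStandard14("/Helvetica-Bold"));   // true
        System.out.println(isStandard14("/ABCDEF+Helvetica")); // false: subset-tagged name
    }
}
```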

Look at the ListUsedFonts example from iText in Action:

http://itextpdf.com/examples/iia.php?id=287

It prints the fonts used in a PDF and whether they are embedded.

/*
 * This class is part of the book "iText in Action - 2nd Edition"
 * written by Bruno Lowagie (ISBN: 9781935182610)
 * For more info, go to: http://itextpdf.com/examples/
 * This example only works with the AGPL version of iText.
 */

package part4.chapter16;

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Set;
import java.util.TreeSet;

import part3.chapter11.FontTypes;

import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.PdfDictionary;
import com.itextpdf.text.pdf.PdfName;
import com.itextpdf.text.pdf.PdfReader;

public class ListUsedFonts {

    /** The resulting PDF file. */
    public static String RESULT
        = "results/part4/chapter16/fonts.txt";

    /**
     * Creates a Set containing information about the fonts in the src PDF file.
     * @param src the path to a PDF file
     * @throws IOException
     */
    public Set<String> listFonts(String src) throws IOException {
        Set<String> set = new TreeSet<String>();
        PdfReader reader = new PdfReader(src);
        PdfDictionary resources;
        for (int k = 1; k <= reader.getNumberOfPages(); ++k) {
            resources = reader.getPageN(k).getAsDict(PdfName.RESOURCES);
            processResource(set, resources);
        }
        reader.close();
        return set;
    }

    /**
     * Extracts the font names from page or XObject resources.
     * @param set the set with the font names
     * @param resources the resources dictionary
     */
    public static void processResource(Set<String> set, PdfDictionary resource) {
        if (resource == null)
            return;
        PdfDictionary xobjects = resource.getAsDict(PdfName.XOBJECT);
        if (xobjects != null) {
            for (PdfName key : xobjects.getKeys()) {
                processResource(set, xobjects.getAsDict(key));
            }
        }
        PdfDictionary fonts = resource.getAsDict(PdfName.FONT);
        if (fonts == null)
            return;
        PdfDictionary font;
        for (PdfName key : fonts.getKeys()) {
            font = fonts.getAsDict(key);
            String name = font.getAsName(PdfName.BASEFONT).toString();
            if (name.length() > 8 && name.charAt(7) == '+') {
                name = String.format("%s subset (%s)", name.substring(8), name.substring(1, 7));
            }
            else {
                name = name.substring(1);
                PdfDictionary desc = font.getAsDict(PdfName.FONTDESCRIPTOR);
                if (desc == null)
                    name += " nofontdescriptor";
                else if (desc.get(PdfName.FONTFILE) != null)
                    name += " (Type 1) embedded";
                else if (desc.get(PdfName.FONTFILE2) != null)
                    name += " (TrueType) embedded";
                else if (desc.get(PdfName.FONTFILE3) != null)
                    name += " (" + font.getAsName(PdfName.SUBTYPE).toString().substring(1) + ") embedded";
            }
            set.add(name);
        }
    }

    /**
     * Main method.
     *
     * @param    args    no arguments needed
     * @throws DocumentException 
     * @throws IOException
     */
    public static void main(String[] args) throws IOException, DocumentException {
        new FontTypes().createPdf(FontTypes.RESULT);
        Set<String> set = new ListUsedFonts().listFonts(FontTypes.RESULT);
        PrintWriter out = new PrintWriter(new FileOutputStream(RESULT));
        for (String fontname : set)
            out.println(fontname);
        out.flush();
        out.close();
    }
}

/**
 * Creates a set containing information about the not-embedded fonts within the src PDF file.
 * @param src the path to a PDF file
 * @throws IOException
 */
public Set<String> listFonts(String src) throws IOException {
    Set<String> set = new TreeSet<String>();
    PdfReader reader = new PdfReader(src);
    PdfDictionary resources;
    for (int k = 1; k <= reader.getNumberOfPages(); ++k) {
        resources = reader.getPageN(k).getAsDict(PdfName.RESOURCES);
        processResource(set, resources);
    }
    reader.close();
    return set;
}

/**
 * Finds out if the font is an embedded subset font.
 * @param name the BaseFont name, including the leading '/'
 * @return true if the name denotes an embedded subset font
 */
private boolean isEmbeddedSubset(String name) {
    // a subset name looks like "/ABCDEF+FontName": '/' + six-character tag + '+'
    return name != null && name.length() > 8 && name.charAt(7) == '+';
}

/**
 * Adds the font's name to the set if the font is not embedded.
 * Composite fonts carry no descriptor of their own, so their
 * DescendantFonts are checked instead.
 */
private void processFont(PdfDictionary font, Set<String> set) {
    if (font == null)
        return;
    PdfName baseFont = font.getAsName(PdfName.BASEFONT);
    // Type 3 fonts have no BaseFont; their glyphs are defined inside the PDF itself
    if (baseFont == null)
        return;
    String name = baseFont.toString();
    if (isEmbeddedSubset(name))
        return;

    PdfDictionary desc = font.getAsDict(PdfName.FONTDESCRIPTOR);
    if (desc == null) {
        PdfArray descendants = font.getAsArray(PdfName.DESCENDANTFONTS);
        if (descendants == null) {
            // neither a descriptor nor descendant fonts: not embedded
            set.add(name.substring(1));
        }
        else {
            for (int i = 0; i < descendants.size(); i++)
                processFont(descendants.getAsDict(i), set);
        }
    }
    else if (desc.get(PdfName.FONTFILE) == null        // no Type 1 font program
            && desc.get(PdfName.FONTFILE2) == null     // no TrueType font program
            && desc.get(PdfName.FONTFILE3) == null) {  // no FontFile3 font program
        set.add(name.substring(1));
    }
}

/**
 * Extracts the names of the not-embedded fonts from page or XObject resources.
 * @param set the set with the font names
 * @param resources the resources dictionary
 */
public void processResource(Set<String> set, PdfDictionary resource) {
    if (resource == null)
        return;
    PdfDictionary xobjects = resource.getAsDict(PdfName.XOBJECT);
    if (xobjects != null) {
        for (PdfName key : xobjects.getKeys()) {
            processResource(set, xobjects.getAsDict(key));
        }
    }
    PdfDictionary fonts = resource.getAsDict(PdfName.FONT);
    if (fonts == null)
        return;
    PdfDictionary font;
    for (PdfName key : fonts.getKeys()) {
        font = fonts.getAsDict(key);                           
        processFont(font, set);
    }
}

The code above retrieves the fonts that are not embedded in the given PDF file. I've improved the code from iText in Action so that it also handles a font's DescendantFonts entry. Besides the imports of the original example, it needs com.itextpdf.text.pdf.PdfArray.
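The subset test in isEmbeddedSubset only looks for a '+' at index 7, so a name such as /MyFont+Bold would also pass. A stricter variant, sketched below, requires the tag to be exactly six uppercase letters, which is the form subset tags take in PDF font names. `SubsetTag`, `hasSubsetTag` and `plainName` are hypothetical names introduced for this illustration. The tag is only a naming convention, so a cautious validator might still confirm that the font descriptor carries a font file entry.

```java
import java.util.regex.Pattern;

/** Hypothetical helper (not an iText class): stricter subset-tag detection. */
final class SubsetTag {

    // optional leading '/', six uppercase letters, '+', then the real font name
    private static final Pattern SUBSET = Pattern.compile("/?[A-Z]{6}\\+.+");

    private SubsetTag() { }

    /** Returns true if the BaseFont name starts with a six-letter subset tag. */
    static boolean hasSubsetTag(String baseFont) {
        return baseFont != null && SUBSET.matcher(baseFont).matches();
    }

    /** Returns the font name without the leading '/' and without the subset tag. */
    static String plainName(String baseFont) {
        String name = baseFont.startsWith("/") ? baseFont.substring(1) : baseFont;
        return hasSubsetTag(name) ? name.substring(7) : name;
    }

    public static void main(String[] args) {
        System.out.println(hasSubsetTag("/ABCDEF+ArialMT")); // true
        System.out.println(hasSubsetTag("/MyFont+Bold"));    // false: tag is not six uppercase letters
        System.out.println(plainName("/ABCDEF+ArialMT"));    // ArialMT
    }
}
```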

When you create a Chunk, you declare which font it uses.
Create a BaseFont from the font you want to use and declare it as BaseFont.EMBEDDED.
Note that if the subset option is not set to true, the whole font will be embedded.

Be aware that embedding a font might violate the font author's rights.

The simplest answer is to open the PDF file with Adobe Acrobat, then:

  1. click on File
  2. select Properties
  3. click on the Fonts tab

This will show you a list of all fonts in the document. Any font that is embedded will display "(Embedded)" next to the font name.

For example:

ACaslonPro-Bold (Embedded)

where ACaslonPro-Bold is derived from the file name that you embedded it with (e.g. FontFactory.register("/path/to/ACaslonPro-Bold.otf",...

I don't think this is an "iText" use case. Use either PDFBox or jPod. These implement the PDF model and as such enable you to:

  • open the document
  • recurse from the document root down the object tree
  • check if this is a font object
  • check if the font file is available
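The steps above can be sketched on a toy object model. In the example below, nested Map and List objects stand in for PDF dictionaries and arrays. It does not use PDFBox or jPod, and `FontTreeWalk` with all of its methods is hypothetical. It reports every font dictionary in the tree that has no font file, whether or not the font is actually used on a page.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model: Maps stand in for PDF dictionaries, Lists for PDF arrays. */
final class FontTreeWalk {

    /** Collects the BaseFont of every font dictionary that has no font file. */
    static Set<String> findNonEmbedded(Object root) {
        Set<String> result = new TreeSet<>();
        // identity-based set: the object graph contains cycles (e.g. Parent links)
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        walk(root, seen, result);
        return result;
    }

    private static void walk(Object node, Set<Object> seen, Set<String> out) {
        if (node instanceof Map) {
            if (!seen.add(node)) return;               // already visited
            Map<?, ?> dict = (Map<?, ?>) node;
            if (isCheckableFont(dict) && !hasFontFile(dict))
                out.add(String.valueOf(dict.get("BaseFont")));
            for (Object child : dict.values()) walk(child, seen, out);
        } else if (node instanceof List) {
            if (!seen.add(node)) return;
            for (Object child : (List<?>) node) walk(child, seen, out);
        }
    }

    private static boolean isCheckableFont(Map<?, ?> dict) {
        Object subtype = dict.get("Subtype");
        return "Font".equals(dict.get("Type"))
                && !"Type0".equals(subtype)   // composite: DescendantFonts are visited as children
                && !"Type3".equals(subtype);  // glyphs are defined inside the PDF itself
    }

    private static boolean hasFontFile(Map<?, ?> font) {
        Object desc = font.get("FontDescriptor");
        if (!(desc instanceof Map)) return false;
        Map<?, ?> d = (Map<?, ?>) desc;
        return d.containsKey("FontFile") || d.containsKey("FontFile2") || d.containsKey("FontFile3");
    }

    private static Map<String, Object> dict(String type) {
        Map<String, Object> m = new LinkedHashMap<>();
        if (type != null) m.put("Type", type);
        return m;
    }

    /** Builds a one-page sample with one non-embedded and one embedded font. */
    static Map<String, Object> sampleDocument() {
        Map<String, Object> helvetica = dict("Font");
        helvetica.put("Subtype", "Type1");
        helvetica.put("BaseFont", "Helvetica");

        Map<String, Object> descriptor = dict("FontDescriptor");
        descriptor.put("FontFile2", "<font program stream>");
        Map<String, Object> arial = dict("Font");
        arial.put("Subtype", "TrueType");
        arial.put("BaseFont", "ArialMT");
        arial.put("FontDescriptor", descriptor);

        Map<String, Object> fonts = dict(null);
        fonts.put("F1", helvetica);
        fonts.put("F2", arial);
        Map<String, Object> resources = dict(null);
        resources.put("Font", fonts);

        Map<String, Object> page = dict("Page");
        page.put("Resources", resources);
        Map<String, Object> pages = dict("Pages");
        pages.put("Kids", Collections.singletonList(page));
        page.put("Parent", pages);                     // back-reference creates a cycle

        Map<String, Object> catalog = dict("Catalog");
        catalog.put("Pages", pages);
        return catalog;
    }

    public static void main(String[] args) {
        System.out.println(findNonEmbedded(sampleDocument())); // [Helvetica]
    }
}
```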

Checking that only embedded fonts are used is far more complex (that is, fonts that are not embedded but also not used are fine).
