PDFBox API: How to change font to handle Cyrillic values in an AcroForm field

I need help adding a Cyrillic value to a field using the PDFBox API. Here is what I have so far:

PDDocument document = PDDocument.load(file);
PDDocumentCatalog dc = document.getDocumentCatalog();
PDAcroForm acroForm = dc.getAcroForm();
PDField naziv = acroForm.getField("naziv");
naziv.setValue("Наслов"); // Cyrillic value: this call throws the exception shown below
naziv.setValue("Naslov"); // Latin value: this works

It works perfectly when my input is in the Latin alphabet, but I need to handle Cyrillic input as well. How can I do it?

P.S. This is the exception I get: Caused by: java.lang.IllegalArgumentException: U+043D ('afii10079') is not available in this font Helvetica encoding: WinAnsiEncoding
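The exception says that the field's font (Helvetica with WinAnsiEncoding) has no code for U+043D. As a rough pre-check, the sketch below uses the JDK's windows-1252 charset as a stand-in for WinAnsiEncoding. That is an assumption: the two are closely related but not guaranteed to be identical, and the class and method names are made up for illustration. The Cyrillic sample is written with Unicode escapes so the file compiles regardless of source encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

class WinAnsiCheck {
    /**
     * Returns true if every character of the text can be encoded in windows-1252.
     * windows-1252 is used here only as an approximation of PDF's WinAnsiEncoding.
     */
    static boolean fitsWindows1252(String text) {
        CharsetEncoder encoder = Charset.forName("windows-1252").newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        String latin = "Naslov";
        // The Cyrillic value from the question, spelled with Unicode escapes
        String cyrillic = "\u041D\u0430\u0441\u043B\u043E\u0432";
        System.out.println(fitsWindows1252(latin));
        System.out.println(fitsWindows1252(cyrillic));
    }
}
```

A value that fails this check will most likely need a different font in the field, which is what the answer below sets up.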

The code below adds a suitable font to the AcroForm default resource dictionary and replaces the font name in each text field's default appearance string. When you call setValue(), PDFBox recreates the field's appearance stream using the new font.

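// Imports for the PDFBox 2.x API used below
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Iterator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDResources;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDType0Font;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.pdmodel.interactive.form.PDField;
import org.apache.pdfbox.pdmodel.interactive.form.PDTextField;

public static void main(String[] args) throws IOException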
{
    PDDocument doc = PDDocument.load(new File("ZPe.pdf"));
    PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
    PDResources dr = acroForm.getDefaultResources();

    // Important: the font is Type0 (allows more than 256 glyphs) and NOT SUBSETTED
    PDFont font = PDType0Font.load(doc, new FileInputStream("c:/windows/fonts/arial.ttf"), false);

    COSName fontName = dr.add(font);
    Iterator<PDField> it = acroForm.getFieldIterator();
    while (it.hasNext())
    {
        PDField field = it.next();
        if (field instanceof PDTextField)
        {
            PDTextField textField = (PDTextField) field;
            String da = textField.getDefaultAppearance();

            // replace the font name in the default appearance string (e.g. "/Helv 0 Tf 0 g")
            Pattern pattern = Pattern.compile("\\/(\\w+)\\s.*");
            Matcher matcher = pattern.matcher(da);
            if (!matcher.find())
            {
                // no font name found in the default appearance: leave this field unchanged
                continue;
            }
            String oldFontName = matcher.group(1);
            da = da.replaceFirst(oldFontName, fontName.getName());

            textField.setDefaultAppearance(da);
        }
    }
    acroForm.getField("name1").setValue("Наслов");
    doc.save("result.pdf");
    doc.close();
}
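The font-name swap in the default appearance string can be tried out without PDFBox. The sketch below is a variation on the regex above, not the answer's exact code: it matches the name that precedes the Tf operator with \S+ (so names containing a hyphen are also covered) and splices the new name in by position instead of using replaceFirst. The class name, method name, and sample strings are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DefaultAppearanceFont {
    // "/<name> <size> Tf": group 1 captures the font resource name
    private static final Pattern FONT_NAME =
            Pattern.compile("/(\\S+)\\s+[-+]?[0-9.]+\\s+Tf");

    /**
     * Returns the default appearance string with its font name replaced,
     * or the input unchanged if no "/name size Tf" sequence is found.
     */
    static String replaceFontName(String da, String newFontName) {
        Matcher m = FONT_NAME.matcher(da);
        if (!m.find()) {
            return da;
        }
        return da.substring(0, m.start(1)) + newFontName + da.substring(m.end(1));
    }

    public static void main(String[] args) {
        System.out.println(replaceFontName("/Helv 0 Tf 0 g", "F3"));
        System.out.println(replaceFontName("/Helvetica-Bold 12 Tf 0 g", "F3"));
    }
}
```

In the loop above, the second argument would be fontName.getName(), the name returned when the font was added to the default resources.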

Update 4.4.2019: To save some space, it may be useful to remove the existing appearance before calling setValue:

acroForm.getField("name1").getWidgets().get(0).setAppearance(null);

To check whether there are unused fonts in the AcroForm default resources, see this answer.

Update 7.4.2019: You may experience poor performance if the font is very large (e.g. ArialUni) and many fields are to be set (PDFBOX-4508). In that case, save and reload the file before calling setValue.

To find out whether a font supports a given text, call PDFont.encode() and check for an IllegalArgumentException.
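The encode-and-catch check can be wrapped in a small helper. To keep the sketch runnable without PDFBox on the classpath, it takes the encoder as a functional interface. TextEncoder, supports, and latin1Only are hypothetical names, and latin1Only is only a fake encoder standing in for a real font. With PDFBox 2.x, a method reference such as font::encode should fit the interface, assuming PDFont.encode(String) returns byte[] and declares IOException.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class FontSupportCheck {
    /** Hypothetical stand-in for a method reference such as font::encode. */
    @FunctionalInterface
    interface TextEncoder {
        byte[] encode(String text) throws IOException;
    }

    /** Returns true if the encoder accepts the text, false if it rejects a character. */
    static boolean supports(TextEncoder encoder, String text) {
        try {
            encoder.encode(text);
            return true;
        } catch (IllegalArgumentException e) {
            // thrown when a character is not available in the font
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Fake encoder for the demo: rejects anything outside Latin-1. */
    static byte[] latin1Only(String text) {
        for (char c : text.toCharArray()) {
            if (c > 0xFF) {
                throw new IllegalArgumentException("not available: U+" + Integer.toHexString(c));
            }
        }
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The Cyrillic value from the question, spelled with Unicode escapes
        String cyrillic = "\u041D\u0430\u0441\u043B\u043E\u0432";
        System.out.println(supports(FontSupportCheck::latin1Only, "Naslov"));
        System.out.println(supports(FontSupportCheck::latin1Only, cyrillic));
    }
}
```

Running the check before setValue lets you pick a fallback font or report the problem for a specific field instead of failing midway through filling the form.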
