
Format Numbers using PDFBox

I have a PDF file (which I cannot edit) containing a table into which I can enter numbers. The bottom table cell automatically sums up the input. When I enter the numbers manually (using Acrobat Reader), they are formatted correctly and the sum works fine. When I fill them in using PDFBox, they are not formatted, i.e. the thousands separator is missing, and the sum is not calculated. I can calculate the sum myself and enter it into the field, though. This is all in the German locale, by the way.

After I fill the PDF using PDFBox, other users might open it in Acrobat Reader and enter more numbers or edit the existing ones, so the sum has to work properly. Here is a screenshot of what I mean: [screenshot: the numbers on the left are not formatted correctly]

Is there any way to tell the form fields to reformat their input to reflect the format that they have specified internally?

When I manually format my number, which I have as a Double, with the pattern "###,##0.00", the sum no longer works. When I then change any of the inputs by hand, the sum is recalculated and I get the error "The value entered does not match the format of the field". Unfortunately, I can't share the file directly for confidentiality reasons, but I could try to create one of my own containing only the table, if needed...

Locale.setDefault(Locale.GERMAN);

File bbb = //obviously instantiated to where the file is
InputStream in = new FileInputStream(bbb);
PDDocument doc = PDDocument.load(in);
PDAcroForm acro = doc.getDocumentCatalog().getAcroForm();

//using the following line messes up the sum
acro.getField("row1").setValue(new DecimalFormat("###,##0.00").format(1000));

//using the following line works (including sum) but no thousands separator
acro.getField("row1").setValue(new DecimalFormat("###,##0.00").format(1000).replaceAll("\\.", ""));

The problem is that AcroForms, aside from their declarative layout (which PDFBox can parse and analyze), can also contain scripts written in JavaScript. For obvious reasons (such as the lack of a complete PDF data model and interpreter), PDFBox does not evaluate these scripts.

You can extract the scripts from the PDF and then try to mimic the behavior of the JavaScript within your Java code. Where the scripts live depends on the form type: in XFA-based forms, the form is an XML document and the scripts sit in script elements; in plain AcroForms, they are stored in the fields' action dictionaries. Other than that, not much can be done.
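As a rough illustration of mimicking the form's behavior in Java, the sketch below adds up the row values and renders each number with a decimal comma and no thousands separator, which is the variant the question reports as working. The class and method names are made up for this example, and the two-decimal pattern is an assumption taken from the question's "###,##0.00".

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of PDFBox.
final class SumFieldSketch {

    /** Adds up the row values, standing in for the sum script that PDFBox does not run. */
    static BigDecimal sum(List<BigDecimal> rows) {
        return rows.stream().reduce(BigDecimal.ZERO, BigDecimal::add);
    }

    /** Renders a value with a decimal comma and no grouping, e.g. 1000,00. */
    static String toFieldValue(BigDecimal value) {
        // Explicit German symbols, so the result does not depend on Locale.setDefault.
        DecimalFormat format = new DecimalFormat("0.00", new DecimalFormatSymbols(Locale.GERMAN));
        return format.format(value);
    }

    public static void main(String[] args) {
        List<BigDecimal> rows = List.of(new BigDecimal("1000"), new BigDecimal("234.5"));
        for (BigDecimal row : rows) {
            System.out.println(toFieldValue(row));   // 1000,00 then 234,50
        }
        System.out.println(toFieldValue(sum(rows))); // 1234,50
    }
}
```

Each resulting string could then be passed to setValue on the matching field. Whether Acrobat Reader accepts the value afterwards still depends on the field's own scripts, so this is worth checking against the actual file.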

As mentioned by Piotr Wilkin, the field format is defined in JavaScript code. I used the following code to extract the script that defines the format:

String js = Optional.ofNullable(acroForm.getField(fieldName)).map(PDField::getCOSObject)
        // Additional-actions dictionary. Defining the actions to be taken in response to various trigger events.
        .map(d -> (COSDictionary) d.getDictionaryObject(COSName.AA))
        // F dictionary. A JavaScript action to be performed before the field is formatted to display its current value.
        .map(d -> (COSDictionary) d.getDictionaryObject(COSName.F))
        // JS string. A string or stream containing the JavaScript script to be executed.
        .map(d -> d.getString(COSName.JS))
        .orElse(null);

That gets you the JavaScript code that defines the format. After that, it is a matter of parsing it in whatever way you need, depending on the original field type.
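What the parsing step might look like for a number field is sketched below. It assumes the extracted script is a call to Acrobat's built-in AFNumber_Format(nDec, sepStyle, ...), which is common for number fields but not confirmed for the file in the question. The mapping of sepStyle values to separators is likewise an assumption noted in the code, and the class name is hypothetical.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox.
final class AcroNumberFormatSketch {

    // Captures the first two arguments of AFNumber_Format(nDec, sepStyle, ...).
    private static final Pattern AF_NUMBER_FORMAT =
            Pattern.compile("AFNumber_Format\\(\\s*(\\d+)\\s*,\\s*(\\d+)");

    /** Builds a DecimalFormat from the format script, or returns null if it is not recognized. */
    static DecimalFormat fromScript(String js) {
        if (js == null) {
            return null;
        }
        Matcher m = AF_NUMBER_FORMAT.matcher(js);
        if (!m.find()) {
            return null;
        }
        int decimals = Integer.parseInt(m.group(1));
        int sepStyle = Integer.parseInt(m.group(2));

        // Assumed mapping: 0 = 1,234.56   1 = 1234.56   2 = 1.234,56   3 = 1234,56
        // Any other style falls back to "no grouping, decimal point".
        boolean grouping = sepStyle == 0 || sepStyle == 2;
        boolean commaDecimal = sepStyle == 2 || sepStyle == 3;

        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setDecimalSeparator(commaDecimal ? ',' : '.');
        symbols.setGroupingSeparator(commaDecimal ? '.' : ',');

        DecimalFormat format = new DecimalFormat("#,##0", symbols);
        format.setGroupingUsed(grouping);
        format.setMinimumFractionDigits(decimals);
        format.setMaximumFractionDigits(decimals);
        return format;
    }

    public static void main(String[] args) {
        DecimalFormat f = fromScript("AFNumber_Format(2, 2, 0, 0, \"\", true);");
        System.out.println(f.format(1000)); // 1.000,00
    }
}
```

Returning null for unrecognized scripts keeps the caller in charge of the fallback, for example writing the raw number when a field has a custom format script.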
