Converting PDFBox's PDFont to java.awt.Font

Question

I have to read a PDF file and extract some information from it, so I am using PDFBox. Now I want to display the results by drawing them on a JPanel. To do this correctly, I need the font information of the underlying string.

My problem is that I have found no good way to convert a PDFont to a java.awt.Font. I thought of creating a mapping from the string representation of the PDFont and extracting the relevant information from it, like:

Arial -> new Font("Arial", Font.PLAIN, size);
Arial,Bold -> new Font("Arial", Font.BOLD, size);
//and so on

But this doesn't work, because the string representation is formatted differently from font to font, for example:

Times-Roman -> new Font("Times-Roman", Font.PLAIN, size);
Times-Bold -> new Font("Times-Roman", Font.BOLD, size);

Is there a better way to do the conversion?
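
For reference, the name-based mapping described above can be made to tolerate both separators. The following is only a heuristic sketch: `PdfFontNameMapper` and its methods are hypothetical names, the input is assumed to be the font's base name string (for example what `PDFont.getName()` returns in PDFBox 2.x), and the family it guesses may not be installed on the machine, in which case AWT silently falls back to a default font.

```java
import java.awt.Font;
import java.util.Locale;

/** Hypothetical helper: guesses an AWT font from a PDF base font name. */
final class PdfFontNameMapper {

    private PdfFontNameMapper() {}

    /** Removes a six-character subset tag such as "ABCDEF+" if present. */
    static String stripSubsetPrefix(String baseFont) {
        int plus = baseFont.indexOf('+');
        return (plus == 6) ? baseFont.substring(7) : baseFont;
    }

    /** Derives the AWT style bits from keywords anywhere in the name. */
    static int styleOf(String baseFont) {
        String lower = baseFont.toLowerCase(Locale.ROOT);
        int style = Font.PLAIN;
        if (lower.contains("bold")) {
            style |= Font.BOLD;
        }
        if (lower.contains("italic") || lower.contains("oblique")) {
            style |= Font.ITALIC;
        }
        return style;
    }

    /** Takes the text before the first ',' or '-' as the family name. */
    static String familyOf(String baseFont) {
        String name = stripSubsetPrefix(baseFont);
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == ',' || c == '-') {
                return name.substring(0, i);
            }
        }
        return name;
    }

    /** Best-effort guess; the result is not the font embedded in the PDF. */
    static Font toAwtFont(String baseFont, float size) {
        return new Font(familyOf(baseFont), styleOf(baseFont), 1).deriveFont(size);
    }
}
```

With this sketch, `Arial,Bold` and `Times-Bold` both map to a bold font, and `Times-Roman` maps to a plain one. Family names that themselves contain a hyphen would be cut short, so this remains an approximation at best.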

Answer 1

This is not possible.

Quote from this answer:

Be aware that most PDFs do not include the full, complete font face when they have a font embedded. Mostly they include just the subset of glyphs used in the document.

And indeed, org.apache.pdfbox.pdfviewer.PageDrawer uses PDFBox's own org.apache.pdfbox.rendering.Glyph2D class as a bridge between PDFBox and Java AWT: it produces a java.awt.geom.GeneralPath for each glyph, which is transformed into a java.awt.Shape that java.awt.Graphics2D can then draw.
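
To illustrate the path-based drawing described above without depending on PDFBox, here is a minimal AWT-only sketch. `GlyphPathDemo` and its methods are hypothetical, the triangle merely stands in for a real glyph outline, and the 1000-unit glyph space with the y axis pointing up is an assumption made for the example.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.Shape;
import java.awt.geom.AffineTransform;
import java.awt.geom.GeneralPath;
import java.awt.image.BufferedImage;

/** Hypothetical demo: draws an outline path instead of using java.awt.Font. */
final class GlyphPathDemo {

    private GlyphPathDemo() {}

    /** Stand-in outline: a triangle in an assumed 1000-unit glyph space. */
    static GeneralPath sampleOutline() {
        GeneralPath path = new GeneralPath();
        path.moveTo(0f, 0f);
        path.lineTo(1000f, 0f);
        path.lineTo(500f, 1000f);
        path.closePath();
        return path;
    }

    /** Scales glyph units to the font size, flips y, and moves to (x, y). */
    static Shape toDeviceSpace(GeneralPath outline, double x, double y, double fontSize) {
        AffineTransform at = new AffineTransform();
        at.translate(x, y);
        at.scale(fontSize / 1000.0, -fontSize / 1000.0);
        return at.createTransformedShape(outline);
    }

    /** Fills the transformed outline in black on a white image. */
    static BufferedImage render(int width, int height, double fontSize) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, height);
            g.setColor(Color.BLACK);
            g.fill(toDeviceSpace(sampleOutline(), 10, height - 10, fontSize));
        } finally {
            g.dispose();
        }
        return image;
    }
}
```

In a JPanel, the same `g.fill(shape)` call would go into `paintComponent`, using the `Graphics2D` passed in by Swing.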

No java.awt.Font is used in the process, so it is useless to look for one.

However, if you are 'lucky' with the PDF file and an entire font is actually embedded in it, you can go through all the PDFont instances, read PDFont -> FontDescriptor -> FontFile2, and write that stream to a file with a .ttf extension. (Once you have the .ttf stream, you can create a java.awt.Font from it as well.)

That's what I gathered in a couple of hours after seeing this abandoned question. I hope it helps someone.
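
A rough sketch of that last step, assuming you already have the raw FontFile2 bytes in a `byte[]` (in PDFBox 2.x that would presumably come from something like `font.getFontDescriptor().getFontFile2().toByteArray()`, which is left out here so the example stays AWT-only). `EmbeddedTtfLoader` is a hypothetical name. Even a fully embedded font program can be rejected by AWT, for instance when tables it expects are missing, so the result is an `Optional`.

```java
import java.awt.Font;
import java.awt.FontFormatException;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Optional;

/** Hypothetical helper: tries to turn embedded TrueType bytes into an AWT font. */
final class EmbeddedTtfLoader {

    private EmbeddedTtfLoader() {}

    /** Checks the 4-byte sfnt version tag: 0x00010000 or the ASCII tag "true". */
    static boolean looksLikeTrueType(byte[] data) {
        if (data == null || data.length < 4) {
            return false;
        }
        int tag = ((data[0] & 0xFF) << 24)
                | ((data[1] & 0xFF) << 16)
                | ((data[2] & 0xFF) << 8)
                | (data[3] & 0xFF);
        return tag == 0x00010000 || tag == 0x74727565;
    }

    /** Returns the font at the given size, or empty if AWT cannot load it. */
    static Optional<Font> load(byte[] fontFile2Bytes, float size) {
        if (!looksLikeTrueType(fontFile2Bytes)) {
            return Optional.empty();
        }
        try (ByteArrayInputStream in = new ByteArrayInputStream(fontFile2Bytes)) {
            Font base = Font.createFont(Font.TRUETYPE_FONT, in);
            return Optional.of(base.deriveFont(size));
        } catch (FontFormatException | IOException e) {
            // Subset or otherwise incomplete font programs typically end up here.
            return Optional.empty();
        }
    }
}
```

Callers should treat an empty result as the normal case for subset fonts and fall back to another way of drawing the text.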

Answer 1
1 2019-04-27 10:04:55
