
How can I check if a PDF file is using embedded fonts?

I have a folder where multiple clients upload multiple PDF files. Some of them use embedded fonts, some don't.
I've been working on a service that optimizes (in terms of file size) the PDF files in this folder.
Each user may upload around 400 files, ranging from 80K to 10M each, and my task is to optimize all of them to the smallest possible file size with minimal quality loss.

The PDF Library is doing a great job with it. My only problem is that I can't remove all embedded fonts from all files, since some of the files might use these fonts and the result would be a file that I can't use.

So my questions are:

  1. How can I detect which files use embedded fonts and which don't?
  2. When optimizing the files that use embedded fonts, how can I remove only the unused fonts?

What I want to achieve is to remove all embedded fonts from most of the files, but keep the embedded fonts in the files where I actually need them. I understand that it depends on the fonts I have on my system (these files should stay on a single system, so portability is not that important to me), so I'm trying to find a way to identify, before optimizing, which files will look OK without embedded fonts and which files need to keep them.

APDFL has a PDFontIsEmbedded() call. The DotNet interface's Font class has an Embedded property. Saving with the GarbageCollect SaveFlag should remove any unreferenced indirect objects, including fonts.
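
For context on what an "embedded" check looks at: in the PDF format, a font program is embedded when the font's descriptor carries a FontFile, FontFile2, or FontFile3 entry. The Python sketch below applies that rule to plain dictionaries standing in for parsed PDF font objects. It does not call APDFL; the function names and sample dictionaries are made up for illustration.

```python
# Keys in a PDF font descriptor that hold an embedded font program:
# FontFile (Type 1), FontFile2 (TrueType), FontFile3 (CFF / OpenType).
FONT_FILE_KEYS = ("FontFile", "FontFile2", "FontFile3")


def is_font_embedded(font):
    """Return True if the font dictionary carries its own glyph data.

    `font` is a plain dict standing in for a parsed PDF font dictionary
    (hypothetical representation, not an APDFL object).
    """
    subtype = font.get("Subtype")
    if subtype == "Type3":
        # Type 3 fonts define their glyphs as streams inside the PDF itself.
        return True
    if subtype == "Type0":
        # Composite fonts keep the descriptor on their descendant CIDFont.
        descendants = font.get("DescendantFonts", [])
        return any(is_font_embedded(d) for d in descendants)
    descriptor = font.get("FontDescriptor", {})
    return any(key in descriptor for key in FONT_FILE_KEYS)


def file_needs_system_fonts(fonts):
    """True if at least one font in the document is not embedded."""
    return any(not is_font_embedded(f) for f in fonts)


# Sample data (invented): a standard font with no descriptor, and a
# subsetted TrueType font with an embedded font program.
helvetica = {"Subtype": "Type1", "BaseFont": "Helvetica"}
embedded_ttf = {
    "Subtype": "TrueType",
    "BaseFont": "ABCDEF+Calibri",
    "FontDescriptor": {"FontName": "ABCDEF+Calibri", "FontFile2": "<stream>"},
}
print(is_font_embedded(helvetica))     # False
print(is_font_embedded(embedded_ttf))  # True
```

A file where file_needs_system_fonts() returns True already relies on fonts installed on the machine, so it is the kind of file to inspect before stripping anything further.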

Note that resource dictionaries can be shared by multiple pages, so a font not used by one page might still be used by another page that uses the same resource dictionary.
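
To make the shared-dictionary caveat concrete, the Python sketch below treats a font as removable only when no page that shares its resource dictionary selects it. The input shapes (a resource dictionary id plus the font names each page's content selects) are a hypothetical stand-in for what a PDF parser would give you, not an APDFL structure.

```python
def unused_fonts(resource_dicts, pages):
    """Return {resource_id: set of font names that no page uses}.

    resource_dicts: {resource_id: set of font names listed in that dictionary}
    pages: list of (resource_id, set of font names the page content selects)
    Both shapes are hypothetical stand-ins for parsed PDF data.
    """
    used = {rid: set() for rid in resource_dicts}
    for rid, fonts_on_page in pages:
        # Pages that share a dictionary pool their usage under one id.
        used.setdefault(rid, set()).update(fonts_on_page)
    return {rid: fonts - used[rid] for rid, fonts in resource_dicts.items()}


resource_dicts = {"R1": {"F1", "F2", "F3"}}
pages = [
    ("R1", {"F1"}),  # page 1 uses only F1
    ("R1", {"F2"}),  # page 2 shares R1 and uses F2
]
print(unused_fonts(resource_dicts, pages))  # {'R1': {'F3'}}
```

Judged from page 1 alone, F2 would look unused; across both pages only F3 is safe to drop.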

Adobe PDF Library version 15 and up has a service that will optimize PDF files for you.

The Optimizer has a function to subset all embedded fonts. It creates a subset of each font limited to only the glyphs of that font actually used by the document. The API is below.

void Datalogics::PDFL::PDFOptimizer::SetOption (OptimizerOption option, bool value)
void Datalogics::PDFL::PDFOptimizer::Optimize (Document document, string newPath)

This is the option that you need:

SubsetAllEmbeddedFonts 
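
As a rough illustration of what subsetting keeps, the Python sketch below collects, per font, the characters a document actually draws; a subsetter would retain only the glyphs for those. This is a conceptual model and not how the Optimizer is implemented: the (font name, text) pairs are a hypothetical stand-in for the text-showing operations in a document, and treating one character as one glyph is a simplification.

```python
def chars_used_per_font(text_runs):
    """Map each font name to the set of characters drawn with it.

    text_runs: iterable of (font_name, text) pairs (hypothetical input shape).
    Each character is treated as one glyph, which is a simplification.
    """
    used = {}
    for font_name, text in text_runs:
        used.setdefault(font_name, set()).update(text)
    return used


# Invented sample: two runs in one font, one run in another.
runs = [("Calibri", "Invoice"), ("Calibri", "No. 42"), ("Symbol", "±")]
used = chars_used_per_font(runs)
for font_name in sorted(used):
    print(font_name, sorted(used[font_name]))
```

A font with thousands of glyphs that is only used for a few words shrinks to a handful of glyphs this way, which is why subsetting can save space without removing the font outright.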
