Rupee symbol (₹) in PDF is missing after upload to S3
I have a PDF which has the rupee symbol (₹) in it. I am using aws-sdk with Node.js to upload the PDF to S3. The rupee symbol is missing after uploading to S3.
When I upload from my local machine, it works fine. Whereas on EKS, the rupee symbol is missing in the PDF. The same behaviour happens when I upload a file to S3 through API Gateway.
Thank you
const AWS = require("aws-sdk");
const fs = require("fs");

// filePath is not shown in the original snippet.
// No encoding argument is passed, so readFileSync returns a raw Buffer.
const content = fs.readFileSync(filePath);

const uploadToS3UsingSdk = async (bucket, key, content) => {
  return new Promise((resolve, reject) => {
    const awsConfig = {
      accessKeyId: process.env.accessKeyId,
      secretAccessKey: process.env.secretAccessKey,
      region: process.env.region,
      apiVersion: "2006-03-01",
    };
    const s3 = new AWS.S3(awsConfig);
    const uploadParams = {
      Bucket: bucket,
      Key: key,
      Body: content,
      ContentType: "application/pdf;charset=utf-8",
    };
    s3.upload(uploadParams, function (err, data) {
      if (err) {
        console.log("Error", err);
        return reject({
          isSuccess: false,
          // SDK errors are Error objects, so the text lives in err.message
          errorMessage: err.message,
          status: 500,
        });
      }
      if (data) {
        console.log("Upload Success", data.Location);
        return resolve({
          isSuccess: true,
          errorMessage: null,
        });
      }
    });
  });
};
PDF is not a WYSIWYG (what you see is what you get) format. Internally, it contains rendering instructions that tell a viewer (such as Adobe Reader) how to build the page.
Your document might contain something like:
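As an illustration of such rendering instructions, here is a hand-written, hypothetical content-stream fragment (not taken from your file) together with a small helper, `fontResourcesUsed`, that lists which font resource names the stream selects. The operators are standard PDF ones: `BT`/`ET` delimit a text object, `Tf` selects a font resource and size, `Td` moves the text position, and `Tj` shows a string. Real content streams are usually compressed, so treat this as a sketch of the idea rather than a parser.

```javascript
// Hypothetical content-stream fragment, written by hand for illustration.
const contentStream = [
  "BT",               // begin text object
  "/F1 12 Tf",        // select font resource F1 at 12 pt
  "50 700 Td",        // move to position (50, 700)
  "(Hello World) Tj", // show the string using the selected font
  "ET",               // end text object
].join("\n");

// Collect the font resource names that are selected with the Tf operator.
function fontResourcesUsed(stream) {
  const names = new Set();
  for (const match of stream.matchAll(/\/([^\s\/]+)\s+[\d.]+\s+Tf\b/g)) {
    names.add(match[1]);
  }
  return [...names];
}

console.log(fontResourcesUsed(contentStream)); // [ 'F1' ]
```

Note that the stream only names `F1`; nothing in it says which actual font that is.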
A PDF will also contain a so-called resource dictionary, which clarifies which font F1 is.
This is where it might go wrong.
The PDF specification (ISO 32000) defines a handful of fonts as special (the standard Type 1 fonts). These fonts should always be present in the reader.
They include the Times, Helvetica and Courier families, plus Symbol and ZapfDingbats.
When a piece of software builds a PDF it has 2 options:
1. use one of the standard fonts and rely on the reader to supply it
2. embed the font in the PDF
If option 1 is selected, you are bound to those characters that are defined in the standard fonts. Not every font contains every character (for instance, none of the standard 14 contains Chinese characters).
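To make the option 1 limitation concrete, here is a rough sketch that flags characters a standard font is unlikely to be able to show. It assumes the text fonts are used with WinAnsiEncoding and approximates that encoding as printable Latin-1 plus the extra characters Windows-1252 places in the 0x80-0x9F range; the helper name `charsOutsideWinAnsi` is made up for this example. The rupee sign (U+20B9) falls outside that set.

```javascript
// Characters that Windows-1252 maps into the 0x80-0x9F range.
const CP1252_EXTRAS = new Set("€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ");

// Return the distinct characters in `text` that fall outside the
// approximated WinAnsi character set (an assumption, see above).
function charsOutsideWinAnsi(text) {
  const missing = new Set();
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp < 0x20) continue; // control characters (newline, tab) are not glyphs
    const isLatin1Printable =
      (cp >= 0x20 && cp <= 0x7e) || (cp >= 0xa0 && cp <= 0xff);
    if (!isLatin1Printable && !CP1252_EXTRAS.has(ch)) {
      missing.add(ch);
    }
  }
  return [...missing];
}

console.log(charsOutsideWinAnsi("Total: ₹500 or €6")); // [ '₹' ]
```

Any character this reports would need an embedded font that actually contains the glyph.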
If option 2 is selected, the font file is embedded either in its entirety or partially in the PDF.
Partially embedded fonts are called subset fonts. This is a feature typically used when the font is large (contains a lot of characters) but the PDF doesn't use all those characters.
To put it simply, if the PDF only contains the text "Hello World", then there is no point in adding information on how to render the character "A".
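One way to get a first impression of which situation a file is in is to scan its raw bytes for font entries. The sketch below is a heuristic, not a PDF parser: the function name `listBaseFonts` is hypothetical, and it will miss fonts declared inside compressed object streams. It relies on two conventions from the PDF specification: subset fonts carry a tag of six uppercase letters followed by `+` in their `/BaseFont` name, and embedded font programs are referenced through `/FontFile`, `/FontFile2` or `/FontFile3`.

```javascript
// Heuristic scan of raw PDF bytes for font names and embedded font programs.
function listBaseFonts(pdfBytes) {
  // latin1 maps every byte to exactly one character, so binary data
  // does not get mangled while we search it as text.
  const text = Buffer.from(pdfBytes).toString("latin1");
  const fonts = new Map();
  for (const match of text.matchAll(/\/BaseFont\s*\/([^\s\/<>\[\]()]+)/g)) {
    const name = match[1];
    // Subset fonts are named like "ABCDEF+FontName".
    fonts.set(name, { name, isSubset: /^[A-Z]{6}\+/.test(name) });
  }
  return {
    fonts: [...fonts.values()],
    hasEmbeddedFontProgram: /\/FontFile[23]?\b/.test(text),
  };
}

// Hand-written sample objects, not a complete PDF.
const sample = Buffer.from(
  "<< /Type /Font /Subtype /TrueType /BaseFont /ABCDEF+NotoSans >>\n" +
    "<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>\n" +
    "<< /Type /FontDescriptor /FontFile2 9 0 R >>",
  "latin1"
);
console.log(listBaseFonts(sample));
```

On a real file you would pass the result of `fs.readFileSync(filePath)`. A font that shows up without any embedded font program is one the viewer has to supply itself.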
These are possible things that might be wrong with your PDF:
There is an online tool to validate PDF documents. It's called VeraPDF. You can find it here.
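Since the same file renders correctly when uploaded from a local machine, it may also be worth ruling out that the bytes changed in transit, for example if some layer treated the binary body as text. A minimal sketch, assuming you download the object from S3 again and compare it with the local file; the helpers `sha256Hex` and `sameBytes` are names invented here, and the "mangled" buffer only simulates one way such corruption could happen.

```javascript
const { createHash } = require("crypto");

// Hex-encoded SHA-256 digest of a Buffer.
function sha256Hex(bytes) {
  return createHash("sha256").update(bytes).digest("hex");
}

// True when both buffers hold exactly the same bytes.
function sameBytes(localBytes, downloadedBytes) {
  return sha256Hex(localBytes) === sha256Hex(downloadedBytes);
}

// "%PDF" followed by some high bytes, as found near the start of many PDFs.
const original = Buffer.from([0x25, 0x50, 0x44, 0x46, 0xe2, 0xe3, 0xcf, 0xd3]);
// Simulate a layer that decoded the binary body as UTF-8 text and re-encoded it.
const mangled = Buffer.from(original.toString("utf8"), "utf8");

console.log(sameBytes(original, original)); // true
console.log(sameBytes(original, mangled)); // false
```

If the hashes match, the upload did not alter the file and the font explanations above are the more likely cause. If they differ, the transfer path is worth a closer look before the PDF itself.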