Edit a few lines of uncompressed PDF in Python
I want to edit a few lines in an uncompressed PDF. I found a similar question, but since I need to scan the file several times to find the exact positions of the lines I want to change, that approach doesn't really suit my case (and the sheer number of regex matches is higher than I would like). The PDF contains UTF-8-encodable lines (I want to edit a few of them, bookmark target IDs in particular) and a lot of binary blobs (images and so on, I guess). When I edit the file with Notepad it works fine, but when I do it programmatically (read it in, change a few lines, write it back), images and some formatting are missing, since they are never read in the first place because of the errors='ignore' option:
with codecs.open("merged-uncompressed.pdf", "r", encoding='ascii', errors='ignore') as f:
I can read the file in with errors="surrogateescape" and wanted to map the lines from the import above onto it, but I don't know whether this approach can work.
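For illustration, here is a minimal sketch of how the errors="surrogateescape" idea could work on its own, without the second ascii/ignore read. It assumes the lines to change can be recognised by their text. The names rewrite_lines and rename_target and the "/Dest (old-id)" pattern are made up for this example. Passing newline="" matters, because otherwise Python translates line endings and would alter the binary blobs.

```python
def rewrite_lines(src_path, dst_path, edit_line):
    """Copy src_path to dst_path, passing every line through edit_line."""
    # errors="surrogateescape" turns undecodable bytes into lone surrogates
    # on read and back into the same bytes on write.
    # newline="" switches off newline translation in both directions.
    options = dict(encoding="utf-8", errors="surrogateescape", newline="")
    with open(src_path, "r", **options) as src, \
            open(dst_path, "w", **options) as dst:
        for line in src:
            dst.write(edit_line(line))


def rename_target(line):
    # Hypothetical edit: rename one bookmark target id.
    return line.replace("/Dest (old-id)", "/Dest (new-id)")
```

Calling rewrite_lines("merged-uncompressed.pdf", "edited.pdf", rename_target) would then write a copy in which only the matching lines differ. The output file name is an assumption as well.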
Does anyone know how to deal with this?
Best, Lukas
I was able to solve this:我能够解决这个问题:
The code is very messy at the moment, so I don't want to publish it right now. I do want to put it on GitHub within the next few weeks, though. If anyone needs it, just comment and it will get higher priority.
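In the meantime, here is a rough sketch of one general way to edit such a file without losing the blobs. It is an illustration only, not the code mentioned above. It works on raw bytes and pads the replacement with spaces so the file length stays the same, on the assumption that the PDF's cross-reference table stores byte offsets that should not shift. The function names and the example patterns are hypothetical, and the padding is only safe where extra whitespace after the replaced token is harmless.

```python
from pathlib import Path


def replace_same_length(data: bytes, old: bytes, new: bytes) -> bytes:
    """Replace old with new in data without changing the total length."""
    if len(new) > len(old):
        raise ValueError("replacement must not be longer than the original")
    # Pad with spaces so everything after the edit keeps its byte offset.
    padded = new + b" " * (len(old) - len(new))
    return data.replace(old, padded)


def edit_pdf(src_path, dst_path, old: bytes, new: bytes) -> None:
    # Binary mode: no decoding and no newline translation, blobs stay intact.
    data = Path(src_path).read_bytes()
    Path(dst_path).write_bytes(replace_same_length(data, old, new))
```

For example, edit_pdf("merged-uncompressed.pdf", "edited.pdf", b"/Dest (old-id)", b"/Dest (id)") would swap the target and leave every other byte where it was.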
Thanks to anyone who wanted to help :) Lukas