Is there a way to measure the margins of a PDF using Python?
I've been using different Python packages to parse PDFs, but I'm wondering if it's possible to measure the margins of a particular line in the document. Ideally the measurement would be in CSS-style pixels.
It doesn't need to be that precise; I just want to figure out whether a line is left-aligned, centered, or right-aligned based on its margins, measured from left to right.
Example:
# margin <= x
left-aligned
# margin >= y && margin <= z
center-aligned
# margin >= z
right-aligned
Obviously this is just an example, but there won't be many distinct margin values. The PDFs I'm parsing will likely have (in CSS terms):
margin-left: 0
margin-left: x
margin-left: y
The actual values of x and y are unimportant; what matters is that they'll be consistent.
Sorry if this is confusing. The main thing I'm asking for is clarification or help in figuring out the left margin of every line in a PDF.
Disclaimer: I am the author of borb, the library used in this answer.
You can use SimpleLineOfTextExtraction in borb, which returns the lines of text in a PDF.
You can check out this class here.
Each line has a content box (and a layout box), which gives you the location of that particular line of text on the page.
You can use this to determine whether a line is left-aligned, right-aligned, or centered by comparing it to the lines above and below it.
You can find an example of how to use this class here.
Essentially, you open a document using the PDF.loads method, passing along an EventListener.
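Once you have the horizontal extent of each line, the comparison described above could be sketched roughly as follows. This is a minimal illustration and not part of borb: the function names (pt_to_css_px, classify_lines), the (x0, x1) tuple format, and the tolerance value are all hypothetical, and how you read the coordinates out of each line's box is left out. It assumes coordinates are in PDF points (1/72 inch, the default user-space unit) and converts to CSS pixels at 96 px per inch.

```python
from typing import List, Tuple


def pt_to_css_px(points: float) -> float:
    """Convert PDF points to CSS pixels (1 pt = 1/72 in, 1 CSS px = 1/96 in)."""
    return points * 96.0 / 72.0


def classify_lines(boxes: List[Tuple[float, float]], tol: float = 2.0) -> List[str]:
    """Label each line by comparing its edges to the edges of the whole text block.

    boxes: hypothetical (x0, x1) pairs, the left and right edge of each line in points.
    tol:   assumed tolerance in points for treating two edges as equal.
    """
    if not boxes:
        return []
    # The text block's edges are taken from the outermost lines.
    block_left = min(x0 for x0, _ in boxes)
    block_right = max(x1 for _, x1 in boxes)
    block_mid = (block_left + block_right) / 2.0

    labels = []
    for x0, x1 in boxes:
        if abs(x0 - block_left) <= tol:
            # Lines spanning the full width also land here.
            labels.append("left")
        elif abs(x1 - block_right) <= tol:
            labels.append("right")
        elif abs((x0 + x1) / 2.0 - block_mid) <= tol:
            labels.append("center")
        else:
            labels.append("unknown")  # e.g. an indented line
    return labels


if __name__ == "__main__":
    # Made-up coordinates for four lines on one page.
    sample = [(72.0, 540.0), (72.0, 300.0), (206.0, 406.0), (340.0, 540.0)]
    for (x0, _), label in zip(sample, classify_lines(sample)):
        print(f"left margin {pt_to_css_px(x0):.1f}px -> {label}")
```

Because the margins in your documents are consistent, comparing each line against the block's outer edges is probably enough; if a page mixes several columns or indentation levels, you would likely need to group lines first.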