Extract Page 1 Header from Word Doc

I'm trying to extract multiple lines of text from the page 1 Header of an MS Word document (.docx). I'm using python-docx but can't work out how specific I need to be in order to get only the first-page header.

Code is currently:

from docx import Document
document = Document("path.docx")
section = document.sections[0]
header = section.header
print(header.paragraphs[0].text)

With the output: "Name of File; Smith; Page"

Screenshots are linked for the content I'm referring to, Header versus Running Header. I want the Header; I don't care about the Running Header: Header 1, Running Header

Any help appreciated! I've looked at the documentation for headers in general ( https://python-docx.readthedocs.io/en/latest/user/hdrftr.html ) but it does not go into specifics for dealing with the Different First Page Header feature of MS Word.

In Word, each section has three headers and three footers.

They are not assigned per page; instead there is the primary (odd-page) header, the even-page header, and the first-page header.

There is no Sections(0) in the Word object model; the numbering starts with 1. (python-docx's document.sections is zero-based, so sections[0] there is this same first section.) Every document has at least one section. Here is my web page on sections if you need more about them and headers and footers.

The header on the first page will be either the first-page header of Section 1 or the primary header of Section 1. The VBA code for the primary is ActiveDocument.Sections(1).Headers(wdHeaderFooterPrimary).Range.Text ; that for the first page is ActiveDocument.Sections(1).Headers(wdHeaderFooterFirstPage).Range.Text .
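
Since the question uses python-docx rather than VBA, here is a rough equivalent as a minimal sketch. It assumes a python-docx version that exposes different_first_page_header_footer and first_page_header on a section, and that the header text sits in ordinary paragraphs (text inside a table or text box in the header would not be picked up this way). The helper names first_page_header_lines and page1_header_lines are made up for illustration.

```python
def first_page_header_lines(section):
    """Return the non-empty text lines of the header shown on the section's first page."""
    # With "Different First Page" switched on, page 1 shows the first-page header;
    # otherwise it shows the primary header, the one section.header returns.
    if section.different_first_page_header_footer:
        header = section.first_page_header
    else:
        header = section.header
    # One entry per header paragraph, skipping blank paragraphs.
    return [p.text for p in header.paragraphs if p.text.strip()]


def page1_header_lines(path):
    """Open a .docx file and return the page 1 header lines of its first section."""
    # Imported here so the helper above can be used without python-docx installed.
    from docx import Document  # pip install python-docx

    document = Document(path)
    return first_page_header_lines(document.sections[0])
```

Calling page1_header_lines("path.docx") and printing each returned line should then give the page 1 Header instead of the Running Header, provided the document really uses Different First Page in its first section.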
