
python-pptx: Dealing with password-protected PowerPoint files

I'm using a slightly modified version of the "Extract all text from slides in presentation" example at https://python-pptx.readthedocs.io/en/latest/user/quickstart.html to extract text from some PowerPoint slides.

I'm getting a PackageNotFoundError when I call Presentation() to open some of the PowerPoint files and read their text.

This appears to be because a few of the PowerPoint files are password protected, which I didn't know when I started this project.

I obviously don't expect to be able to read text from a password-protected file, but is there a recommended way of dealing with password-protected PowerPoint files? Having my Python script die every time it runs into one is annoying.

I'd be fine with something that basically went: "Hi. The file you're trying to read may be password-protected. Skipping."

I tried using a try/except block to catch the PackageNotFoundError, but then I got "NameError: name 'PackageNotFoundError' is not defined".

EDIT2: See further below for a working try/except block, thanks to TheGamer007's suggestion.

EDIT1: Here's a minimal case that generates the error:

import pptx
from pptx import Presentation

password_protected_file = r"C:\Users\J69401\Documents\password_protected_file.pptx"

prs = Presentation(password_protected_file)

And here's the error that is generated:

Traceback (most recent call last):
  File "T:/W/Wintermute/50 Sandbox/Pownall/Python/copy files/minimal_case_opening_file.py", line 6, in <module>
    prs = Presentation(password_protected_file)
  File "C:\Anaconda3\lib\site-packages\python_pptx-0.6.18-py3.6.egg\pptx\api.py", line 28, in Presentation
    presentation_part = Package.open(pptx).main_document_part
  File "C:\Anaconda3\lib\site-packages\python_pptx-0.6.18-py3.6.egg\pptx\opc\package.py", line 125, in open
    pkg_reader = PackageReader.from_file(pkg_file)
  File "C:\Anaconda3\lib\site-packages\python_pptx-0.6.18-py3.6.egg\pptx\opc\pkgreader.py", line 33, in from_file
    phys_reader = PhysPkgReader(pkg_file)
  File "C:\Anaconda3\lib\site-packages\python_pptx-0.6.18-py3.6.egg\pptx\opc\phys_pkg.py", line 32, in __new__
    raise PackageNotFoundError("Package not found at '%s'" % pkg_file)
pptx.exc.PackageNotFoundError: Package not found at 'C:\Users\J69401\Documents\password_protected_file.pptx'
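
The traceback fits how encrypted Office files are usually stored: a normal .pptx is a ZIP archive, whereas a password-protected one is typically wrapped in an OLE2 compound file, so python-pptx does not find a package it can open. Below is a minimal sketch for checking which kind of container a file is, using only the standard library. The names OLE2_MAGIC and container_kind are my own, not part of python-pptx, and an "ole2" result is only a hint: a legacy .ppt file renamed to .pptx starts with the same signature.

```python
import zipfile

# Signature at the start of every OLE2 compound file. Encrypted Office
# documents and legacy .ppt/.doc/.xls files both use this container.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def container_kind(path):
    """Return "ole2", "zip", or "unknown" based on the file's contents."""
    with open(path, "rb") as f:
        header = f.read(len(OLE2_MAGIC))
    if header == OLE2_MAGIC:
        # Possibly password-protected (or an old binary-format file).
        return "ole2"
    if zipfile.is_zipfile(path):
        # What an unencrypted .pptx normally looks like.
        return "zip"
    return "unknown"
```

Calling container_kind() on each file before handing it to Presentation() would allow a more specific message than a generic "package not found".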

Here's the minimal case again, but with a working try/except block.

from pptx import Presentation
# The exception class lives in pptx.exc and has to be imported explicitly;
# otherwise the except clause raises NameError.
from pptx.exc import PackageNotFoundError

password_protected_file = r"C:\Users\J69401\Documents\password_protected_file.pptx"

try:
    prs = Presentation(password_protected_file)
except PackageNotFoundError:
    print("PackageNotFoundError generated - possible password-protected file.")
