How to read a directory and all files inside it in Python
I am still new to Python and I am making a simple application that extracts text from ppt files.
I have this project structure.
> Project_Python
>> Files
>>> Class A
- History.ppt
>>> Class B
- Animals.ppt
>> Result
???
- main.py
My question is: how can I read the files inside the sub-folders Class A and Class B?
I also want it to automatically recreate the folder structure of Files inside Result after printing.
This is what I've tried:
from pptx import Presentation
import glob
import pathlib
import os

p_temp = pathlib.Path('Files')  # How can I read the sub-folders dynamically?
for eachfile in glob.glob("**/*.pptx"):
    prs = Presentation(eachfile)
    print(eachfile)
    print("----------------------")
    textdata = []
    for slide in prs.slides:
        for shape in slide.shapes:
            if hasattr(shape, "text"):
                textdata.append(shape.text)
    # How can I create the same folder structure as Files here?
    print(''.join(textdata[1:]), file=open("Result/" + eachfile + ".txt", "a"))
Your code is almost correct, except for how it uses glob.glob: you should also pass the recursive=True parameter so that ** matches sub-folders at any depth.
To create a directory together with its sub-directories, you can use os.makedirs.
from pptx import Presentation
import glob
import pathlib
import os

p_temp = pathlib.Path('Files')  # root folder that holds the class sub-folders
# "**" only descends into nested sub-folders when recursive=True is passed
for eachfile in glob.glob(str(p_temp / "**/*.pptx"), recursive=True):
    prs = Presentation(eachfile)
    print(eachfile)
    print("----------------------")
    textdata = []
    for slide in prs.slides:
        for shape in slide.shapes:
            if hasattr(shape, "text"):
                textdata.append(shape.text)
    # Mirror the sub-folder under Result, e.g. Files/Class A -> Result/Class A
    out_dir = pathlib.Path('Result') / pathlib.Path(eachfile).parent.relative_to(p_temp)
    os.makedirs(out_dir, exist_ok=True)  # exist_ok avoids an error on later files
    with open(out_dir / (pathlib.Path(eachfile).name + ".txt"), "a") as out:
        print(''.join(textdata[1:]), file=out)
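The folder mirroring described above can also be sketched with pathlib alone, which may be easier to test separately from the text extraction. This is only a minimal sketch under some assumptions: the helper names mirrored_text_path and plan_outputs are hypothetical (they do not come from the question or from any library), and the output file is named History.txt rather than History.pptx.txt.

```python
from pathlib import Path


def mirrored_text_path(src_file: Path, src_root: Path, dst_root: Path) -> Path:
    """Map src_root/sub/name.pptx to dst_root/sub/name.txt (hypothetical helper)."""
    relative = src_file.relative_to(src_root)  # e.g. "Class A/History.pptx"
    return (dst_root / relative).with_suffix(".txt")


def plan_outputs(src_root: Path, dst_root: Path, pattern: str = "*.pptx") -> dict:
    """Find matching files at any depth and create their mirrored folders.

    Returns a dict mapping each source file to its planned output file.
    Only the folders are created here; the output files are not written.
    """
    plan = {}
    for src_file in sorted(src_root.rglob(pattern)):  # rglob searches recursively
        target = mirrored_text_path(src_file, src_root, dst_root)
        target.parent.mkdir(parents=True, exist_ok=True)
        plan[src_file] = target
    return plan
```

With the layout from the question, plan_outputs(Path("Files"), Path("Result")) would create Result/Class A and Result/Class B, and each returned target could then be opened for writing with the text collected from the slides. Note that python-pptx reads the .pptx format, so legacy .ppt files would need to be converted first.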