How to open multiple PDF files using pdfplumber in Python?

I have a folder full of PDF files. I need to iterate through each of the files and apply the condition given below to each one. This is not possible with pandas. Is there a way to iterate through each file using pdfplumber?

import os
import glob
import shutil
import pandas as pd
import plotly.express as px
import xlrd
import matplotlib.pyplot as plt
%matplotlib inline
import time
from datetime import datetime
from pytz import timezone
import numpy as np
from tabula import read_pdf
import tabula
import requests
import pdfplumber

glob.glob("C:/Users/Dreamer/Desktop/test_run/machine/*.pdf")



#THIS IS THE CONDITION I WANT TO IMPLEMENT IN EACH FILE

with pdfplumber.open(path) as pdf:
    page = pdf.pages[0]
    text = page.extract_text()

for row in text.split('\n'):
    if row.startswith('Raumtemperatur '):
        jobstart = row.split()[-2]
        jobend   = row.split()[-1]
print("jobstart", jobstart)
print("jobend", jobend)

Looking forward to a solution or an alternative :)

IIUC, you can iterate over the paths returned by glob and open each one with pdfplumber:

import glob
import pdfplumber
from tqdm.auto import tqdm

for current_pdf_file in tqdm(glob.glob("C:/Users/Dreamer/Desktop/test_run/machine/*.pdf")):
    with pdfplumber.open(current_pdf_file) as my_pdf:
        # apply the per-file condition from the question here
        text = my_pdf.pages[0].extract_text()
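
To make this a bit more concrete, here is a rough sketch of how the condition from the question could be applied to every file. The helper names parse_job_times and collect_job_times are my own and purely illustrative, and I am assuming the row of interest always starts with 'Raumtemperatur ' and ends with the two values, as in the question. Rows with fewer than three tokens are skipped, which is an extra guard that the original snippet does not have.

```python
import glob


def parse_job_times(text):
    """Return (jobstart, jobend) from the 'Raumtemperatur ' row, or None if absent."""
    # Treat a missing text layer (None or empty string) as "no rows".
    for row in (text or "").split("\n"):
        if row.startswith("Raumtemperatur "):
            parts = row.split()
            # Assumption: the last two tokens are the start and end values.
            if len(parts) >= 3:
                return parts[-2], parts[-1]
    return None


def collect_job_times(pattern):
    """Map each PDF path matching `pattern` to its parsed job times (or None)."""
    # Imported here so parse_job_times can be used without pdfplumber installed.
    import pdfplumber

    results = {}
    for pdf_path in sorted(glob.glob(pattern)):
        with pdfplumber.open(pdf_path) as pdf:
            # Only the first page is read, as in the question.
            text = pdf.pages[0].extract_text()
        results[pdf_path] = parse_job_times(text)
    return results
```

Calling collect_job_times("C:/Users/Dreamer/Desktop/test_run/machine/*.pdf") should then give a dict keyed by file path. Files where the row is not found map to None, so they are easy to spot afterwards.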
