
PyPDF2 merger function and startswith

First-time coder here. I'm trying to create a program to help automate some of my office work using Python.

What I'm trying to do is merge a PDF file from Folder 1 with another PDF file of the same name from Folder 2. I would also like to use a Tkinter GUI.

This is what I have so far:

from tkinter import *
from PyPDF2 import PdfFileMerger

root = Tk()

  
# Creating a Label Widget
MainLabel = Label(root, text="PDF Rawat Jalan")
# Shoving it onto the screen
MainLabel.pack()

#Prompt Kode
KodeLabel = Label(root, text="Masukan Kode")
KodeLabel.pack()

#Input Kode

kode = Entry(root, bg="gray",)
kode.pack()


#function of Merge Button
def mergerclick():
    kode1 = kode.get()
    pdflocation_1 = "C:\\Users\\User\\Desktop\\PDF\\Folder 1\\1_"+kode1+".pdf"
    pdflocation_2 = "C:\\Users\\User\\Desktop\\PDF\\Folder 2\\2_"+kode1+".pdf"
    Output = "C:\\Users\\User\\Desktop\\PDF\\output\\"+kode1+".pdf"
    merger = PdfFileMerger()

    merger.append(pdflocation_1)
    merger.append(pdflocation_2)

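    # Write through a context manager so the output file is always closed
    with open(Output, 'wb') as output_file:
        merger.write(output_file)
    merger.close()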
    confirmation = kode1 +" merged"
    testlabel = Label(root, text=confirmation)
    testlabel.pack()



#Merge Button
mergerButton = Button(root, text= "Merge", command=mergerclick)
mergerButton.pack()

root.mainloop()

Now there is a third file I'm supposed to append, but its file name contains a date. For example: file 1 (010.pdf); file 2 (010.pdf); file 3 (010_2020_10_05).

There are about 9,000 files per folder. How am I supposed to do this?

I think what you need is a way to find the files whose names start with a particular string. Based on the date suffix, I'm guessing the file names may not be unique, so I'm writing this to find all matches. Something like this will do that:

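import pathlib

def find_prefix_matches(prefix, directory_name):
  # Return the full path of every file in directory_name whose name starts with prefix
  dir_path = pathlib.Path(directory_name)
  # Compare against f_name.name: str(f_name) would include the directory part
  return [str(f_name) for f_name in dir_path.iterdir()
      if f_name.name.startswith(prefix)]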

If you are just learning to write code, this example is relatively simple. However, it is not efficient if you need to match 9,000 files, because it lists the directory again on every call. To make it run faster, you'll want to load the file list once instead of once per request.

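import pathlib

def find_prefix_matches(prefix, file_list):
  return [f for f in file_list if f.startswith(prefix)]

# directory_name and your_list_of_files_to_append are placeholders for your own values
dir_path = pathlib.Path(directory_name)
# List the directory once and keep only the bare file names
file_list = [f_name.name for f_name in dir_path.iterdir()]
for file_name_prefix in your_list_of_files_to_append:
  file_matches = find_prefix_matches(file_name_prefix, file_list)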
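
If you end up looking up many codes, even one scan of the list per code adds up. One possible refinement is to group the file names by code once and then look each code up in a dictionary. The sketch below is only an illustration under some assumptions that are not in the question: the dated files are PDFs, the code is the part of the file name before the first underscore (so 010_2020_10_05.pdf belongs to code 010), and build_code_index is a made-up helper name.

```python
import pathlib
from collections import defaultdict


def build_code_index(directory_name):
    """Map each code to the PDF paths in directory_name that belong to it.

    Assumption: the code is the file-name stem up to the first underscore.
    """
    index = defaultdict(list)
    for path in pathlib.Path(directory_name).iterdir():
        # Skip anything that is not a PDF
        if path.suffix.lower() != ".pdf":
            continue
        # "010_2020_10_05" -> "010"; a stem without "_" is used unchanged
        code = path.stem.split("_", 1)[0]
        index[code].append(path)
    # Sort each group so the result does not depend on directory order
    return {code: sorted(paths) for code, paths in index.items()}
```

Build the index once when the program starts, then call index.get(kode1, []) inside mergerclick and pass each returned path, converted with str(), to merger.append. An empty list means no dated file exists for that code.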
