
How to add files from multiple zip files into a single zip file

I want to put the files from multiple zip files whose names share a common substring into a single zip file.

I have a folder "temp" containing some .zip files and some other files:

filename1_160645.zip
filename1_165056.zip
filename1_195326.zip
filename2_120528.zip
filename2_125518.zip
filename3_171518.zip
test.xlsx
filename19_161518.zip

I have the following dataframe df_filenames containing the filename prefixes:

filename_prefix

filename1
filename2
filename3

If there are multiple .zip files in the temp folder with the same prefix, and that prefix exists in the dataframe df_filenames, I want to merge the contents of those files.

For example, filename1_160645.zip contains the following:

1a.csv
1b.csv

filename1_165056.zip contains the following:

1d.csv

and filename1_195326.zip contains the following:

1f.csv

After merging the contents of the latter two files into filename1_160645.zip, the contents of filename1_160645.zip will be:

1a.csv
1b.csv
1d.csv
1f.csv

At the end, only the following files will remain in the temp folder:

filename1_160645.zip
filename2_120528.zip
filename3_171518.zip
test.xlsx
filename19_161518.zip

I have written the following code, but it's not working:


import os
import zipfile as zf
import pandas as pd

df_filenames=pd.read_excel('filename_prefix.xlsx')
#Get the list of all the filenames in the temp folder
lst_fnames=os.listdir(r'C:\Users\XYZ\Downloads\temp')
#take only .zip files
lst_fnames=[fname for fname in lst_fnames if fname.endswith('.zip')]

#take distinct prefixes in the dataframe
df_prefixes=df_filenames['filename_prefix'].unique()

for prefix in df_prefixes:
    #this list will contain zip files with the same prefixes
    lst=[]

    #total count of files in the lst
    count=0
    for fname in lst_fnames:
        if prefix in fname:
            #print(prefix)
            lst.append(fname)
            #print(lst)
    #if the list has more than 1 zip files,merge them
    if len(lst)>1:
        print(lst)
        with zf.ZipFile(lst[0], 'a') as f1:
            print(f1.filename)
            for f in lst[1:]:

                with zf.ZipFile(path+'\\'+f, 'r') as f:
                    print(f.filename) #getting entire path of the file here,not just filename
                    [f1.writestr(t[0], t[1].read()) for t in ((n, f.open(n)) for n in f.namelist())]
                    print(f1.namelist())

After merging the contents of the files whose names contain filename1 into filename1_160645.zip, the contents of filename1_160645.zip should be:

1a.csv
1b.csv
1d.csv
1f.csv

But nothing has changed when I double-click filename1_160645.zip. Basically, 1a.csv, 1b.csv, 1d.csv, and 1f.csv are not all part of filename1_160645.zip.

I would use shutil for a higher-level way of dealing with archive files. Additionally, pathlib gives nice methods and attributes for a given file path. Combined with itertools.groupby, we can easily pick out the target files that are related to each other.

import itertools
import shutil
from pathlib import Path
import pandas as pd

filenames = pd.read_excel('filename_prefix.xlsx')
prefixes = filenames['filename_prefix'].unique()

path = Path.cwd()  # or change to Path('path/to/desired/dir/')
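zip_files = (file for file in path.iterdir() if file.suffix == '.zip')
# Match on the exact prefix (the part before the first '_') so that
# 'filename1' does not also pick up 'filename19_...'.
wanted = set(prefixes)
target_files = sorted(file for file in zip_files
                      if file.stem.split('_')[0] in wanted)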

file_groups = itertools.groupby(target_files, key=lambda x: x.stem.split('_')[0])
for _, group in file_groups:
    first, *rest = group
    if not rest:
        continue

    temp_dir = path / first.stem
    temp_dir.mkdir()

    shutil.unpack_archive(first, extract_dir=temp_dir)
    for item in rest:
        shutil.unpack_archive(item, extract_dir=temp_dir)
        item.unlink()

    shutil.make_archive(temp_dir, 'zip', temp_dir)
    shutil.rmtree(temp_dir)
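
As an alternative to unpacking everything to a temporary directory, the members can be copied from one archive to another with zipfile alone. The sketch below is only an illustration under a few assumptions: merge_zips is a hypothetical helper name, both arguments are full paths (the code in the question opens lst[0] without the folder path, so mode 'a' would most likely create a new archive in the current working directory instead of updating the one in temp, and `path` is never defined), and a member that already exists in the target is kept rather than overwritten.

```python
import zipfile
from pathlib import Path
from typing import Iterable, List


def merge_zips(target: Path, sources: Iterable[Path]) -> List[str]:
    """Append the members of each source archive to target, then delete the sources.

    merge_zips is a hypothetical helper, not part of any library.
    Returns the member names of target after merging.
    """
    # Mode 'a' appends to an existing archive; target must be a full path.
    with zipfile.ZipFile(target, "a", compression=zipfile.ZIP_DEFLATED) as dest:
        existing = set(dest.namelist())
        for source in sources:
            with zipfile.ZipFile(source, "r") as src:
                for name in src.namelist():
                    if name in existing:
                        # Assumption: keep the copy already in target.
                        continue
                    dest.writestr(name, src.read(name))
                    existing.add(name)
            # The source archive is closed at this point, so it can be removed.
            source.unlink()
        return dest.namelist()
```

Inside the loop from the answer above, this could be called as merge_zips(first, rest) in place of the unpack, make_archive, and rmtree steps.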
