Iterate through specific files zipped in a folder and move them based on text/string found in Python

I have multiple zipped files, and I need to identify a string within a specific .html file in each one. All of the .html files that I need to read have names ending with the 7 characters 'bb.html'.

My goal is to move the whole .zip file if the HTML inside contains the string/word.

I have written this code, which works on the single file that is listed, but I need to iterate through thousands of zipped files. It doesn't have to be written as a function.

    import os
    import shutil
    import zipfile

    def check_files():
        os.listdir(source_folder)
        zip = zipfile.ZipFile(source_file3)
        file = zip.read("bb.html")
        if b'word' in file:
            shutil.copy(source_file3, source_folder2)
            print('word found-file moved')
        else:
            print('word not found')

Most of the help I find iterates over the files inside an archive; I need to iterate over ALL the .zip files and read only the bb.html file in each.

I am new to Python, so that is a challenge as well.

Thanks in advance.

Thanks so much for the answers! Final code:


    import os
    import shutil
    import zipfile

    source_file3 = ('C:/Users/SMITH/Desktop/zipped/Message/testzip.zip')
    source_folder3 = (r'J:/server/zippedMessages')
    dest_folder = ('L:/_Mine/Zipped Messages Moved')


    def check_files():
        # Reads the module-level source_file3, which the loop below reassigns for each archive.
        zip = zipfile.ZipFile(source_file3)
        file = zip.read("bb.html")
        if b'Health' in file:
            shutil.copy(source_file3, dest_folder)
            print('word found-file moved')
        else:
            print('word not found')



    folderdir = source_folder3

    for filename in os.listdir(folderdir):
        if filename.endswith(".zip"):
            source_file3 = os.path.join(folderdir, filename)
            check_files()  # opens source_file3 itself, so no ZipFile is needed here

There should be many examples of how to iterate over files in a folder (not inside a ZIP file).

You should use a for-loop with os.listdir() or glob.glob():

    for filename in os.listdir(source_folder):
        if filename.endswith(".zip"):
            source_file3 = os.path.join(source_folder, filename)

            zip = zipfile.ZipFile(source_file3)

            # ... code ...

or

    for source_file3 in glob.glob(f'{source_folder}/*.zip'):

        zip = zipfile.ZipFile(source_file3)

        # ... code ...
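
One way the `# ... code ...` placeholder could be filled in is sketched below. It assumes every archive stores the page under the exact name passed in (for example "bb.html"), and the helper names `member_contains` and `matching_archives` are made up for this illustration. The search word has to be given as bytes, because `ZipFile.read()` returns bytes.

```python
import glob
import os
import zipfile


def member_contains(zip_path, member, needle):
    """Return True if `member` inside the archive contains the bytes `needle`."""
    try:
        # The with-block closes the archive even if reading fails.
        with zipfile.ZipFile(zip_path) as archive:
            data = archive.read(member)
    except KeyError:
        # ZipFile.read raises KeyError when the archive has no such member.
        return False
    except zipfile.BadZipFile:
        # The file has a .zip extension but is not a valid ZIP archive.
        return False
    return needle in data


def matching_archives(source_folder, member, needle):
    """List the .zip paths in `source_folder` whose `member` contains `needle`."""
    pattern = os.path.join(source_folder, "*.zip")
    return [path for path in sorted(glob.glob(pattern))
            if member_contains(path, member, needle)]
```

With the names from the question, `matching_archives(source_folder3, "bb.html", b"Health")` would return the archives to copy or move. Treating a missing member or a corrupt archive as "no match" is a design choice for a long batch run; logging those paths instead may be preferable.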

EDIT:

If you need to iterate over the files inside a ZIP, use ZipFile.namelist() or ZipFile.infolist():

    zip = zipfile.ZipFile(source_file3)

    for inner_filename in zip.namelist():

        file = zip.read(inner_filename)

        # ... code ...

or

    zip = zipfile.ZipFile(source_file3)

    for inner_fileobject in zip.infolist():

        file = zip.read(inner_fileobject)

        # ... code ...
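
Putting the pieces together for the question as asked, here is a rough sketch that uses `namelist()` to pick out members whose names end with 'bb.html' and moves the whole archive when one of them contains the word. The function names `archive_has_word` and `move_matching` are hypothetical. It also uses `shutil.move` where the question's code uses `shutil.copy`; swap that back if a copy is what you want.

```python
import os
import shutil
import zipfile


def archive_has_word(zip_path, needle, suffix="bb.html"):
    """Return True if any member whose name ends with `suffix` contains `needle`."""
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # archive.read returns bytes, so `needle` must be bytes as well.
            if name.endswith(suffix) and needle in archive.read(name):
                return True
    return False


def move_matching(source_folder, dest_folder, needle, suffix="bb.html"):
    """Move every .zip in `source_folder` that contains `needle`; return moved names."""
    os.makedirs(dest_folder, exist_ok=True)
    moved = []
    for filename in sorted(os.listdir(source_folder)):
        if not filename.endswith(".zip"):
            continue
        zip_path = os.path.join(source_folder, filename)
        # archive_has_word closes the archive before returning, so the file
        # is no longer open when it gets moved (an open handle can block a
        # move on Windows).
        if archive_has_word(zip_path, needle, suffix):
            shutil.move(zip_path, os.path.join(dest_folder, filename))
            moved.append(filename)
    return moved
```

For example, `move_matching(source_folder3, dest_folder, b'Health')` would process the folder from the question. Only the matching members are read, and nothing is extracted to disk.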
