Python: Loop to open multiple folders and files in Python

I am new to Python and currently work on data analysis.

I am trying to open multiple folders in a loop and read all the files in each folder. For example, the working directory contains 10 folders that need to be opened, and each folder contains 10 files.

My code to open the .txt files in one folder:

file_open = glob.glob("home/....../folder1/*.txt")

I want to open folder 1 and read all its files, then go to folder 2 and read all its files, and so on through folder 10. Can anyone help me write a loop that opens each folder, and tell me which libraries I need?

I have a background in R. For example, in R I could write a loop to open folders and files using the code below.

folder_open <- dir("......./main/")
for (n in seq_along(folder_open)) {
    file_open <- dir(paste0("......./main/", folder_open[n]))

    for (k in seq_along(file_open)) {
        lines <- readLines(paste0("...../main/", folder_open[n], "/", file_open[k]))
        # Finally I can read all folders and files.
    }
}

This recursive function scans all directories within a given directory and prints the names of the .txt files. Feel free to build on it.

import os

def scan_folder(parent):
    # iterate over all the files in directory 'parent'
    for file_name in os.listdir(parent):
        if file_name.endswith(".txt"):
            # if it's a txt file, print its name (or do whatever you want)
            print(file_name)
        else:
            current_path = os.path.join(parent, file_name)
            if os.path.isdir(current_path):
                # if we're checking a sub-directory, recursively call this method
                scan_folder(current_path)

scan_folder("/example/path")  # Insert the parent directory's path
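
If you want to read the files rather than just print their names, the same recursion can yield full paths instead. This is only a sketch of one way to take it forward: `iter_txt_files` and `read_all` are names I made up, and it assumes the files are plain text that can be opened with the default encoding.

```python
import os


def iter_txt_files(parent):
    """Yield the full path of every .txt file under 'parent', recursively."""
    for name in sorted(os.listdir(parent)):
        path = os.path.join(parent, name)
        if os.path.isdir(path):
            # descend into the sub-directory
            yield from iter_txt_files(path)
        elif name.endswith(".txt"):
            yield path


def read_all(parent):
    """Return {path: list of lines} for every .txt file under 'parent'."""
    contents = {}
    for path in iter_txt_files(parent):
        with open(path) as handle:
            contents[path] = handle.read().splitlines()
    return contents
```

Calling `read_all("/example/path")` would then give you one entry per text file, keyed by its full path.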

Given the following folder/file tree:

C:.
├───folder1
│       file1.txt
│       file2.txt
│       file3.csv
│
└───folder2
        file4.txt
        file5.txt
        file6.csv

The following code will recursively locate all .txt files in the tree:

import os
import fnmatch

for path,dirs,files in os.walk('.'):
    for file in files:
        if fnmatch.fnmatch(file,'*.txt'):
            fullname = os.path.join(path,file)
            print(fullname)

Output:

.\folder1\file1.txt
.\folder1\file2.txt
.\folder2\file4.txt
.\folder2\file5.txt
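
Since the question is about going folder by folder, it may help to group the matches by the folder they were found in. Here is a rough sketch along those lines; `txt_files_by_folder` is a hypothetical helper name, not something from the standard library.

```python
import fnmatch
import os


def txt_files_by_folder(root):
    """Return {folder path: sorted list of .txt paths found directly in it}."""
    grouped = {}
    for path, dirs, files in os.walk(root):
        dirs.sort()  # visit sub-directories in a predictable order
        matches = sorted(fnmatch.filter(files, "*.txt"))
        if matches:
            grouped[path] = [os.path.join(path, name) for name in matches]
    return grouped
```

With the tree above, `txt_files_by_folder('.')` should produce one entry for folder1 and one for folder2, each listing its two .txt files.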

Your glob() pattern is almost correct. Try one of these:

file_open = glob.glob("home/....../*/*.txt")
file_open = glob.glob("home/....../folder*/*.txt")

The first one matches every text file in any first-level subdirectory of home/...... , whatever that is. The second limits itself to subdirectories whose names start with "folder", such as "folder1", "folder2", etc.
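
To make the difference between the two patterns concrete, here is a small sketch that wraps each one in a function. The `base` argument stands in for your elided home/...... path, and both function names are my own invention. `glob.escape` is only there in case the base path itself contains characters like `[` or `*`.

```python
import glob
import os


def txt_in_any_subfolder(base):
    """Match *.txt files in every first-level subdirectory of 'base'."""
    pattern = os.path.join(glob.escape(base), "*", "*.txt")
    return sorted(glob.glob(pattern))


def txt_in_numbered_folders(base):
    """Match *.txt files only in subdirectories whose names start with 'folder'."""
    pattern = os.path.join(glob.escape(base), "folder*", "*.txt")
    return sorted(glob.glob(pattern))
```

Neither pattern picks up text files sitting directly in `base`, nor files nested more than one level down.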

I don't speak R, but this might translate your code:

import glob

for filename in glob.glob("......../main/*/*.txt"):
    with open(filename) as file_handle:
        for line in file_handle:
            # process each line of text here
            pass

I think a nice way to do that is to use os.walk. It walks the directory tree and yields one tuple per directory (the directory path, its subdirectory names, and its file names), which you can then iterate over.

import os
directory = './'
for d in os.walk(directory):
    print(d)
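
Each item that os.walk produces is a (dirpath, dirnames, filenames) tuple, and in the default top-down mode you can edit dirnames in place to control where the walk goes next. As a sketch, assuming you only care about folders one level below the starting directory as in the question, something like this could work. `walk_to_depth` and `max_depth` are names I picked for illustration.

```python
import os


def walk_to_depth(directory, max_depth=1):
    """Yield (dirpath, filenames) for 'directory' and folders up to max_depth below it."""
    for dirpath, dirnames, filenames in os.walk(directory):
        dirnames.sort()  # predictable visiting order
        rel = os.path.relpath(dirpath, directory)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        if depth >= max_depth:
            # emptying the list in place stops os.walk from descending further
            dirnames[:] = []
        yield dirpath, sorted(filenames)
```

You would loop over it the same way, for example `for folder, names in walk_to_depth('./'):`.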

This code looks for all directories inside a directory and prints the names of all files found in each one:

#--------*---------*---------*---------*---------*---------*---------*---------*
# Desc: print filenames one level down from starting folder
#--------*---------*---------*---------*---------*---------*---------*---------*

import os, fnmatch, sys

def find_dirs(directory, pattern):
    for item in os.listdir(directory):
        if os.path.isdir(os.path.join(directory, item)):
            if fnmatch.fnmatch(item, pattern):
                filename = os.path.join(directory, item)
                yield filename


def find_files(directory, pattern):
    for item in os.listdir(directory):
        if os.path.isfile(os.path.join(directory, item)):
            if fnmatch.fnmatch(item, pattern):
                filename = os.path.join(directory, item)
                yield filename



#--------*---------*---------*---------*---------*---------*---------*---------#
while True:#                       M A I N L I N E                             #
#--------*---------*---------*---------*---------*---------*---------*---------#
#                                  # Set directory
    os.chdir("C:\\Users\\Mike\\Desktop")

    for filedir in find_dirs('.', '*'):
        print ('Got directory:', filedir)
        for filename in find_files(filedir, '*'):
            print (filename)

    sys.exit() # END PROGRAM      

pathlib is a good choice.

from pathlib import Path

# or use: glob('**/*.txt')
for txt_path in [_ for _ in Path('demo/test_dir').rglob('*.txt') if _.is_file()]:
    print(txt_path.absolute())
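
To actually read the files, `Path.read_text` saves you the open/close boilerplate. A minimal sketch, assuming the files are UTF-8 text (that encoding is my assumption, adjust it to your data); `read_txt_tree` is a made-up name.

```python
from pathlib import Path


def read_txt_tree(root):
    """Return {path relative to 'root': file text} for every .txt file below 'root'."""
    root = Path(root)
    texts = {}
    for txt_path in sorted(root.rglob("*.txt")):
        if txt_path.is_file():
            key = txt_path.relative_to(root).as_posix()
            texts[key] = txt_path.read_text(encoding="utf-8")
    return texts
```

The keys look like `folder1/file1.txt`, so you can still tell which folder each file came from.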
