How do I loop through file paths with glob.glob to create multiple files at once?

Question

I have 10 different folder paths that I run this code through. Instead of changing the path manually each time, I am trying to write a function that loops through the folder paths to save time. Also, can you show me a way to stop this glob.glob code from duplicating rows in the output file? For example, if I run this code once, it creates one combined file from the files in the folder. If I run it twice (by accident), it duplicates the rows in the CSV: the combined CSV has 100 rows after one run, and after a second run it has 200 rows, with every row duplicated. I want the code to overwrite the previous file without duplicating rows, because I store this file on a server.

So I have 10 copies of this code written out, each pointing to a separate file location. Instead of running them separately, I want to loop the locations through this code to create multiple files at once.

import glob
import os

import pandas as pd

# Change the working directory to the folder that holds the CSV files
os.chdir("C:/Users/File.csv")

extension = 'csv'
all_filenames = glob.glob('*.{}'.format(extension))

# Use pandas to combine all files in the list
combined_csv = pd.concat([pd.read_csv(f) for f in all_filenames])
# Export to CSV
combined_csv.to_csv("File.csv", index=False, encoding='utf-8')

Answer 1

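You should skip File.csv when processing the list so that you don't append it to itself. Because the combined File.csv is written to the same folder, the *.csv pattern matches it on the next run, which is where the duplicated rows come from.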

import os

combined_csv = pd.concat([pd.read_csv(f) for f in all_filenames if os.path.basename(f) != 'File.csv' ])
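
To cover the looping part of the question, here is a minimal sketch that applies the same filter to a list of folders. The function names combine_folder and combine_folders, the output_name parameter, and the idea of writing the combined file back into each source folder are assumptions for illustration, not something stated in the question.

```python
import glob
import os

import pandas as pd


def combine_folder(folder, output_name="File.csv"):
    """Combine every CSV in `folder` into `folder/output_name`.

    Returns the output path, or None if there was nothing to combine.
    """
    pattern = os.path.join(folder, "*.csv")
    # Skip the combined file itself so a second run does not re-read it.
    sources = sorted(
        f for f in glob.glob(pattern) if os.path.basename(f) != output_name
    )
    if not sources:
        return None
    combined = pd.concat([pd.read_csv(f) for f in sources], ignore_index=True)
    output_path = os.path.join(folder, output_name)
    # to_csv writes in mode "w" by default, so an older copy is replaced.
    combined.to_csv(output_path, index=False, encoding="utf-8")
    return output_path


def combine_folders(folders, output_name="File.csv"):
    """Run combine_folder on each folder and return the files written."""
    written = []
    for folder in folders:
        path = combine_folder(folder, output_name)
        if path is not None:
            written.append(path)
    return written
```

Calling combine_folders with a list of the 10 folder paths should then produce one combined file per folder, and running it again should leave the row counts unchanged. Building the pattern with os.path.join also avoids the need for os.chdir.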

Answer 2

I would use os.walk(), which walks the directory tree under source_dir and picks up the CSV files in every subfolder.

import os
import pandas as pd

source_dir = r'C:\Users\Documents\folder' # change to dir of choice

my_list = []

for root, dirnames, filenames in os.walk(source_dir):
    for f in filenames:
        if f.endswith('.csv'):
            my_list.append(pd.read_csv(os.path.join(root, f)))

concatted_df = pd.concat(my_list)
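
The snippet above gathers everything into a single DataFrame and does not save it. If the goal is one combined file per folder, a possible variant is sketched here. The function name combine_tree, the output_name parameter, and the choice to write the result into each visited folder are assumptions, not part of the original answer.

```python
import os

import pandas as pd


def combine_tree(source_dir, output_name="File.csv"):
    """Write one combined CSV into each folder under source_dir that has CSVs."""
    written = []
    for root, dirnames, filenames in os.walk(source_dir):
        # Ignore earlier output so that rerunning does not double the rows.
        sources = sorted(
            f for f in filenames
            if f.lower().endswith(".csv") and f != output_name
        )
        if not sources:
            continue
        frames = [pd.read_csv(os.path.join(root, f)) for f in sources]
        output_path = os.path.join(root, output_name)
        # Default write mode replaces any previous combined file.
        pd.concat(frames, ignore_index=True).to_csv(output_path, index=False)
        written.append(output_path)
    return written
```

Compared with passing an explicit list of folders, this needs only the common parent directory, at the cost of also visiting subfolders you may not have intended to process.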

2 answers

Answer 1
1 2022-02-03 00:45:02

Answer 2
0 2022-02-03 00:43:00
