
Combine CSV files with the same name from different subfolders into one CSV

I have three CSV files for each filename, one in each of three folders. Say there are 20 filenames in total, so there are 20 × 3 = 60 CSV files across the three folders.

Folder A: 1001.CSV, 1002.CSV, 1003.CSV, ...
Folder B: 1001.CSV, 1002.CSV, 1003.CSV
Folder C: 1001.csv, 1002.csv, 1003.csv, ...

I want to get a single CSV file for each of 1001, 1002, 1003, 1004, ..., so 20 CSV files in total.

How can I do this? Since the files are in different folders, glob is not working (or I don't know how to use it).

I made the following assumptions:

  • all the subfolders are rooted at some known directory "parentdir"
  • each subfolder contains only relevant CSV files
  • the CSV files do not contain any header/footer lines
  • each record in the CSV files is separated by a newline
  • all of the records in each file are relevant

This should produce a "concat.csv" file in each subfolder with the contents of all the other files in that same folder. I used a snippet of code from another answer on Stack Overflow for actually concatenating the files.

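import os
import fileinput

rootdir = 'C:\\Users\\myname\\Desktop\\parentdir'
output_name = 'concat.csv'

for child in os.listdir(rootdir):
    path = os.path.join(rootdir, child)
    if not os.path.isdir(path):
        continue  # skip stray files that sit next to the subfolders
    # List the inputs before creating the output, and leave out an
    # output file left over from a previous run.
    filenames = [os.path.join(path, name)
                 for name in sorted(os.listdir(path))
                 if name != output_name]
    if not filenames:
        continue  # fileinput reads stdin when given an empty list
    with open(os.path.join(path, output_name), 'w') as fout, \
            fileinput.input(filenames) as fin:
        for line in fin:
            # Lines already end with a newline; add one only when a
            # file's last line is missing it.
            fout.write(line if line.endswith('\n') else line + '\n')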
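
The snippet above concatenates the files inside each subfolder. For what the question asks (one output per file name, gathered across the subfolders), a minimal sketch could look like the following. It keeps the same assumptions (no header lines, one record per line). The function name `combine_by_name` and the `out_dir` parameter are illustrative, not from the original answer, and names are compared case-insensitively so that 1001.CSV and 1001.csv end up in the same output.

```python
from collections import defaultdict
from pathlib import Path


def combine_by_name(parent, out_dir):
    """Write one combined CSV per file name found in the subfolders of parent.

    Returns a dict mapping each output file name to the number of
    source files that were merged into it.
    """
    parent, out_dir = Path(parent), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group paths such as A/1001.CSV, B/1001.CSV and C/1001.csv
    # under the lower-cased key "1001.csv".
    groups = defaultdict(list)
    for path in sorted(parent.glob('*/*')):
        if path.parent.resolve() == out_dir.resolve():
            continue  # do not re-read outputs from an earlier run
        if path.is_file() and path.suffix.lower() == '.csv':
            groups[path.name.lower()].append(path)

    counts = {}
    for name, paths in groups.items():
        with open(out_dir / name, 'w') as fout:
            for path in paths:
                with open(path) as fin:
                    for line in fin:
                        # Add a newline only if the last line lacks one.
                        fout.write(line if line.endswith('\n') else line + '\n')
        counts[name] = len(paths)
    return counts
```

Calling `combine_by_name('parentdir', 'combined')` would then leave one file per name (1001.csv, 1002.csv, ...) in the `combined` directory. The pattern `'*/*'` is what lets glob reach into every subfolder at once.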
# Second approach: flatten the subfolders into one directory, then
# combine everything with pandas.
import glob
import os
import shutil

import pandas as pd

path = '/mypath/'  # top-level directory that contains the subfolders
output_name = 'combined_csv.csv'

# Step 1: give same-named files unique names so they do not overwrite
# each other when they are copied into one directory in step 3.
count = 1
for root, dirs, files in os.walk(path):
    for name in files:
        if name == 'whatever.csv':
            new_name = 'whatever' + str(count) + '.csv'
            os.rename(os.path.join(root, name), os.path.join(root, new_name))
            count += 1

# Step 2: delete unwanted files.
for root, dirs, files in os.walk(path):
    for name in files:
        if name.startswith('dontwant'):
            os.remove(os.path.join(root, name))

# Step 3: copy every CSV file from the subfolders into the top-level directory.
for root, dirs, files in os.walk(path):
    if os.path.abspath(root) == os.path.abspath(path):
        continue  # already in the destination; copying a file onto itself raises SameFileError
    for name in files:
        if name.endswith('.csv'):
            shutil.copy2(os.path.join(root, name), path)

# Step 4: combine all CSV files in the top-level directory into one file,
# leaving out the output of a previous run.
os.chdir(path)
all_filenames = [f for f in glob.glob('*.csv') if f != output_name]
combined_csv = pd.concat([pd.read_csv(f) for f in all_filenames])
combined_csv.to_csv(output_name, index=False, encoding='utf-8-sig')
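
The pandas code above merges everything into a single file, and pd.read_csv treats the first row of each file as a header by default. If the goal is instead one output per file name with the header written only once, a sketch using only the standard library might look like this. The function name `merge_with_header` and the `out_dir` parameter are hypothetical, and the code assumes that files sharing a name also share an identical header row; it raises an error otherwise.

```python
import csv
from collections import defaultdict
from pathlib import Path


def merge_with_header(parent, out_dir):
    """Merge same-named CSV files found under parent, keeping one header row.

    Returns a dict mapping each output file name to the number of
    data rows (header excluded) written to it.
    """
    parent, out_dir = Path(parent), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_resolved = out_dir.resolve()

    # Walk the whole tree and group files by lower-cased name.
    groups = defaultdict(list)
    for path in sorted(parent.rglob('*')):
        if out_resolved in path.resolve().parents:
            continue  # skip outputs written by an earlier run
        if path.is_file() and path.suffix.lower() == '.csv':
            groups[path.name.lower()].append(path)

    written = {}
    for name, paths in groups.items():
        header = None
        rows = 0
        with open(out_dir / name, 'w', newline='') as fout:
            writer = csv.writer(fout)
            for path in paths:
                with open(path, newline='') as fin:
                    reader = csv.reader(fin)
                    first = next(reader, None)
                    if first is None:
                        continue  # empty file, nothing to copy
                    if header is None:
                        header = first
                        writer.writerow(header)
                    elif first != header:
                        raise ValueError(f'{path} has a different header: {first}')
                    for row in reader:
                        writer.writerow(row)
                        rows += 1
        written[name] = rows
    return written
```

Because files are grouped by name rather than copied into one directory, no renaming step is needed to avoid overwriting same-named files.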

