How to import multiple csv files and concatenate into one DataFrame using pandas

I get the error "No objects to concatenate". I cannot import the .csv files from the main directory and its subdirectories and concatenate them into one DataFrame. I am using pandas. Older answers did not help me, so please do not mark this as a duplicate.

The folder structure looks like this:

main/*.csv
main/name1/name1/*.csv
main/name1/name2/*.csv
main/name2/name1/*.csv
main/name3/*.csv
import pandas as pd
import os
import glob

folder_selected = 'C:/Users/jacob/Documents/csv_files'
  1. Does not work:
frame = pd.concat(map(pd.read_csv, glob.iglob(os.path.join(folder_selected, "/*.csv"))))
  2. Does not work:
csv_paths = glob.glob('*.csv')
dfs = [pd.read_csv(folder_selected) for folder_selected in csv_paths]
df = pd.concat(dfs)
  3. Does not work:
all_files = []

all_files = glob.glob(folder_selected + "/*.csv")

file_path = []
for file in all_files:
    df = pd.read_csv(file, index_col=None, header=0)
    file_path.append(df)

frame = pd.concat(file_path, axis=0, ignore_index=False)

Take a look at the Dask library, which can read many files into one DataFrame:

>>> import dask.dataframe as dd
>>> df = dd.read_csv('data*.csv')

See their docs: https://examples.dask.org/dataframes/01-data-access.html#Read-CSV-files

You need to search the subdirectories recursively.

folder = 'C:/Users/jacob/Documents/csv_files'
path = folder+"/**/*.csv"
  1. Using glob.iglob
df = pd.concat(map(pd.read_csv, glob.iglob(path, recursive=True)))
  2. Using glob.glob
csv_paths = glob.glob(path, recursive=True)
dfs = [pd.read_csv(csv_path) for csv_path in csv_paths]
df = pd.concat(dfs)
  3. Using os.walk
import fnmatch  # needed for fnmatch.filter below

file_paths = []
for base, dirs, files in os.walk(folder):
    for file in fnmatch.filter(files, '*.csv'):
        file_paths.append(os.path.join(base, file))
df = pd.concat([pd.read_csv(file) for file in file_paths])
  4. Using pathlib
from pathlib import Path
files = Path(folder).rglob('*.csv')
df = pd.concat(map(pd.read_csv, files))
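
As a self-contained check of the recursive pattern, here is a minimal sketch that builds a small throwaway folder tree, collects every CSV with a recursive glob, and guards against the empty list that triggers "No objects to concatenate". The helper name read_all_csvs and the folder and file names are made up for illustration; they are not part of the original question.

```python
import glob
import os
import tempfile

import pandas as pd


def read_all_csvs(folder):
    """Read every .csv under `folder`, including subfolders, into one DataFrame."""
    # With recursive=True, "**" matches zero or more directory levels,
    # so files directly inside `folder` are matched as well.
    pattern = os.path.join(folder, "**", "*.csv")
    csv_paths = sorted(glob.glob(pattern, recursive=True))
    if not csv_paths:
        # pd.concat raises "No objects to concatenate" on an empty list,
        # so fail early with a message that names the pattern instead.
        raise FileNotFoundError(f"no CSV files matched {pattern}")
    return pd.concat([pd.read_csv(p) for p in csv_paths], ignore_index=True)


# Hypothetical layout, created only for this demo.
with tempfile.TemporaryDirectory() as main:
    nested = os.path.join(main, "name1", "name2")
    os.makedirs(nested)
    pd.DataFrame({"x": [1, 2]}).to_csv(os.path.join(main, "a.csv"), index=False)
    pd.DataFrame({"x": [3]}).to_csv(os.path.join(nested, "b.csv"), index=False)
    df = read_all_csvs(main)

print(df)
```

If the helper raises FileNotFoundError, the pattern matched nothing, which usually points to a wrong base folder or a missing recursive pattern rather than a pandas problem.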

Python's pathlib is a good tool for this kind of task:

from pathlib import Path

import pandas as pd

FOLDER_SELECTED = 'C:/Users/jacob/Documents/csv_files'

path = Path(FOLDER_SELECTED) / "main"

# grab all CSVs in main and its subfolders; pass the full path (f),
# not f.name, which is only the file name and fails for nested files
df = pd.concat([pd.read_csv(f) for f in path.rglob("*.csv")])

Note:

If the CSVs need preprocessing, you can write your own read_csv wrapper function that handles the issues and use it in place of pd.read_csv.
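
A minimal sketch of such a wrapper is below. The function name read_csv_clean, the header clean-up, and the extra source_file column are assumptions chosen for illustration; the right preprocessing depends on your files.

```python
import tempfile
from pathlib import Path

import pandas as pd


def read_csv_clean(path):
    """Hypothetical wrapper: read one CSV, tidy its headers, tag the source file."""
    df = pd.read_csv(path)
    # Normalise header names so files with " X " and "x" line up on concat.
    df.columns = [str(col).strip().lower() for col in df.columns]
    # Record which file each row came from.
    df["source_file"] = Path(path).name
    return df


# Throwaway demo files with inconsistent headers.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.csv").write_text(" X \n1\n2\n")
    Path(tmp, "b.csv").write_text("x\n3\n")
    files = sorted(Path(tmp).rglob("*.csv"))
    df = pd.concat([read_csv_clean(f) for f in files], ignore_index=True)

print(df)
```

The wrapper takes the place of pd.read_csv in the list comprehension, so the rest of the concatenation code stays the same.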


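Notice: Technical posts on this site are licensed under CC BY-SA 4.0. If you republish this content, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.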
