
How to open a folder with multiple DataFrames in Python and merge them into one CSV file

I have around 700 CSV files, all with exactly the same columns, and I need to merge them all into one CSV file.

This is the data. It is all in one folder, and the file names follow a date pattern, e.g. 07252018 for 07 25 2018.

07252018 = {name: "Carlos", age:"30", height: "15" }

name     age   height
Carlos   30    15



07262018 = {name: "Carlos", age:"30", height: "15" }

name     age   height
Carlos   30    15



And so on, for about 700 CSV files.

What I have done:

  • It works, but it is very manual and needs a lot of typing, since there are 700 CSVs.

import pandas as pd

# Python names cannot start with a digit, hence the df_ prefix.
df_03012018 = pd.read_csv("Data/03012018")
df_03022018 = pd.read_csv("Data/03022018")
df_03032018 = pd.read_csv("Data/03032018")
df_03042018 = pd.read_csv("Data/03042018")
df_03052018 = pd.read_csv("Data/03052018")
# and so on, one line per file

file = pd.concat([df_03012018, df_03022018, df_03032018, df_03042018, df_03052018])

file.to_csv("Data/file")


The expected output is an optimal way to do this quickly, without a lot of typing.

IIUC, this should do:

Option 1:

Less efficient, more readable:

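import os
import pandas as pd

def get_df():
    # Collect one DataFrame per CSV in the current working directory.
    frames=[]
    for file in os.listdir():
        if file.endswith('.csv'):
            frames.append(pd.read_csv(file))
    # DataFrame.append was removed in pandas 2.0, so concatenate once at the end.
    return pd.concat(frames) if frames else pd.DataFrame()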

And then:

df=get_df()
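
The question also asks for a single CSV file at the end, which Option 1 stops short of. A minimal sketch of the full round trip, assuming the files end in .csv and share the same columns; the function name merge_csvs and its two parameters are made up for illustration and are not part of the original answer:

```python
import glob

import pandas as pd


def merge_csvs(pattern, out_path):
    """Concatenate every CSV matching `pattern` and write one CSV to `out_path`."""
    # glob gives no ordering guarantee, so sort to keep the row order stable.
    paths = sorted(glob.glob(pattern))
    if not paths:
        raise FileNotFoundError(f"no files match {pattern!r}")
    # A single pd.concat call avoids re-copying a growing frame for every file.
    merged = pd.concat((pd.read_csv(p) for p in paths), ignore_index=True)
    # index=False keeps the row index out of the output file.
    merged.to_csv(out_path, index=False)
    return merged
```

It could be called as merge_csvs("Data/*.csv", "merged.csv"). Writing the output outside the input folder keeps a second run from picking up the merged file as an input.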

Option 2:

More memory efficient, less readable:

import os
import pandas as pd

def df_generator():
    # Yield one DataFrame per CSV in the current working directory.
    for file in os.listdir():
        if file.endswith('.csv'):
            aux=pd.read_csv(file)
            yield aux

And then:

generator=df_generator()
# DataFrame.append was removed in pandas 2.0; pd.concat accepts the generator directly.
df = pd.concat(generator)
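
If memory is the real concern, note that the combined frame above still ends up fully in memory. One alternative is to append each file to the output as soon as it is read, so only one file is held at a time. A rough sketch under the same assumptions (identical columns, .csv extension); append_csvs and its parameters are hypothetical names:

```python
import os

import pandas as pd


def append_csvs(folder, out_path):
    """Copy each CSV in `folder` into `out_path`, one file at a time."""
    # Resolve the output path so it can be skipped if it lives in `folder`.
    out_abs = os.path.abspath(out_path)
    first = True
    count = 0
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not name.endswith(".csv") or os.path.abspath(path) == out_abs:
            continue
        frame = pd.read_csv(path)
        # The first file overwrites the output and writes the header;
        # later files are appended below it without repeating the header.
        frame.to_csv(out_path, mode="w" if first else "a", header=first, index=False)
        first = False
        count += 1
    return count
```

The return value is the number of input files written. Values still pass through pandas' type inference on the way, so formatting such as leading zeros may change unless dtype=str is passed to pd.read_csv.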

Note: for this to work as is, the script has to be INSIDE the folder with the CSVs. Otherwise, you'll need to add the relative path to that folder from the folder your script is in.

Example: if your script is in the folder "Project" and that folder contains the folder "Tables" with all your CSVs, list that folder instead. Because os.listdir returns bare file names, the folder also has to be prepended when each file is read:

os.listdir('Tables/')
pd.read_csv(os.path.join('Tables', file))
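
A variant that takes the folder as an argument might look like the sketch below. It uses pathlib, whose glob results already include the folder, so no manual joining is needed. The function name read_folder and the source_file column are illustrative additions, not something the original answer proposes:

```python
from pathlib import Path

import pandas as pd


def read_folder(folder):
    """Yield one DataFrame per CSV file found directly inside `folder`."""
    # Path.glob returns paths that already include `folder`, so they can be
    # passed to pd.read_csv from any working directory.
    for path in sorted(Path(folder).glob("*.csv")):
        frame = pd.read_csv(path)
        # Hypothetical extra column: record which file each row came from,
        # e.g. "07252018" for 07252018.csv.
        frame["source_file"] = path.stem
        yield frame
```

With the layout from the example, pd.concat(read_folder("Tables"), ignore_index=True) would build the combined frame, and the source_file column preserves the date encoded in each file name.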

