Read multiple .csv files from a folder with a variable part of the file name
I have a folder that contains a variable number of files, and each file has a variable string in its name. For example:
my_file V1.csv
my_file V2.csv
my_file something_else.csv
I need to read all of these files and combine them into a single DataFrame.
Right now I am doing it with an individual pd.read_csv call for each file, and then concatenating the results.
This is not optimal, because every time the files in the source folder change I need to modify the script.
Is it possible to automate this process so that it keeps working even when the source files change?
You can combine glob, pandas.concat and pandas.read_csv fairly easily. Assuming the CSV files are in the same folder as your script:
import glob
import pandas as pd
df = pd.concat([pd.read_csv(f) for f in glob.glob('my_file*.csv')])
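If the files live somewhere other than the script's folder, the same idea can be wrapped in a small helper. This is only a sketch: the function name load_my_files, the folder argument and the source_file column are made up for illustration, and it assumes every file has the same columns.

```python
from pathlib import Path

import pandas as pd


def load_my_files(folder, pattern="my_file*.csv"):
    """Read every CSV in `folder` matching `pattern` into one DataFrame."""
    paths = sorted(Path(folder).glob(pattern))  # sorted for a stable row order
    if not paths:
        raise FileNotFoundError(f"no files matching {pattern!r} in {folder}")
    frames = []
    for path in paths:
        frame = pd.read_csv(path)
        frame["source_file"] = path.name  # hypothetical column: remember the origin
        frames.append(frame)
    # ignore_index=True renumbers the rows instead of repeating each file's index
    return pd.concat(frames, ignore_index=True)
```

Calling load_my_files("path/to/folder") would then pick up whatever matching files are present at run time, so the script does not need editing when files are added or removed.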
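Another option is to loop over the folder with os.listdir and filter on the file name:
import os

directory = "."  # folder to scan; "." (the current folder) is an assumption
for filename in os.listdir(directory):
    # keep only files named like "my_file<anything>.csv"
    if filename.startswith("my_file") and filename.endswith(".csv"):
        path = os.path.join(directory, filename)
        # do some stuff here, e.g. pd.read_csv(path)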
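The loop above leaves the "do some stuff" step open. One way to fill it in, as a rough sketch, is to collect the matching paths first and read them afterwards. The helper name find_my_files and its prefix and suffix parameters are hypothetical, not part of the original answer.

```python
import os


def find_my_files(directory, prefix="my_file", suffix=".csv"):
    """Return sorted full paths of files in `directory` named prefix...suffix."""
    matches = []
    for filename in os.listdir(directory):
        if filename.startswith(prefix) and filename.endswith(suffix):
            matches.append(os.path.join(directory, filename))
    return sorted(matches)  # os.listdir order is arbitrary, so sort it
```

The resulting list can then replace the glob.glob call, for example pd.concat([pd.read_csv(p) for p in find_my_files(".")]).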