Pandas: name a dataframe from a string in the CSV file name
I have several CSV files with a string in their names (e.g. a city name) and want to read them into dataframes whose names are derived from that city name.
Example CSV names: data_paris.csv, data_berlin.csv
How can I read them in a loop to get df_paris and df_berlin?
What I tried so far:
all_files = glob.glob("./*.csv")
for filename in all_files:
    city_name = re.split("[_.]", filename)[1]  # to extract city name from filename
    dfname = {'df' + str(city_name)}
    print(dfname)
    dfname = pd.read_csv(filename)
I expect to have df_rome and df_paris, but I get just dfname. Why?
A related question: Name a dataframe based on csv file name?
Thank you!
I would recommend against automatic dynamic naming like df_paris, df_berlin. Instead, you should do:
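import glob
import os
import re
import pandas as pd
all_files = glob.glob("./*.csv")
# dictionary of dataframes, keyed by city name
dfs = dict()
for filename in all_files:
    # split the base name only; the leading "./" would otherwise shift the index
    city_name = re.split("[_.]", os.path.basename(filename))[1]
    dfs[city_name] = pd.read_csv(filename)  # assign to the dataframe dictionary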
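A minimal, self-contained sketch of how such a dictionary can be used afterwards. The temporary directory, the two sample files, and the `value` column are made up for illustration; only the `data_<city>.csv` naming pattern comes from the question.

```python
import glob
import os
import re
import tempfile

import pandas as pd

# Hypothetical sample data so the sketch runs anywhere.
tmp_dir = tempfile.mkdtemp()
pd.DataFrame({"value": [1, 2]}).to_csv(os.path.join(tmp_dir, "data_paris.csv"), index=False)
pd.DataFrame({"value": [3, 4, 5]}).to_csv(os.path.join(tmp_dir, "data_berlin.csv"), index=False)

dfs = {}
for filename in glob.glob(os.path.join(tmp_dir, "*.csv")):
    # Split only the base name so directory parts cannot shift the index.
    city_name = re.split("[_.]", os.path.basename(filename))[1]
    dfs[city_name] = pd.read_csv(filename)

print(sorted(dfs))         # ['berlin', 'paris']
print(dfs["paris"].shape)  # (2, 1)

# Looping over every city works without knowing the names in advance.
for city, df in dfs.items():
    print(city, len(df))
```

Looking a dataframe up with `dfs["paris"]` does the same job as a variable named `df_paris`, and the set of cities can still be iterated over.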
You are mixing up your concepts. If you want to reference loaded dataframes dynamically, use a dict:
import glob
import os
import re
import pandas as pd

all_files = glob.glob("./*.csv")
dfname = {}
for filename in all_files:
    # split the base name only; the leading "./" would otherwise shift the index
    city_name = re.split("[_.]", os.path.basename(filename))[1]
    dfname['df_' + city_name] = pd.read_csv(filename)
print(list(dfname.keys()))
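The city-name extraction is sensitive to the leading `./` that `glob.glob("./*.csv")` keeps in each path: splitting the full string `./data_paris.csv` on `[_.]` puts `/data` at index 1, not `paris`. Below is a sketch of one way to isolate the name with `pathlib`. The helper name `city_from_path` is hypothetical, and it assumes every file follows the `data_<city>.csv` pattern.

```python
import re
from pathlib import Path


def city_from_path(path):
    """Return the part of the file name after the first underscore, without the extension."""
    stem = Path(path).stem          # "./data_paris.csv" -> "data_paris"
    return stem.split("_", 1)[1]    # "data_paris" -> "paris"


# Splitting the whole path picks up the "./" prefix.
naive = re.split("[_.]", "./data_paris.csv")[1]
print(repr(naive))                             # '/data'

print(city_from_path("./data_paris.csv"))      # paris
print(city_from_path("data_new_york.csv"))     # new_york
```

Splitting on the first underscore only also keeps multi-word names such as `new_york` intact.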
The only dataframe you are creating is dfname; each pass through the loop just overwrites it. You could do this with globals(), though honestly I would probably just create a list or a dict of dataframes (as others seem to have suggested while I was typing this), or else add a named 'city' column to a master dataframe that I keep appending to. But, keeping with what you specifically asked, you could probably do it like so:
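import glob
import os
import pandas as pd
for filename in glob.glob("./*.csv"):
    # basename drops the leading "./"; "data_paris.csv"[5:-4] is "paris"
    globals()["df_" + os.path.basename(filename)[5:-4]] = pd.read_csv(filename)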
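The "master dataframe with a city column" alternative mentioned above could look roughly like this. It is a sketch under assumptions: the sample files and the `value` column are invented, and the per-file frames are collected in a list and combined once with `pd.concat` instead of being appended one at a time.

```python
import glob
import os
import tempfile

import pandas as pd

# Hypothetical sample data so the sketch runs anywhere.
tmp_dir = tempfile.mkdtemp()
pd.DataFrame({"value": [1, 2]}).to_csv(os.path.join(tmp_dir, "data_paris.csv"), index=False)
pd.DataFrame({"value": [3]}).to_csv(os.path.join(tmp_dir, "data_berlin.csv"), index=False)

frames = []
for filename in sorted(glob.glob(os.path.join(tmp_dir, "data_*.csv"))):
    city = os.path.basename(filename)[5:-4]  # strip "data_" and ".csv"
    # Tag every row with the city it came from.
    frames.append(pd.read_csv(filename).assign(city=city))

# One concatenation at the end instead of growing a dataframe inside the loop.
master = pd.concat(frames, ignore_index=True)

print(master[master["city"] == "paris"])
```

Selecting one city is then an ordinary filter on the `city` column, and operations such as `master.groupby("city")` cover all cities at once.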