How do I create several df's out of one original df based on a condition and then assign them individual names?
df_collection = {}
for country in country_names:
    df_collection[country] = df.loc[df['CountryName'] == country].copy
I want to create several DataFrames (about 70, one per country) out of one original df, where each country appears a different number of times, and then give each one its own name, which is why I used a dictionary. But I can no longer access the individual DataFrames. They should have different names and should remain DataFrames.
error: 'method' object is not subscriptable
Does anyone have a solution?
You assigned the copy method itself, not its result, to each of your dictionary keys. You need to call copy with (), i.e. df.loc[df['CountryName'] == country].copy().
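To make the difference concrete, here is a minimal sketch of the broken and the fixed assignment. The sample frame and the way `country_names` is built are made up for illustration; they are not the asker's actual data.

```python
import pandas as pd

# Hypothetical sample data standing in for the original df
df = pd.DataFrame({"CountryName": ["a", "c", "b", "a", "a", "c"],
                   "Data": [8, 4, 4, 1, 1, 7]})
country_names = df["CountryName"].unique()  # assumed source of the country list

# Without parentheses you store the bound method, not a DataFrame
broken = df.loc[df["CountryName"] == "a"].copy
print(callable(broken))                   # True
print(isinstance(broken, pd.DataFrame))   # False

# With parentheses each value is an independent DataFrame
df_collection = {}
for country in country_names:
    df_collection[country] = df.loc[df["CountryName"] == country].copy()

print(df_collection["a"]["Data"].tolist())  # [8, 1, 1]
```

Indexing `broken` with `broken['Data']` is what raises the "'method' object is not subscriptable" error from the question.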
However, there's no need to subset your DataFrame in a loop. This is exactly what groupby is made for, and you can create the dict succinctly with:
df_collection = dict(tuple(df.groupby('CountryName')))
This works because the __iter__ method of a groupby object "Returns: Generator yielding sequence of (name, subsetted object) for each group", so with a single grouping key the group names become the keys of your dictionary and the subsetted DataFrames become its values.
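If it helps to see those pairs directly, the sketch below iterates over the groupby object before building the dict. The sample frame is again made up for illustration.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"CountryName": ["a", "c", "b", "a", "a", "c"],
                   "Data": [8, 4, 4, 1, 1, 7]})

# Iterating a groupby yields (name, sub-DataFrame) pairs,
# sorted by group name by default
pairs = tuple(df.groupby("CountryName"))
for name, group in pairs:
    print(name, len(group))
# a 3
# b 1
# c 2

# dict() turns the sequence of pairs into {name: sub-DataFrame}
df_collection = dict(pairs)
print(df_collection["c"]["Data"].tolist())  # [4, 7]
```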
print(df)
#  CountryName  Data
#0           a     8
#1           c     4
#2           b     4
#3           a     1
#4           a     1
#5           c     7
df_collection = dict(tuple(df.groupby('CountryName')))
## If you care for the subset defined in some list `country_names`, subset first
# df_collection = dict(tuple(df[df.CountryName.isin(country_names)].groupby('CountryName')))
df_collection['a']
#  CountryName  Data
#0           a     8
#3           a     1
#4           a     1
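For the case where only some countries are wanted, the commented-out variant can be sketched as follows. The contents of `country_names` here are a made-up example.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"CountryName": ["a", "c", "b", "a", "a", "c"],
                   "Data": [8, 4, 4, 1, 1, 7]})
country_names = ["a", "c"]  # assumed list of countries of interest

# Filter first, then group, so only the requested countries become keys
subset = df[df["CountryName"].isin(country_names)]
df_collection = dict(tuple(subset.groupby("CountryName")))

print(sorted(df_collection))  # ['a', 'c']
```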