
Subsetting a DataFrame into Individual DataFrames Using a Loop in Python Pandas

I want to subset a dataframe into individual dataframes.

So:

df:

     name    color   value
      joe     yellow   7.0
      mary    green    9.0
      pete    blue     8.0
      mary     red     8.8
      pete     blue    7.7
      joe     orange   2.0

I want to get:

df_joe

     name    color   value
      joe     yellow   7.0
      joe     orange   2.0

df_mary

     name    color   value
      mary    green    9.0
      mary     red     8.8

df_pete

     name    color   value
      pete    blue     8.0
      pete     blue    7.7

This is easy enough to do individually and manually, but I want to automate it in a loop or using `groupby`. There are lots of related answers on how to get this information, but none that I have found discusses saving the split-out subsets to several dataframes.

So this is actually not a duplicate question, for the following reason:

I have tried a loop like this:

names = ['joe','pete','mary']
for name in names
   'df_' + name = df[df['Name'] == name]

But I get an error when assigning the dataframe subset to the newly constructed name.

How can I do this?

The best approach here is to create a dictionary of DataFrames from the groupby object:

# groupby yields (key, sub-DataFrame) pairs; dict() maps each name to its rows
dfs = dict(tuple(df.groupby('name')))
print(dfs)
{'joe':   name   color  value
0  joe  yellow    7.0
5  joe  orange    2.0, 'pete':    name color  value
2  pete  blue    8.0
4  pete  blue    7.7, 'mary':    name  color  value
1  mary  green    9.0
3  mary    red    8.8}

print(dfs['mary'])
   name  color  value
1  mary  green    9.0
3  mary    red    8.8
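
For reference, here is a minimal self-contained sketch of the same idea. It rebuilds the sample data from the question and uses a dict comprehension in place of `dict(tuple(...))`; the two are intended to be equivalent. The loop variable name `group` is my own choice, not from the original post.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "name": ["joe", "mary", "pete", "mary", "pete", "joe"],
    "color": ["yellow", "green", "blue", "red", "blue", "orange"],
    "value": [7.0, 9.0, 8.0, 8.8, 7.7, 2.0],
})

# Iterating a groupby yields (key, sub-DataFrame) pairs,
# so a dict comprehension can map each name to its rows
dfs = {name: group for name, group in df.groupby("name")}

# Each sub-DataFrame keeps the row labels of the original df
for name, group in dfs.items():
    print(name)
    print(group)
```

Keeping the subsets in one dictionary means they can be looped over or looked up by key, which is usually easier to work with than a set of separately named variables.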

But if you really need variables named by strings (not recommended, but possible):

# use a separate loop variable so the original df is not overwritten
for name, group in df.groupby('name'):
    globals()['df_' + name] = group

print(df_mary)
   name  color  value
1  mary  green    9.0
3  mary    red    8.8
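
If attribute-style names such as `df_mary` are the goal, one possible alternative to writing into `globals()` is to collect the subsets on a namespace object. This is not part of the original answer, only a sketch: the name `frames` is hypothetical, and the approach assumes every group key is a string that forms a valid Python identifier after the `df_` prefix.

```python
from types import SimpleNamespace

import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "name": ["joe", "mary", "pete", "mary", "pete", "joe"],
    "color": ["yellow", "green", "blue", "red", "blue", "orange"],
    "value": [7.0, 9.0, 8.0, 8.8, 7.7, 2.0],
})

# `frames` is a hypothetical container: one attribute per group,
# named "df_<name>". reset_index(drop=True) renumbers rows from 0.
frames = SimpleNamespace(**{
    f"df_{name}": group.reset_index(drop=True)
    for name, group in df.groupby("name")
})

print(frames.df_mary)
```

This keeps the generated names out of the module's global scope while still allowing `frames.df_mary` style access.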
