
Create multiple new DataFrames based on an existing DataFrame's column in Python

I have a pandas data frame, df, that has 4 columns and a lot of rows.

I want to create 5 different data frames based on the value of one of the columns of the data frame. The column I am referring to is called color.

color has 5 unique values: red, blue, green, yellow, orange.

Each of the 5 new data frames should contain all rows that have one of the values in color. For instance, df_blue should have all the rows and columns where, in the original data frame, the value in the color column is blue.

The code I have is the following:

# create 5 new data frames
df_red = []
df_blue= []
df_green= []
df_yellow= []
df_orange= []
for i in range(len(df)):
    if df['color'] == "blue"
       df_blue.append(df)

# i would do if-else statements to satisfy all 5 colors

I feel I am missing some logic... any suggestions or comments?

Thanks!

Use groupby. The following code fragment creates a sample DataFrame and converts it into a dictionary where the colors are the keys and the matching DataFrames are the values:

import pandas as pd

df = pd.DataFrame({'color': ['red', 'blue', 'red', 'green', 'blue'],
                   'foo': [1, 2, 3, 4, 5]})
# Map each color to the sub-DataFrame holding the rows with that color
colors = {color: dfc for color, dfc in df.groupby('color')}
#{'blue':   color  foo
#         1  blue    2
#         4  blue    5,
# 'green':    color  foo
#          3  green    4,
# 'red':   color  foo
#        0   red    1
#        2   red    3}
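
To get back something like the df_blue from the question, the dictionary can simply be indexed by color. Below is a minimal sketch that reuses the sample data above. The names df_blue and df_red are only illustrative, and get_group is the pandas GroupBy method for pulling out a single group:

```python
import pandas as pd

df = pd.DataFrame({'color': ['red', 'blue', 'red', 'green', 'blue'],
                   'foo': [1, 2, 3, 4, 5]})

# Build the color -> sub-DataFrame mapping once.
colors = {color: dfc for color, dfc in df.groupby('color')}

# Look up one color's rows by key instead of keeping
# separate df_blue, df_red, ... variables up front.
df_blue = colors['blue']

# get_group fetches a single group without building the whole dictionary.
df_red = df.groupby('color').get_group('red')
```

Keeping the frames in a dictionary means the code does not need to change if a new color shows up in the data.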

I ended up doing this for each color:

blue_data = data[data.color == 'blue']
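
The same boolean-mask filter can be repeated for every color without writing one statement per color. This is a rough sketch, assuming a small made-up data frame with a color column like the one in the question. The sample values and the frames name are hypothetical:

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's frame.
data = pd.DataFrame({'color': ['red', 'blue', 'red', 'green', 'blue'],
                     'foo': [1, 2, 3, 4, 5]})

# data.color == 'blue' is a boolean Series; indexing with it
# keeps only the rows where the comparison is True.
blue_data = data[data.color == 'blue']

# Apply the same filter once per distinct color in the column.
frames = {c: data[data['color'] == c] for c in data['color'].unique()}
```

This scans the column once per color, so for many distinct values the groupby approach is likely the better fit.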


 