
Add a column to a stacked pandas dataframe

I want to add a days column next to the amax column, holding the difference in days between amin and amax.

Creating the new column this way fails:

df["dates"]["diff"] = df["dates"]["amax"]-df["dates"]["amin"]

Here you can see an example of my dataframe.

I created the dataframe with the following code:

gb = stock_prices.groupby(['stock_name'])
df = gb.agg({'date' : [np.min, np.max]})
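
One plausible reason the chained assignment has no effect (an assumption, since the question does not show the actual error message): selecting the top-level column returns a new sub-frame, so the new column is added to that temporary object rather than to df. The sketch below uses hypothetical sample data and the string aggregations "min" and "max", so the column labels are min/max rather than amin/amax.

```python
import warnings

import pandas as pd

# Hypothetical sample data standing in for the question's stock_prices
stock_prices = pd.DataFrame({
    "stock_name": ["A", "A", "B"],
    "date": pd.to_datetime(["2020-01-01", "2020-01-11", "2020-02-01"]),
})
df = stock_prices.groupby("stock_name").agg({"date": ["min", "max"]})

with warnings.catch_warnings():
    # pandas may warn about chained assignment here; silence it for the demo
    warnings.simplefilter("ignore")
    # df["date"] is a separate sub-frame, so this assignment does not reach df
    df["date"]["diff"] = df["date"]["max"] - df["date"]["min"]

# Only the two aggregated columns remain on df
print(df.columns.tolist())
```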

You can reset the MultiIndex in the columns and then add the new column diff:

print(df)
                     activity       date     name
0                       slept 2014-12-02     Elon
1                     tripped 2013-08-04     Bill
2                       spoke 2012-05-08    Larry
3                        swam 2015-04-11     Elon
4                     spooked 2014-12-09     Jeff
5                       liked 2009-10-23    Larry
6                    whistled 2013-09-21    Larry
7                      up dog 2011-01-02     Bill
8                      smiled 2013-07-28    Larry
9                     donated 2014-11-19     Elon
10  grant men paternity leave 2015-10-24  Marissa
11                    fondled 2013-08-24     Jeff
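# aggregate to the max and min date per name
# (depending on the pandas/NumPy version, the labels may be max/min instead of amax/amin)
g = df.groupby(['name']).agg({'date' : [np.max, np.min]})
print(g)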
              date           
              amax       amin
name                         
Bill    2013-08-04 2011-01-02
Elon    2015-04-11 2014-11-19
Jeff    2014-12-09 2013-08-24
Larry   2013-09-21 2009-10-23
Marissa 2015-10-24 2015-10-24

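# reset the columns MultiIndex, keeping only the second level (amax/amin)
levels = g.columns.levels
labels = g.columns.codes  # this attribute was named `labels` before pandas 0.24
g.columns = levels[1][labels[1]]

g['diff'] = g['amax'] - g['amin']
print(g)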
              amax       amin      diff
name                                   
Bill    2013-08-04 2011-01-02  945 days
Elon    2015-04-11 2014-11-19  143 days
Jeff    2014-12-09 2013-08-24  472 days
Larry   2013-09-21 2009-10-23 1429 days
Marissa 2015-10-24 2015-10-24    0 days
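
As a self-contained sketch of the same idea on current pandas, the version below rebuilds the name and date columns of the sample data, drops the top column level with `droplevel`, and also derives an integer day count with `.dt.days`. It is an assumption of this sketch to pass the strings "max" and "min" to `agg`, so the columns are labelled max/min rather than amax/amin; the `days` column name is likewise only illustrative.

```python
import pandas as pd

# The name/date columns of the sample data shown above
df = pd.DataFrame({
    "name": ["Elon", "Bill", "Larry", "Elon", "Jeff", "Larry",
             "Larry", "Bill", "Larry", "Elon", "Marissa", "Jeff"],
    "date": pd.to_datetime([
        "2014-12-02", "2013-08-04", "2012-05-08", "2015-04-11",
        "2014-12-09", "2009-10-23", "2013-09-21", "2011-01-02",
        "2013-07-28", "2014-11-19", "2015-10-24", "2013-08-24",
    ]),
})

# String aggregations give stable column labels: ('date', 'max'), ('date', 'min')
g = df.groupby("name").agg({"date": ["max", "min"]})

# Drop the top level ('date') so the columns are simply 'max' and 'min'
g.columns = g.columns.droplevel(0)

g["diff"] = g["max"] - g["min"]   # Timedelta column, e.g. "945 days"
g["days"] = g["diff"].dt.days     # the same difference as plain integers
print(g)
```

The `days` column is useful when the difference is needed as a number (for sorting or arithmetic) rather than as a Timedelta.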

However, if you do not want to reset the MultiIndex in the columns, use loc:

print(g)
              date           
              amax       amin
name                         
Bill    2013-08-04 2011-01-02
Elon    2015-04-11 2014-11-19
Jeff    2014-12-09 2013-08-24
Larry   2013-09-21 2009-10-23
Marissa 2015-10-24 2015-10-24

g.loc[:, ('date', 'diff')] = g.loc[:, ('date', 'amax')] - g.loc[:, ('date', 'amin')]
print(g)
              date                     
              amax       amin      diff
name                                   
Bill    2013-08-04 2011-01-02  945 days
Elon    2015-04-11 2014-11-19  143 days
Jeff    2014-12-09 2013-08-24  472 days
Larry   2013-09-21 2009-10-23 1429 days
Marissa 2015-10-24 2015-10-24    0 days
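
A compact variant of the MultiIndex-preserving approach, offered as a sketch rather than the answer's own code: a tuple key addresses a column directly, so the new column can be assigned without `loc`. The example assumes string aggregations ("max", "min") and an illustrative `days` label, and uses only three rows of the sample data to stay short.

```python
import pandas as pd

# A few rows of the sample data, enough to show the pattern
df = pd.DataFrame({
    "name": ["Elon", "Elon", "Elon", "Marissa"],
    "date": pd.to_datetime(
        ["2014-12-02", "2015-04-11", "2014-11-19", "2015-10-24"]
    ),
})

g = df.groupby("name").agg({"date": ["max", "min"]})

# A tuple key selects or creates a column under the 'date' top level
g[("date", "diff")] = g[("date", "max")] - g[("date", "min")]
g[("date", "days")] = g[("date", "diff")].dt.days
print(g)
```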
