Add new column with one value
I have the following dataframe:
a = pd.DataFrame([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]], columns=['a','b','c'])
a
Out[234]:
    a   b   c
0   1   2   3
1   4   5   6
2   7   8   9
3  10  11  12
I want to add a column where only the last row is set, to the mean of the last 2 values of column c. Something like:
    a   b   c           d
0   1   2   3         NaN
1   4   5   6         NaN
2   7   8   9         NaN
3  10  11  12  mean(9,12)
I tried this but the first part gives an error:
a['d'].iloc[-1] = a.c.iloc[-2:].values.mean()
You can use .at to assign at a single row/column label pair:
ix = a.shape[0]
a.at[ix-1,'d'] = a.loc[ix-2:ix, 'c'].values.mean()
    a   b   c     d
0   1   2   3   NaN
1   4   5   6   NaN
2   7   8   9   NaN
3  10  11  12  10.5
Also note that chained indexing (what you're doing with a.c.iloc[-2:]) is explicitly discouraged in the docs, given that pandas sees these operations as separate events, namely two separate calls to __getitem__, rather than a single call using a nested tuple of slices.
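A minimal sketch of the difference, using the question's dataframe. It assumes the error in the question is the KeyError raised when a['d'] is looked up before the column exists; a.index[-1] is used here to pick the last row label and is not part of the answer above.

```python
import pandas as pd

a = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
                 columns=['a', 'b', 'c'])

# Chained form: a['d'] is evaluated first and fails, because 'd' does not exist yet.
try:
    a['d'].iloc[-1] = a['c'].iloc[-2:].mean()
    chained_error = None
except KeyError as exc:
    chained_error = exc

# Single indexing call: .loc creates the column and leaves the other rows as NaN.
a.loc[a.index[-1], 'd'] = a['c'].iloc[-2:].mean()
```

The right-hand side only reads from the frame, so the lookup there is harmless; it is the chained lookup on the left-hand side of the assignment that causes the trouble.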
You may set the d column beforehand (to ensure assignment):
In [100]: a['d'] = np.nan
In [101]: a['d'].iloc[-1] = a.c.iloc[-2:].mean()
In [102]: a
Out[102]:
    a   b   c     d
0   1   2   3   NaN
1   4   5   6   NaN
2   7   8   9   NaN
3  10  11  12  10.5
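Note that a['d'].iloc[-1] = ... is itself a chained assignment: depending on the pandas version it can emit a warning or, with Copy-on-Write enabled, leave a unchanged. A hedged variant that keeps the pre-created column but assigns through a single .iloc call might look like this (columns.get_loc is used to turn the column name into a position, assuming column names are unique):

```python
import numpy as np
import pandas as pd

a = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
                 columns=['a', 'b', 'c'])

# Create the column first so that it exists for positional indexing.
a['d'] = np.nan

# One .iloc call on the frame itself: last row, position of column 'd'.
d_pos = a.columns.get_loc('d')
a.iloc[-1, d_pos] = a['c'].iloc[-2:].mean()
```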
We can use .loc, .iloc & np.mean:
a.loc[a.index.max(), 'd'] = np.mean(a.iloc[-2:, 2])
    a   b   c     d
0   1   2   3   NaN
1   4   5   6   NaN
2   7   8   9   NaN
3  10  11  12  10.5
Or just using .loc and np.mean:
a.loc[a.index.max(), 'd'] = np.mean(a.loc[a.index.max()-1:, 'c'])
    a   b   c     d
0   1   2   3   NaN
1   4   5   6   NaN
2   7   8   9   NaN
3  10  11  12  10.5
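Both snippets rely on a.index.max() being the label of the last row, which holds for the default 0..n-1 index but not in general. As a sketch for an arbitrary index (the string labels below are made up for illustration), the last label can be taken by position with a.index[-1] and the last two values with tail(2):

```python
import pandas as pd

# Same data as in the question, but with a hypothetical non-numeric index.
a = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
                 columns=['a', 'b', 'c'],
                 index=['w', 'x', 'y', 'z'])

# a.index[-1] is the label of the last row, whatever the index looks like;
# tail(2) selects the last two values of 'c' by position.
a.loc[a.index[-1], 'd'] = a['c'].tail(2).mean()
```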