How to insert a new column into a dataframe and access rows with different indices?
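I have a dataframe with a single column, "Numbers", and I want to add a second column, "Result". Each value should be the sum of the two preceding values in the "Numbers" column, or NaN where fewer than two preceding rows exist.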
import pandas as pd
import numpy as np

data = {
    "Numbers": [100,200,400,0]
}
df = pd.DataFrame(data,index = ["whatever1", "whatever2", "whatever3", "whatever4"])

def add_prev_two_elems_to_DF(df):
    numbers = "Numbers" # alias
    result = "Result" # alias
    df[result] = np.nan # empty column
    result_index = list(df.columns).index(result)
    for i in range(len(df)):
        #row = df.iloc[i]
        if i < 2: df.iloc[i,result_index] = np.nan
        else: df.iloc[i,result_index] = df.iloc[i-1][numbers] + df.iloc[i-2][numbers]

add_prev_two_elems_to_DF(df)
display(df) # Jupyter/IPython only; use print(df) elsewhere
The output is:
           Numbers  Result
whatever1      100     NaN
whatever2      200     NaN
whatever3      400   300.0
whatever4        0   600.0
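But this looks rather complicated. Can it be done more simply and faster? I am not looking for a solution specific to sum(). I want a general solution that works for any kind of function that fills a column using values from other rows.
Edit 1: I forgot to import numpy.
Edit 2: I changed one line to:
if i < 2: df.iloc[i,result_index] = np.nan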
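It looks like you can use rolling.sum together with shift. Since rolling.sum sums up to and including the current row, we have to shift the result down by one row so that each row's value matches the sum of the two preceding rows: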
df['Result'] = df['Numbers'].rolling(2).sum().shift()
Output:
           Numbers  Result
whatever1      100     NaN
whatever2      200     NaN
whatever3      400   300.0
whatever4        0   600.0
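The shifting idea is not limited to sums. As a minimal sketch (the names prev1, prev2, and combine are illustrative and not part of the answer above), each earlier row can be aligned with the current row via shift and then passed to any element-wise function. Here the function is a difference rather than a sum, just to show that the operation is interchangeable:

```python
import pandas as pd

df = pd.DataFrame(
    {"Numbers": [100, 200, 400, 0]},
    index=["whatever1", "whatever2", "whatever3", "whatever4"],
)

def combine(prev1: pd.Series, prev2: pd.Series) -> pd.Series:
    # Hypothetical example function; any element-wise expression could go here.
    return prev1 - prev2

prev1 = df["Numbers"].shift(1)  # value from one row above
prev2 = df["Numbers"].shift(2)  # value from two rows above

# Rows without two predecessors contain NaN after shifting, so the result is NaN there.
df["Result"] = combine(prev1, prev2)
print(df)
```

This stays vectorized, so it should scale better than a Python loop, but it only works when the function can be written in terms of whole columns.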
This is the shortest code I could come up with. It outputs exactly the same table.
import numpy as np
import pandas as pd
#import swifter # apply() gets swifter

data = {
    "Numbers": [100,200,400,0]
}
df = pd.DataFrame(data,index = ["whatever1", "whatever2", "whatever3", "whatever4"])

def func(a: pd.Series) -> float: # we expect 3 elements, but we don't check that
    # by default rolling().apply() passes each window as a Series, so use positional access
    return a.iloc[0] + a.iloc[1] # we use the first two elements, the 3rd is unnecessary

df["Result"] = df["Numbers"].rolling(3).apply(func)
#df["Result"] = df["Numbers"].swifter.rolling(3).apply(func)
display(df) # Jupyter/IPython only; use print(df) elsewhere
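For the more general case the question asks about, one possible sketch wraps the rolling approach in a small helper. The name apply_to_previous is hypothetical, and raw=True is used on the assumption that the function only needs the window values by position (each window then arrives as a NumPy array, which is usually faster than a Series):

```python
import numpy as np
import pandas as pd

def apply_to_previous(series: pd.Series, n: int, func) -> pd.Series:
    """Apply func to the n values preceding each row (hypothetical helper)."""
    # rolling(n) builds windows that end at the current row;
    # shift() moves each result down one row so it covers only the preceding rows.
    return series.rolling(n).apply(func, raw=True).shift()

df = pd.DataFrame(
    {"Numbers": [100, 200, 400, 0]},
    index=["whatever1", "whatever2", "whatever3", "whatever4"],
)

# Sum of the two preceding values, as in the question.
df["Result"] = apply_to_previous(df["Numbers"], 2, lambda w: w[0] + w[1])
# A different function over the same two preceding values.
df["Range"] = apply_to_previous(df["Numbers"], 2, lambda w: w.max() - w.min())
print(df)
```

The function passed in must return a single number per window, since that is what rolling().apply() expects.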