Pandas dataframe if else condition based on previous rows not working
I have a pandas DataFrame as follows:
df = pd.DataFrame({'X':[1,1,1, 0, 0]})
df
X
0 1
1 1
2 1
3 0
4 0
Now I want to modify X according to the following condition:
If X = 0, set it to the previous row's value + 1. So my final output should look like this:
X
0 1
1 1
2 1
3 2
4 3
This can be done by iterating over the rows, tracking the current and previous row with iloc, and it works as expected:
for i in range(0, len(df)):
    current_row = df.iloc[i]
    if i > 0:
        previous_row = df.iloc[i-1]
    else:
        previous_row = current_row
    if current_row['X'] == 0:
        current_row['X'] = previous_row['X'] + 1
I would like a more efficient approach. I tried the code below, but the output is not what I expected (the X value in the fifth row should be 3):
conditions = [df["X"] == 0]
values = [df["X"].shift() + 1]
df['X'] = np.select(conditions, values)
>>> df
X
0 1
1 1
2 1
3 2
4 1
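One way to see why the last row comes out as 1: `shift()` moves the original column down by one row, so the last row is compared against the original value of the row above it (0), not the updated value (2). A minimal sketch that prints the intermediate values, using the same `df` as above:

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 1, 1, 0, 0]})

# shift() works on the ORIGINAL column; it never sees values
# that would have been updated in an earlier row.
shifted = df['X'].shift()
candidate = shifted + 1

# Row 3 sees the original 1 above it, row 4 sees the original 0 above it.
print(shifted.tolist())
print(candidate.tolist())
```

Because each row depends on the already updated previous row, a single `shift()` cannot express the rule. It needs either a loop or a cumulative operation.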
You can try this:
arr = df.X.values # extract the column as a numpy array for faster iteration
for i, val in enumerate(arr[1:], start=1):
    if val == 0:
        arr[i] = arr[i-1] + 1
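The loop above relies on `arr` sharing memory with `df`. That may not hold in every pandas version: with Copy-on-Write enabled, for example, the array can be read-only. A hedged variant of the same loop, which works on an explicit copy and assigns the result back (the `copy=True` and the write-back are additions, not part of the original answer):

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 1, 1, 0, 0]})

# Take an explicit, writable copy of the column.
arr = df['X'].to_numpy(copy=True)

# Same rule as above: a zero becomes the (already updated) previous value + 1.
for i in range(1, len(arr)):
    if arr[i] == 0:
        arr[i] = arr[i - 1] + 1

# Write the result back to the DataFrame.
df['X'] = arr
print(df['X'].tolist())
```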
You can try the following approach:
import numpy as np
import pandas as pd
df = pd.DataFrame({'X': [1, 1, 1, 0, 0]})
# values previous to zero
pe_zero = df.X.shift(-1).eq(0) * df.X # [0 0 1 0 0]
# 1 for each zero value, since you add one to the previous value
eq_zero = df.X.eq(0)
# find consecutive groups of 0
groups = pe_zero + eq_zero
consecutive = (groups.gt(0) != groups.gt(0).shift()).cumsum()
# find cumulative sum by groups
cumulative = groups.groupby(consecutive).cumsum()
# choose from cumulative when equals to zero else from original
result = np.where(eq_zero, cumulative, df.X)
print(result)
Output
[1 1 1 2 3]
Update
For df = pd.DataFrame({'X': [1, 1, 1, 0, 0, 1, 1, 0, 0]})
it returns:
[1 1 1 2 3 1 1 2 3]
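The same idea can also be sketched more compactly: carry the last non-zero value forward, then add each zero's position within its run. This is an alternative formulation, not the code from the answer above. The helper name `fill_zero_runs` is hypothetical, and it assumes that zeros at the very start of the column count up from 0.

```python
import pandas as pd

def fill_zero_runs(s: pd.Series) -> pd.Series:
    """Replace each zero with (previous updated value + 1), without a Python loop."""
    is_zero = s.eq(0)
    # Last non-zero value seen so far, carried forward across each run of zeros.
    # Assumption: leading zeros (no earlier value) start counting from 0.
    base = s.mask(is_zero).ffill().fillna(0)
    # Each non-zero row starts a new group; the zeros after it share that group.
    run_id = (~is_zero).cumsum()
    # Position of each zero within its run: 1, 2, 3, ... (0 on non-zero rows).
    offset = is_zero.astype(int).groupby(run_id).cumsum()
    return (base + offset).astype(s.dtype)

df = pd.DataFrame({'X': [1, 1, 1, 0, 0, 1, 1, 0, 0]})
print(fill_zero_runs(df['X']).tolist())
```

On non-zero rows the offset is 0, so those values pass through unchanged.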