Loop through rows of a dataframe, create a new column and store the result based on condition in another column
I have a df as follows:
Name Reference Efficiency
TargetA Yes 13
Target_1 No 12
Target_2 No 13
Target_3 No 10
Target_4 No 8
TargetB Yes 14
Target_4 No 12
Target_5 No 11
Target_6 No 10
TargetC Yes 15
Target_6 No 11
Target_7 No 13
Target_8 No 12
Target_9 No 14
Target_10 No 10
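I want to iterate over all rows. Whenever the Reference column contains 'Yes', a new column named 'Check' should be filled by subtracting each Efficiency value in that group (13, 12, 13, 10, 8) from 13, which is the Efficiency of the corresponding 'Yes' row. Next, the Efficiency values (14, 12, 11, 10) should be subtracted from 14, which is the Efficiency of the next 'Yes' row, and so on.
Expected output: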
Name Reference Efficiency Check
TargetA Yes 13 0
Target_1 No 12 1
Target_2 No 13 0
Target_3 No 10 3
Target_4 No 8 5
TargetB Yes 14 0
Target_4 No 12 2
Target_5 No 11 3
Target_6 No 10 4
TargetC Yes 15 0
Target_6 No 11 4
Target_7 No 13 2
Target_8 No 12 3
Target_9 No 14 1
Target_10 No 10 5
I tried the following code:
for i, row in df.iterrows():
    i = 0
    val = row['Reference']
    if val == 'Yes':
        df['check'] = df.loc[i,'Efficiency'] - df['Efficiency'].shift(0)
I get the following result:
Name Reference Efficiency Check
0 TargetA Yes 13 0
1 Target_1 No 12 1
2 Target_2 No 13 0
3 Target_3 No 10 3
4 Target_4 No 8 5
5 TargetB Yes 14 -1
6 Target_4 No 12 1
7 Target_5 No 11 2
8 Target_6 No 10 3
9 TargetC Yes 15 -2
10 Target_6 No 11 2
11 Target_7 No 13 0
12 Target_8 No 12 1
13 Target_9 No 14 -1
14 Target_10 No 10 3
I get the correct result only for the first 'Yes' group. Can someone please help me?
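Create a helper column that holds the Efficiency only on the rows where 'Yes' is found. Then replace the missing values with the previous valid entry (forward fill) and subtract the Efficiency column from the result. Step by step, with an example:
Sample data: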
import pandas as pd
data = {'Name': {0: 'TargetA',
1: 'Target_1',
2: 'Target_2',
3: 'Target_3',
4: 'Target_4',
5: 'TargetB',
6: 'Target_4',
7: 'Target_5',
8: 'Target_6',
9: 'TargetC',
10: 'Target_6',
11: 'Target_7',
12: 'Target_8',
13: 'Target_9',
14: 'Target_10'},
'Reference': {0: 'Yes',
1: 'No',
2: 'No',
3: 'No',
4: 'No',
5: 'Yes',
6: 'No',
7: 'No',
8: 'No',
9: 'Yes',
10: 'No',
11: 'No',
12: 'No',
13: 'No',
14: 'No'},
'Efficiency': {0: 13,
1: 12,
2: 13,
3: 10,
4: 8,
5: 14,
6: 12,
7: 11,
8: 10,
9: 15,
10: 11,
11: 13,
12: 12,
13: 14,
14: 10}}
df = pd.DataFrame(data)
Code:
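# True on the rows that start a new group
mask = df['Reference'].eq('Yes')
df['Check'] = pd.NA
# copy the Efficiency of each 'Yes' row into the helper column
df.loc[mask, 'Check'] = df['Efficiency'].loc[mask].copy()
# carry each 'Yes' value forward to the following 'No' rows
df['Check'] = df['Check'].ffill()
# difference between the group's reference value and each row
df['Check'] = df['Check'] - df['Efficiency']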
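The same steps can probably be collapsed into a single expression with Series.where, which keeps the column numeric instead of starting from pd.NA. This is only a minimal sketch on a shortened, made-up sample. The name baseline is introduced here for illustration, and the final astype(int) assumes the first row is a 'Yes' row so that nothing is left missing after the forward fill.

```python
import pandas as pd

# Shortened, made-up sample with the same layout as the question's data
df = pd.DataFrame({
    'Reference': ['Yes', 'No', 'No', 'Yes', 'No'],
    'Efficiency': [13, 12, 10, 14, 11],
})

mask = df['Reference'].eq('Yes')

# Keep Efficiency only on 'Yes' rows (NaN elsewhere), then forward-fill
# so every row carries the Efficiency of its most recent 'Yes' row.
baseline = df['Efficiency'].where(mask).ffill()

# Assumption: the first row is 'Yes', so baseline has no NaN left
# and the result can be cast back to int.
df['Check'] = (baseline - df['Efficiency']).astype(int)

print(df)
```

If the data could start with a 'No' row, dropping the astype(int) and keeping the float result would be the safer choice.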
I also used the code below to create a helper column that holds the Efficiency only where 'Yes' is found:
for i, row in df.iterrows():
    val = row['Reference']
    if val == 'Yes':
        df['check'] = df[df['Reference']=='Yes']['Efficiency']
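Note that the assignment inside that loop does not depend on i or row, so each pass writes the same column again. As far as I can tell, one assignment outside any loop gives the same helper column. A minimal sketch on a shortened, made-up sample:

```python
import pandas as pd

# Shortened, made-up sample with the same layout as the question's data
df = pd.DataFrame({
    'Reference': ['Yes', 'No', 'No', 'Yes', 'No'],
    'Efficiency': [13, 12, 10, 14, 11],
})

# Select Efficiency on the 'Yes' rows only. On assignment pandas aligns
# on the index, so the 'No' rows end up as NaN in the new column.
df['check'] = df.loc[df['Reference'].eq('Yes'), 'Efficiency']

print(df)
```

Because of the NaN entries the helper column comes out as float, which is what a later ffill() needs anyway.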