Loop through the rows of a dataframe, create a new column, and store the result in it based on a condition

Question

I have a df as follows:

Name       Reference  Efficiency
TargetA    Yes        13
Target_1   No         12
Target_2   No         13
Target_3   No         10
Target_4   No         8
TargetB    Yes        14
Target_4   No         12
Target_5   No         11
Target_6   No         10
TargetC    Yes        15
Target_6   No         11
Target_7   No         13
Target_8   No         12
Target_9   No         14
Target_10  No         10

I want to loop through all the rows. Wherever there is 'Yes' in the Reference column, a new column called 'Check' should hold the result of subtracting each Efficiency value in that block (13, 12, 13, 10, 8) from 13, which is the Efficiency of the 'Yes' row. Then the Efficiency values of the next block (14, 12, 11, 10) should be subtracted from 14, the Efficiency of the next 'Yes' row, and so on.

Expected output:

Name       Reference  Efficiency  Check
TargetA    Yes        13          0
Target_1   No         12          1
Target_2   No         13          0
Target_3   No         10          3
Target_4   No         8           5
TargetB    Yes        14          0
Target_4   No         12          2
Target_5   No         11          3
Target_6   No         10          4
TargetC    Yes        15          0
Target_6   No         11          4
Target_7   No         13          2
Target_8   No         12          3
Target_9   No         14          1
Target_10  No         10          5

I have tried the following code:

for i, row in df.iterrows():
    i = 0
    val = row['Reference']
    if val == 'Yes':
        df['check'] = df.loc[i,'Efficiency'] - df['Efficiency'].shift(0)

I got the following result:

    Name       Reference  Efficiency  Check
0   TargetA    Yes        13           0
1   Target_1   No         12           1
2   Target_2   No         13           0
3   Target_3   No         10           3
4   Target_4   No         8            5
5   TargetB    Yes        14          -1
6   Target_4   No         12           1
7   Target_5   No         11           2
8   Target_6   No         10           3
9   TargetC    Yes        15          -2
10  Target_6   No         11           2
11  Target_7   No         13           0
12  Target_8   No         12           1
13  Target_9   No         14          -1
14  Target_10  No         10           3

The result is correct only for the first 'Yes' block. Can someone please help me?

Answer 1

Create an auxiliary (helper) column containing only the Efficiency values of the rows where "Yes" was found. Then replace the missing values with the previous valid entry. Going through the example step by step:

Sample data:

import pandas as pd
data = {'Name': {0: 'TargetA',
  1: 'Target_1',
  2: 'Target_2',
  3: 'Target_3',
  4: 'Target_4',
  5: 'TargetB',
  6: 'Target_4',
  7: 'Target_5',
  8: 'Target_6',
  9: 'TargetC',
  10: 'Target_6',
  11: 'Target_7',
  12: 'Target_8',
  13: 'Target_9',
  14: 'Target_10'},
 'Reference': {0: 'Yes',
  1: 'No',
  2: 'No',
  3: 'No',
  4: 'No',
  5: 'Yes',
  6: 'No',
  7: 'No',
  8: 'No',
  9: 'Yes',
  10: 'No',
  11: 'No',
  12: 'No',
  13: 'No',
  14: 'No'},
 'Efficiency': {0: 13,
  1: 12,
  2: 13,
  3: 10,
  4: 8,
  5: 14,
  6: 12,
  7: 11,
  8: 10,
  9: 15,
  10: 11,
  11: 13,
  12: 12,
  13: 14,
  14: 10}}
df = pd.DataFrame(data)

Code:

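# True on the rows that start a new block
mask = df['Reference'].eq('Yes')
# Helper column: empty everywhere except the 'Yes' rows
df['Check'] = pd.NA
df.loc[mask, 'Check'] = df['Efficiency'].loc[mask].copy()
# Carry each 'Yes' value forward over the following 'No' rows
df['Check'] = df['Check'].ffill()
# Reference value minus each row's own Efficiency
df['Check'] = df['Check'] - df['Efficiency']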
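
The same steps can probably be condensed with `Series.where`, which keeps a value where a condition holds and leaves NaN elsewhere. The following is only a sketch, not part of the original answer: it rebuilds the question's Reference and Efficiency columns inline, and it assumes the first row is a 'Yes' row (otherwise the leading rows stay NaN and the final integer cast would fail).

```python
import pandas as pd

# Reference and Efficiency columns from the question, rebuilt inline
df = pd.DataFrame({
    "Reference": ["Yes"] + ["No"] * 4 + ["Yes"] + ["No"] * 3 + ["Yes"] + ["No"] * 5,
    "Efficiency": [13, 12, 13, 10, 8, 14, 12, 11, 10, 15, 11, 13, 12, 14, 10],
})

# Keep Efficiency only on the 'Yes' rows (NaN elsewhere), then carry each
# reference value forward over the 'No' rows that follow it
reference = df["Efficiency"].where(df["Reference"].eq("Yes")).ffill()

# Active reference value minus each row's own Efficiency; the cast back to
# int assumes no NaN is left, i.e. the frame starts with a 'Yes' row
df["Check"] = (reference - df["Efficiency"]).astype(int)

print(df)
```

Because no temporary `pd.NA` column is created, `Check` ends up with an integer dtype rather than `object`.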

Answer 2

I also used the code below to create an auxiliary (helper) column containing only the Efficiency values of the rows where "Yes" was found:

# The assignment does not depend on the current row, so no loop is needed;
# rows where Reference is 'No' are left as NaN through index alignment
df['check'] = df.loc[df['Reference'] == 'Yes', 'Efficiency']
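
The helper column in this answer holds values only on the 'Yes' rows, so the carry-forward and the subtraction are still missing. If an explicit row loop is preferred, as in the question, one possible sketch is to remember the Efficiency of the most recent 'Yes' row while iterating. The name `current_ref` is made up for this illustration, and the data is the question's table rebuilt inline.

```python
import pandas as pd

# Reference and Efficiency columns from the question, rebuilt inline
df = pd.DataFrame({
    "Reference": ["Yes"] + ["No"] * 4 + ["Yes"] + ["No"] * 3 + ["Yes"] + ["No"] * 5,
    "Efficiency": [13, 12, 13, 10, 8, 14, 12, 11, 10, 15, 11, 13, 12, 14, 10],
})

checks = []
current_ref = None  # hypothetical name: Efficiency of the latest 'Yes' row
for ref, eff in zip(df["Reference"], df["Efficiency"]):
    if ref == "Yes":
        current_ref = eff  # a new block starts on this row
    # Rows before the first 'Yes' have no reference value, so stay empty
    checks.append(None if current_ref is None else current_ref - eff)

df["Check"] = checks
print(df)
```

Iterating over the two columns with `zip` avoids `iterrows()`, which builds a Series for every row and is usually much slower. For large frames a vectorised approach is likely still the better choice.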

Solution 1
1 2022-02-01 13:35:37

Solution 2
0 2022-02-02 07:06:23
