
Loop through rows of a dataframe, create a new column and store the result based on condition in another column

I have a df as follows:

Name    Reference   Efficiency
TargetA    Yes      13
Target_1    No      12
Target_2    No      13
Target_3    No      10
Target_4    No      8
TargetB     Yes     14
Target_4    No      12
Target_5    No      11
Target_6    No     10
TargetC     Yes    15
Target_6    No      11
Target_7    No      13
Target_8    No      12
Target_9    No      14
Target_10   No     10

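I want to loop through all the rows and, wherever the Reference column contains 'Yes', create another column named 'Check' that subtracts the Efficiency values (13, 12, 13, 10, 8) from 13, which is the Efficiency of the corresponding 'Yes' row. Next, it should subtract the Efficiency values (14, 12, 11, 10) from 14, which is the Efficiency of the next 'Yes' row, and so on.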

Expected output:

Name    Reference   Efficiency  Check

TargetA    Yes            13    0
Target_1    No            12    1
Target_2    No            13    0
Target_3    No            10    3
Target_4    No             8    5
TargetB     Yes           14    0
Target_4    No            12    2
Target_5    No            11    3
Target_6    No            10    4
TargetC     Yes           15    0
Target_6    No            11    4
Target_7    No            13    2
Target_8    No            12    3
Target_9    No            14    1
Target_10   No            10    5

I tried the following code:

for i, row in df.iterrows():
    i = 0
    val = row['Reference']
    if val == 'Yes':
        df['check'] = df.loc[i,'Efficiency'] - df['Efficiency'].shift(0)

I get the following result:

Name    Reference   Efficiency  Check
0   TargetA     Yes           13    0
1   Target_1    No            12    1
2   Target_2    No            13    0
3   Target_3    No            10    3
4   Target_4    No             8    5
5   TargetB     Yes           14    -1
6   Target_4    No            12    1
7   Target_5    No            11    2
8   Target_6    No            10    3
9   TargetC     Yes           15    -2
10  Target_6    No            11    2
11  Target_7    No            13    0
12  Target_8    No            12    1
13  Target_9    No            14    -1
14  Target_10   No            10    3

I get the correct result only for the first 'Yes' block. Could someone please help me?

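Create an auxiliary/helper column that holds the Efficiency only on the rows where 'Yes' is found. Then replace the missing values with the previous valid entry. Step by step, with an example: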

Sample data:

import pandas as pd
data = {'Name': {0: 'TargetA',
  1: 'Target_1',
  2: 'Target_2',
  3: 'Target_3',
  4: 'Target_4',
  5: 'TargetB',
  6: 'Target_4',
  7: 'Target_5',
  8: 'Target_6',
  9: 'TargetC',
  10: 'Target_6',
  11: 'Target_7',
  12: 'Target_8',
  13: 'Target_9',
  14: 'Target_10'},
 'Reference': {0: 'Yes',
  1: 'No',
  2: 'No',
  3: 'No',
  4: 'No',
  5: 'Yes',
  6: 'No',
  7: 'No',
  8: 'No',
  9: 'Yes',
  10: 'No',
  11: 'No',
  12: 'No',
  13: 'No',
  14: 'No'},
 'Efficiency': {0: 13,
  1: 12,
  2: 13,
  3: 10,
  4: 8,
  5: 14,
  6: 12,
  7: 11,
  8: 10,
  9: 15,
  10: 11,
  11: 13,
  12: 12,
  13: 14,
  14: 10}}
df = pd.DataFrame(data)

Code:

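# True on the rows whose Reference is 'Yes'
mask = df['Reference'].eq('Yes')
# Helper column: start empty, then fill in Efficiency on the 'Yes' rows only
df['Check'] = pd.NA
df.loc[mask, 'Check'] = df['Efficiency'].loc[mask].copy()
# Carry each 'Yes' value forward over the following 'No' rows
df['Check'] = df['Check'].ffill()
# Subtract each row's Efficiency from its reference value
df['Check'] = df['Check'] - df['Efficiency']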
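
The same helper-column idea can probably be written more compactly with `Series.where` followed by `ffill`. The following is only a sketch, not part of the original answer; it rebuilds just the two columns it needs and assumes the first row is a 'Yes' row, so that no missing values are left after the forward fill.

```python
import pandas as pd

df = pd.DataFrame({
    "Reference": ["Yes", "No", "No", "No", "No",
                  "Yes", "No", "No", "No",
                  "Yes", "No", "No", "No", "No", "No"],
    "Efficiency": [13, 12, 13, 10, 8, 14, 12, 11, 10, 15, 11, 13, 12, 14, 10],
})

# Keep Efficiency only on the 'Yes' rows (NaN elsewhere), then carry each
# reference value forward until the next 'Yes' row.
reference = df["Efficiency"].where(df["Reference"].eq("Yes")).ffill()

# where() introduces NaN, so the intermediate result is float; cast back to int.
# Assumption: the first row is a 'Yes' row, otherwise leading NaNs would
# remain and the cast would fail.
df["Check"] = (reference - df["Efficiency"]).astype(int)

print(df)
```

This avoids creating an object-dtype column with `pd.NA` first, at the cost of a temporary float Series.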

I also used the code below to create an auxiliary/helper column that holds the Efficiency only where 'Yes' is found:

for i, row in df.iterrows():   
    val = row['Reference']
    if val == 'Yes':
        df['check'] = df[df['Reference']=='Yes']['Efficiency']
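
The assignment inside the loop above does not depend on `i` or `row`, so it appears that a single statement outside any loop would build the same helper column. As an alternative that needs no helper column of missing values at all, the rows can be grouped into blocks that each start at a 'Yes'. This is a sketch under the assumption that every block's reference value is its first row; the names `block` and `reference` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "Reference": ["Yes", "No", "No", "No", "No",
                  "Yes", "No", "No", "No",
                  "Yes", "No", "No", "No", "No", "No"],
    "Efficiency": [13, 12, 13, 10, 8, 14, 12, 11, 10, 15, 11, 13, 12, 14, 10],
})

# Each 'Yes' starts a new block: the running count of 'Yes' rows is a block id.
block = df["Reference"].eq("Yes").cumsum()

# The first Efficiency in each block is that block's reference value,
# broadcast back to every row of the block.
reference = df.groupby(block)["Efficiency"].transform("first")

df["Check"] = reference - df["Efficiency"]

print(df)
```

Because no NaN is introduced, the Check column keeps its integer dtype. Rows that come before the first 'Yes' (none in this sample) would form their own block with id 0.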
