Replace string value with previous row value based on conditionals - Pandas

I am trying to replace the value in the current row based on the previous row given that certain conditions are met.

Conditions:

Current row is 0

Previous row is C

Within Group (preferred, but will likely work without)

Example dataframe similar to mine:

ID  Week value
 4    1     W
 4    2     C
 4    3     0
 4    4     0
24    1     W
24    2     W
24    3     0
24    4     A

Example of what I need it to look like:

ID  Week value
 4    1     W
 4    2     C
 4    3     C
 4    4     C
24    1     W
24    2     W
24    3     0
24    4     A

Questions by others that I can't seem to rework, or that don't quite fit my problem:

  1. conditional replace based off prior value in same column of pandas dataframe python
  2. conditional change of a pandas row, with the previous row value

Code to build dataframe similar to mine

import pandas as pd

df = pd.DataFrame({'ID': {0:'4', 1:'4', 2:'4', 3:'4', 4:'24', 5:'24', 6:'24', 7:'24'}, 'Week': {0:'1', 1:'2', 2:'3', 3:'4', 4: '1', 5:'2', 6:'3', 7:'4'},  'value': {0:'W', 1:'C', 2:'0', 3:'0', 4: 'W', 5:'W', 6:'0', 7:'A'} })
df[['ID', 'Week']] = df[['ID', 'Week']].astype('int')

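My attempt to solve the problem (as first written it threw errors: the else branch was mis-indented and the assignment through df.value[i] was chained; cleaned up here):

for i in range(1, len(df)):
    # Copy 'C' down when the current value is '0' and the previous row is 'C'.
    # Rows are visited in order, so a run of '0's after a 'C' is filled in one pass.
    if df.loc[i, 'value'] == '0' and df.loc[i - 1, 'value'] == 'C':
        df.loc[i, 'value'] = 'C'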

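Normally I would use np.where to apply a condition to a column. Because the condition here depends on .shift(), a single pass only fixes the first '0' after a replaced value, so it has to be repeated in a for loop. A quick method is .replace(), which swaps each '0' for the previous row's value, whatever that value is:

# Repeat the pass so that consecutive '0's are filled one after another
for _ in range(len(df)):
    df['value'] = df['value'].replace('0', df['value'].shift(1))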

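If you want to keep the condition (only replace a '0' whose previous row is 'C'), you can use np.where in the same way:

import numpy as np

# Each pass turns a '0' into 'C' only when the row directly above is 'C'
for _ in range(len(df)):
    df['value'] = np.where((df['value'] == '0') & (df['value'].shift(1) == 'C'), 'C', df['value'])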
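The loop above always runs len(df) passes and shifts across ID boundaries. A minimal sketch of a variant that stops as soon as nothing changes and only looks at the previous row within the same ID; the helper name propagate_c is made up for illustration, and the sketch assumes the rows are already sorted by Week within each ID:

```python
import pandas as pd

df = pd.DataFrame({
    'ID':    [4, 4, 4, 4, 24, 24, 24, 24],
    'Week':  [1, 2, 3, 4, 1, 2, 3, 4],
    'value': ['W', 'C', '0', '0', 'W', 'W', '0', 'A'],
})

def propagate_c(frame):
    """Hypothetical helper: repeat the conditional replace until no row changes."""
    out = frame.copy()
    while True:
        # Previous row's value, restricted to the same ID (NaN on each group's first row)
        prev = out.groupby('ID')['value'].shift(1)
        hit = (out['value'] == '0') & (prev == 'C')
        if not hit.any():
            return out
        out.loc[hit, 'value'] = 'C'

result = propagate_c(df)
print(result)
```

The number of passes is bounded by the longest run of '0's, so this is still a loop, just a shorter one.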

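This is not easy to generalize to other situations, but for your specific case you can hide the '0's, forward-fill the last real value, and check whether that value is 'C':

import numpy as np

is_0 = df['value'] == '0'
# pd.np was removed from pandas and fillna(method='ffill') is deprecated; use np.nan and .ffill()
is_C_block = df['value'].replace('0', np.nan).ffill() == 'C'

df.loc[is_0 & is_C_block, 'value'] = 'C'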
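The question prefers the replacement to stay within each ID group. A rough sketch of the same forward-fill idea done per group, so that a 'C' at the end of one ID cannot leak into a '0' at the start of the next; the function name fill_c_within_groups is an assumption made for this example, and mask() is used here in place of replace():

```python
import pandas as pd

def fill_c_within_groups(frame, group_col='ID', value_col='value'):
    """Hypothetical helper: set '0' to 'C' when the last non-'0' value in the group is 'C'."""
    out = frame.copy()
    is_0 = out[value_col] == '0'
    # Blank out the '0' placeholders, then carry the last real value forward per group
    last_seen = out[value_col].mask(is_0).groupby(out[group_col]).ffill()
    out.loc[is_0 & (last_seen == 'C'), value_col] = 'C'
    return out

df = pd.DataFrame({
    'ID':    [4, 4, 4, 4, 24, 24, 24, 24],
    'Week':  [1, 2, 3, 4, 1, 2, 3, 4],
    'value': ['W', 'C', '0', '0', 'W', 'W', '0', 'A'],
})

result = fill_c_within_groups(df)
print(result)
```

This needs no loop, and like the ungrouped version it relies on the rows already being in Week order within each ID.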
