简体   繁体   中英

Remove values from specific columns with a condition in Python

I would like to remove values in the value1, value2, value3 and value4 columns if the 'on status' column contains the string 'new'

Data

id  date        location    on status   value1  value2  value3  value 4
CC  1/1/2022    ny          new         12      1       0       1
CC  4/1/2022    ny          new         1       1       8       9
CC  7/1/2022    ny          new         1       1       1       0
CC  10/1/2022   ny          new         1       2       2       1
CC  1/1/2023    ny          ok          1       2       2       1

Desired

id  date        location    on status   value1  value2  value3  value4
CC  1/1/2022    ny          new         
CC  4/1/2022    ny          new         
CC  7/1/2022    ny          new         
CC  10/1/2022   ny          new         
CC  1/1/2023    ny          ok          1       2      2         1
    

This only works on the first 2 columns, but it actually adds two additional columns (value3 and value4 and deletes the data from all rows not just the conditional 'new' Any suggestion is appreciated

Doing

df.loc[(df['on status'] == 'new'),  ['value1', 'value2','value3', 'value4']]= ''

sample input

df = pd.DataFrame({
    
    "on_status" : ["new", "new", "new", "new", "ok"],
    "value1" : [x for x in range(5)],
    "value2" : [x for x in range(5)],
    "value3" : [x for x in range(5)],
    "value4" : [x for x in range(5)],
})

it does work for me using loc

df.loc[(df.on_status == "new"), ["value1", "value2", "value3", "value4"]] = ''

sample output

on_status   value1  value2  value3  value4
0   new             
1   new             
2   new             
3   new             
4   ok  4   4   4   4
df.loc[(df['status'].str.strip() == 'new'),  ['value1', 'value2','value3', 'value4']]= ''

try this as mentioned Naveed, if there are whitespaces around then it will check by stripping white spaces.

for i in range(len(df.index)):
    if df.loc[i,'on status'] == 'new':
        df.loc[i,'value1': 'value 4'] = ' '
df    

here is one way to do it

df.loc[(df['on status'] == 'new'), ['value1', 'value2','value3', 'value4']] = ''

OR

df.loc[(df['on status'].str.strip() == 'new'), ['value1', 'value2','value3', 'value4']] = ''

df
    id  date    location    on status   value1  value2  value3  value4
0   CC  1/1/2022    ny      new                 
1   CC  4/1/2022    ny      new                 
2   CC  7/1/2022    ny      new                 
3   CC  10/1/2022   ny      new                 
4   CC  1/1/2023    ny       ok             1      2       2       1

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM