Change cell value with condition

I have a dataframe:

import pandas as pd

df = pd.DataFrame(
    {'a': ['banana', 'coconut', 'banana', 'apple'],
     'b': ['rice', 'bean', 'rice', 'soap'],
     'c': ['mouse', 'dog', None, 'apple'],
     'd': ['cat', 'soap', 'beef', 'rabbit']}
)


         a     b     c       d
0   banana  rice   mouse  cat
1  coconut  bean   dog    soap
2   banana  rice  None    cat
3    apple  soap  apple   rabbit

If a row contains the value None (here, the row at index 2), we look for another row whose remaining values are exactly the same and replace None with that row's value from the same column. Here the rows at index 0 and index 2 have the same values except in column 'c', so None is replaced by 'mouse'. The expected result is therefore:

         a     b     c       d
0   banana  rice   mouse   cat
1  coconut  bean   dog     soap
2   banana  rice   mouse   cat
3    apple  soap   apple   rabbit

Does anyone have a solution to this problem? Thanks.

df.loc[df['c'].isnull(), 'c'] = df[df.duplicated(subset = ['a', 'b'], keep = 'last')]['c'].values

df

Output:

|index|    a    | b  |  c  |  d   |
|-----|---------|----|-----|------|
|  0  | banana  |rice|mouse| cat  |
|  1  | coconut |bean| dog | soap |
|  2  | banana  |rice|mouse| beef |
|  3  | apple   |soap|apple|rabbit|
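
The one-liner above works when exactly one value of 'c' is missing and exactly one earlier row shares its 'a' and 'b' values. As a rough sketch of a more general variant (an assumption on my part, not part of the answer above), the missing values can be filled per group of ('a', 'b') with `groupby(...).transform('first')`, which takes the first non-null value in each group:

```python
import pandas as pd

df = pd.DataFrame(
    {'a': ['banana', 'coconut', 'banana', 'apple'],
     'b': ['rice', 'bean', 'rice', 'soap'],
     'c': ['mouse', 'dog', None, 'apple'],
     'd': ['cat', 'soap', 'beef', 'rabbit']}
)

# First non-null 'c' within each ('a', 'b') group, aligned to the original rows
first_c = df.groupby(['a', 'b'])['c'].transform('first')

# Only overwrite cells where 'c' is actually missing
df['c'] = df['c'].fillna(first_c)

print(df)
```

Like the one-liner, this matches rows on columns 'a' and 'b' only, so it assumes those two columns are enough to identify the matching row.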

This code handles any number of None values:

In [183]: df = pd.DataFrame(
     ...:         {'a': ['banana', 'coconut', 'banana', 'apple', None],
     ...:          'b': ['rice', 'bean', 'rice', 'soap', 'soap'],
     ...:          'c': ['mouse', 'dog', None, 'apple', 'apple'],
     ...:          'd': ['cat', 'soap', 'cat', 'rabbit', None]}
     ...:     )

In [184]: df
Out[184]: 
         a     b      c       d
0   banana  rice  mouse     cat
1  coconut  bean    dog    soap
2   banana  rice   None     cat
3    apple  soap  apple  rabbit
4     None  soap  apple    None

In [185]: rows = df.isnull().any(axis=1).to_numpy().nonzero()[0] # rows with None
     ...: for i in rows:
     ...:     row = df.iloc[i]
     ...:     cols = df.columns[row.notnull()] # columns without None
     ...:     replacement = (df[cols] == row[cols]).all(axis=1).to_numpy().nonzero()[0]
     ...:     for j in replacement:
     ...:         if i != j:
     ...:             df.loc[i] = df.loc[j]
     ...:             break

In [186]: df
Out[186]: 
         a     b      c       d
0   banana  rice  mouse     cat
1  coconut  bean    dog    soap
2   banana  rice  mouse     cat
3    apple  soap  apple  rabbit
4    apple  soap  apple  rabbit
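
The loop above copies the whole matching row and relies on the default integer index, since it mixes `iloc` positions with `loc` labels. Below is a possible variant, sketched under my own assumptions: the helper name `fill_from_matching_rows` is hypothetical, it works purely by position, it fills only the missing cells, and it returns a new DataFrame instead of modifying the input.

```python
import pandas as pd


def fill_from_matching_rows(df):
    """Hypothetical helper: fill missing cells from the first other row
    that agrees on every non-missing column."""
    out = df.copy()
    for i in range(len(df)):
        row = df.iloc[i]
        missing = row.isnull().to_numpy()
        if not missing.any() or missing.all():
            continue  # nothing to fill, or nothing to match on
        known = df.columns[~missing]
        # Positions of rows that equal this row on all known columns
        matches = (df[known] == row[known]).all(axis=1).to_numpy().nonzero()[0]
        donors = [j for j in matches if j != i]  # a row cannot fill itself
        if donors:
            j = donors[0]
            for k in missing.nonzero()[0]:
                out.iat[i, k] = df.iat[j, k]
    return out


df = pd.DataFrame(
    {'a': ['banana', 'coconut', 'banana', 'apple', None],
     'b': ['rice', 'bean', 'rice', 'soap', 'soap'],
     'c': ['mouse', 'dog', None, 'apple', 'apple'],
     'd': ['cat', 'soap', 'cat', 'rabbit', None]}
)

result = fill_from_matching_rows(df)
print(result)
```

Rows with no matching donor are left unchanged, and donors are always read from the original frame, so the result does not depend on the order in which rows are filled.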


 