
Replace one column's values with NaN based on date conditions in Pandas

I am trying to update the price column based on a condition: if date is equal to 2019-09-01, replace the price with np.nan. I have tried two methods, but neither has worked so far:

         price     pct      date
0  10379.00000  0.0242  2019/6/1
1  10608.25214     NaN  2019/9/1
2  10400.00000  0.0658  2019/6/1
3  10258.48471     NaN  2019/9/1
4  12294.00000  0.1633  2019/6/1
5  11635.07402     NaN  2019/9/1
6  12564.00000 -0.0066  2019/6/1
7  13615.10992     NaN  2019/9/1

Solution 1: df.price.where(df.date == '2019-09-01', np.nan, inplace=True) , but this replaced all price values with NaN:

   price     pct        date
0    NaN  0.0242  2019-06-01
1    NaN     NaN  2019-09-01
2    NaN  0.0658  2019-06-01
3    NaN     NaN  2019-09-01
4    NaN  0.1633  2019-06-01
5    NaN     NaN  2019-09-01
6    NaN -0.0066  2019-06-01
7    NaN     NaN  2019-09-01

Solution 2: df.loc[df.date == '2019-09-01', 'price'] = np.nan , but this didn't replace any values:

         price     pct        date
0  10379.00000  0.0242  2019-06-01
1  10608.25214     NaN  2019-09-01
2  10400.00000  0.0658  2019-06-01
3  10258.48471     NaN  2019-09-01
4  12294.00000  0.1633  2019-06-01
5  11635.07402     NaN  2019-09-01
6  12564.00000 -0.0066  2019-06-01
7  13615.10992     NaN  2019-09-01

Please note that the date in the Excel file, before read_excel, is in 2019/9/1 format. I converted it with df['date'] = pd.to_datetime(df['date']).dt.date .

Does anyone know why this doesn't work? Thanks.

'2019-09-01' is a string, while df.date holds date objects (the result of .dt.date), so the comparison never matches.

You should convert df.date to str so that both sides are strings:

df.loc[df.date.astype(str) == '2019-09-01', 'price'] = np.nan
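
To illustrate the mismatch, here is a minimal sketch on a hand-typed sample of the question's data. The sample frame and the names `string_mask`, `str_mask`, and `ts_mask` are assumptions made for illustration, and the dates are written as ISO strings rather than read from the original Excel file. It shows that comparing date objects with a string yields an all-False mask, and that either converting the column to str or comparing datetime values against a pd.Timestamp selects the intended rows.

```python
import numpy as np
import pandas as pd

# Hand-typed sample resembling the question's data (assumption, not the real file).
df = pd.DataFrame({
    "price": [10379.0, 10608.25214, 10400.0, 10258.48471],
    "pct": [0.0242, np.nan, 0.0658, np.nan],
    "date": ["2019-06-01", "2019-09-01", "2019-06-01", "2019-09-01"],
})

# Same conversion as in the question: the column now holds datetime.date objects.
df["date"] = pd.to_datetime(df["date"]).dt.date

# A date object never compares equal to a string, so nothing matches here.
string_mask = df["date"] == "2019-09-01"

# Option 1: compare string with string.
str_mask = df["date"].astype(str) == "2019-09-01"

# Option 2: convert back to datetime64 and compare with a Timestamp.
ts_mask = pd.to_datetime(df["date"]) == pd.Timestamp("2019-09-01")

# Set price to NaN only on the matching rows.
df.loc[str_mask, "price"] = np.nan
```

If the column is left as datetime64 (that is, without the `.dt.date` step), the original comparison against the string '2019-09-01' should generally work as well, since pandas can interpret the string as a timestamp in that case.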

Actually, the first solution works (kind of) for me. Try this:

import pandas as pd
import numpy as np
df = pd.DataFrame(
    np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [3, 2, 1], [5, 6, 7]]), 
    columns=['a', 'b', 'c']
)

The df should be:

   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9
3  3  2  1
4  5  6  7

Then, using similar code:

df.a.where(df.c != 7, np.nan, inplace=True)

I got the df as:

     a  b  c
0  1.0  2  3
1  4.0  5  6
2  7.0  8  9
3  3.0  2  1
4  NaN  6  7
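
Note that this only gives the desired result because the condition is inverted (`!=`): `Series.where` keeps the values where the condition is True and replaces the rest, while `Series.mask` does the opposite. The sketch below shows both on the same toy frame. It assigns the result back to the column instead of calling `inplace=True` on `df.a`, on the assumption that such chained in-place updates may not reach `df` in newer pandas versions. The names `kept_where` and `kept_mask` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [3, 2, 1], [5, 6, 7]]),
    columns=["a", "b", "c"],
)

# where: keep values where the condition is True, replace the others with NaN.
kept_where = df["a"].where(df["c"] != 7)

# mask: replace values where the condition is True, keep the others.
kept_mask = df["a"].mask(df["c"] == 7)

# Assign the result back explicitly rather than relying on inplace=True.
df["a"] = kept_mask
```

Applied to the question, the `mask` form reads closest to the intent: replace price where the date equals the target date, once both sides of the comparison have the same type.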
