How to change a cell in pandas dataframe according to boolean condition

I have the following dataframe, clim_rec. I successfully deleted all Feb 29 days of leap years from this dataframe because I intend to group by the "Day of year" column (which was created using .dt.dayofyear), and I decided to ignore the extra day of leap years. Now, in order to group by the "Day of year" column, I have to subtract 1 from the day of year in leap years if the date is the first of March or later. Otherwise, the leap years will have day-of-year values up to 366 instead of 365 (even after deleting the leap days).

Here is my code:

import pandas as pd

clim_rec = pd.read_csv("daily_climate_records.csv")
clim_rec['Date'] = pd.to_datetime(clim_rec['Date']) # converting "Date" column from string into datetime format

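# Let's drop all leap days by masking all Feb 29 days
feb_29_mask = ~((clim_rec.Date.dt.month == 2) & (clim_rec.Date.dt.day == 29))
# Boolean indexing keeps only the rows where the mask is True; unlike
# .where(mask).dropna(), it does not also drop rows that have NaN in other columns
clim_rec = clim_rec[feb_29_mask]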

# Let's add new column with the "day of year" in order to group by this column
clim_rec['Day of year'] = clim_rec['Date'].dt.dayofyear
print(clim_rec.head())
#print('---------------------------------------------------')
# Now, if the year is a leap year and the dayofyear is greater than the dayofyear of Feb-29
# we subtract 1 from dayofyear. After doing that we will get values 1-365 for dayofyear
leap_year_mask = (clim_rec.Date.dt.year % 4 == 0) & ((clim_rec.Date.dt.year % 100 != 0)
                 |(clim_rec.Date.dt.year % 400 == 0)) & (clim_rec.Date.dt.month >=3)

clim_rec['Day of year'] = clim_rec['Day of year'].apply(lambda x: x-1) # this line is not correct

My question is: how do I modify the last line of my code so that the subtraction is applied only to the rows where the boolean mask is True?

Use DataFrame.loc to select rows by the mask. It is better and faster to subtract 1 directly instead of using apply, which avoids loops (because apply uses loops under the hood):

clim_rec.loc[leap_year_mask, 'Day of year'] -= 1 

which works like:

clim_rec.loc[leap_year_mask, 'Day of year'] = clim_rec.loc[leap_year_mask, 'Day of year']-1
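As a minimal sketch of the .loc approach, the snippet below runs it on a tiny hand-made dataframe instead of the CSV from the question. The sample dates are hypothetical, and using Series.dt.is_leap_year to build the mask is an assumption of this sketch, not something from the question; the question's modulo-based leap_year_mask would work the same way here.

```python
import pandas as pd

# Hypothetical sample data standing in for daily_climate_records.csv
# (Feb 29 is assumed to be already removed, as in the question).
clim_rec = pd.DataFrame({
    "Date": pd.to_datetime([
        "2015-02-28", "2015-03-01",                # non-leap year
        "2016-02-28", "2016-03-01", "2016-12-31",  # leap year
    ]),
})
clim_rec["Day of year"] = clim_rec["Date"].dt.dayofyear

# True only for rows in a leap year that fall on March 1 or later.
leap_year_mask = clim_rec["Date"].dt.is_leap_year & (clim_rec["Date"].dt.month >= 3)

# Subtract 1 in place, only on the rows selected by the mask.
clim_rec.loc[leap_year_mask, "Day of year"] -= 1

day_of_year = clim_rec["Day of year"].tolist()
print(day_of_year)  # [59, 60, 59, 60, 365]
```

After the adjustment, March 1 maps to day 60 in both years and December 31 of the leap year maps to day 365, so grouping by "Day of year" lines the two years up.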

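Would this work for you? It applies the subtraction row by row:

clim_rec['mask'] = leap_year_mask
# axis=1 is required so that the lambda receives one row at a time (not one column)
clim_rec['Day of year'] = clim_rec.apply(lambda x: x['Day of year']-1 if x['mask'] else x['Day of year'], axis=1)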
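To illustrate that the row-wise version gives the same values as a vectorized one, here is a small sketch on hypothetical data. The dataframe df and its values are made up for illustration, and Series.mask is used here as one possible vectorized alternative that returns a new Series instead of modifying the column in place.

```python
import pandas as pd

# Hypothetical data: day-of-year values plus a boolean column marking
# the rows that should be shifted back by one day.
df = pd.DataFrame({
    "Day of year": [59, 61, 366],
    "mask": [False, True, True],
})

# Row-wise version: axis=1 passes each row to the lambda.
row_wise = df.apply(
    lambda row: row["Day of year"] - 1 if row["mask"] else row["Day of year"],
    axis=1,
)

# Vectorized version: where the condition is True, take the value from
# the second argument; elsewhere keep the original value.
vectorized = df["Day of year"].mask(df["mask"], df["Day of year"] - 1)

print(row_wise.tolist())    # [59, 60, 365]
print(vectorized.tolist())  # [59, 60, 365]
```

Both produce the same numbers; the row-wise apply calls the Python lambda once per row, so it tends to be noticeably slower on large dataframes.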

