I have a dataframe with dates in one column. I would like to set the year to 2021 wherever the month is January, because the data I am processing contains errors where people entered January 2020 instead. For example:
   Port Of Loading ETA Destination Port
2          Qingdao  2020-01-09 00:00:00
3          Qingdao  2020-01-16 00:00:00
4         Shenzhen  2020-12-31 00:00:00
Would become:
   Port Of Loading ETA Destination Port
2          Qingdao  2021-01-09 00:00:00
3          Qingdao  2021-01-16 00:00:00
4         Shenzhen  2020-12-31 00:00:00
I tried with:
if df[df['ETA Destination Port']].month == 1:
januarys = df[df['ETA Destination Port']].month == 1
januarys.year = 2021
df = np.where(df['ETA Destination Port'].month == 1, df['ETA Destination Port'], januarys)
But I get the error:
KeyError: "None of [DatetimeIndex(['2020-11-19', '2020-12-03', '2020-12-10'], dtype='datetime64[ns]', freq=None)] are in the [columns]"
Any help appreciated:)
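A note on the error itself: `df[df['ETA Destination Port']]` passes the column's values (the dates) to `df[...]` as if they were column labels, which is why pandas reports that none of them are in the columns. The month of a datetime column is reached through the `.dt` accessor on the column instead. The sketch below reproduces this on a small frame rebuilt from the sample rows above; the frame and the names `raised` and `is_january` are made up for illustration.

```python
import pandas as pd

# Illustrative frame rebuilt from the sample rows in the question
df = pd.DataFrame({
    "Port Of Loading": ["Qingdao", "Qingdao", "Shenzhen"],
    "ETA Destination Port": pd.to_datetime(
        ["2020-01-09", "2020-01-16", "2020-12-31"]
    ),
})

# df[df[col]] looks the dates up as column labels, which raises KeyError
try:
    df[df["ETA Destination Port"]]
    raised = False
except KeyError:
    raised = True

# The month is available through the .dt accessor of the column itself
is_january = df["ETA Destination Port"].dt.month == 1
```

`is_january` is a boolean Series that can be used as a row mask with `df.loc`.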
You can try the code below:
import pandas as pd
csvfile = pd.read_csv("input.csv")
# Parse the column once, then extract day, month and year into separate columns
eta = pd.to_datetime(csvfile['ETA Destination Port'])
csvfile['Day'] = eta.dt.day
csvfile['Month'] = eta.dt.month
csvfile['Year'] = eta.dt.year
# Change year to 2021 when month is January
csvfile.loc[csvfile['Month'] == 1, 'Year'] = 2021
# Recombine the values into a single column (the time of day becomes 00:00:00)
### If you want the result as strings showing time as well as date
# csvfile['ETA Destination Port'] = pd.to_datetime(csvfile[['Year', 'Month', 'Day']]).dt.strftime('%Y-%m-%d %H:%M:%S')
### If you want to keep it as a datetime column
csvfile['ETA Destination Port'] = pd.to_datetime(csvfile[['Year', 'Month', 'Day']])
# Drop the helper columns that are no longer needed
csvfile = csvfile.drop(columns=['Day', 'Month', 'Year'])
This is probably not the most efficient approach, as I am not a pandas expert, but it should do the trick.
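As a possible shorter alternative, here is a minimal sketch that skips the helper columns: it builds a boolean mask for the January rows and swaps only the year with `Timestamp.replace`. It assumes the column is already of datetime type (or has been converted with `pd.to_datetime`); the frame is rebuilt from the sample rows in the question, and `col` and `mask` are illustrative names.

```python
import pandas as pd

# Illustrative frame rebuilt from the sample rows in the question
df = pd.DataFrame({
    "Port Of Loading": ["Qingdao", "Qingdao", "Shenzhen"],
    "ETA Destination Port": pd.to_datetime(
        ["2020-01-09", "2020-01-16", "2020-12-31"]
    ),
})

col = "ETA Destination Port"

# Boolean mask selecting the rows whose month is January
mask = df[col].dt.month == 1

# Timestamp.replace changes only the year; month, day and time are kept
df.loc[mask, col] = df.loc[mask, col].apply(lambda ts: ts.replace(year=2021))
```

Unlike rebuilding the date from year, month and day, this keeps any time-of-day component. It also applies the change to every January row, so a legitimate January date from another year would be overwritten too.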