to_datetime - max function returns the wrong max date
I have data that comes from a CSV file, and I am trying to get the max date.
Data:
0 01/01/1994
1 01/01/1994
2 01/01/1994
3 01/01/1994
4 01/01/1994
.
.
.
970075 31/08/2021
970076 31/08/2021
970077 31/08/2021
970078 31/08/2021
970079 31/08/2021
However, I get the wrong max value. It seems that my code stores the date column as strings rather than as dates, even though I call to_datetime. Because of that, I use re on that string to get the year.
My code:
file['Date'] = pd.to_datetime(file['Date'], errors = 'coerce',
dayfirst = True, format = '%d.%m.%Y'
).dt.strftime('%d/%m/%Y')
print(file['Date'].min(), file['Date'].max(), range(int(re.search(r'(\d{4})', file['Date'].min()).group()), int(re.search(r'(\d{4})', file['Date'].max()).group())))
Returns:
01/01/1994 31/12/2020 range(1994, 2020)
I would like to get the max 31/08/2021 and not 31/12/2020.
Remove .dt.strftime, because it converts the datetimes back to strings, and min and max then compare text instead of dates:
.dt.strftime('%d/%m/%Y')
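To see the effect in isolation, here is a minimal sketch in plain Python with made-up values (not the asker's CSV): taking `max` of the text compares character by character, while parsing first compares chronologically.

```python
from datetime import datetime

# Hypothetical sample values in dd/mm/YYYY form, as in the question.
dates_as_text = ["01/01/1994", "31/12/2020", "31/08/2021"]

# Strings compare character by character: "31/1..." sorts after "31/0...".
max_as_text = max(dates_as_text)

# Parsing first gives a chronological comparison.
parsed = [datetime.strptime(s, "%d/%m/%Y") for s in dates_as_text]
max_as_date = max(parsed).strftime("%d/%m/%Y")

print(max_as_text)  # 31/12/2020
print(max_as_date)  # 31/08/2021
```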
You can convert to a custom format after min and max.
All together, also simplified to get the minimal and maximal years:
file['Date'] = pd.to_datetime(file['Date'], errors = 'coerce', dayfirst = True)
years = file['Date'].dt.year
print(file['Date'].min().strftime('%d/%m/%Y'),
file['Date'].max().strftime('%d/%m/%Y'),
range(years.min(), years.max()))
01/01/1994 31/08/2021 range(1994, 2021)
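For anyone who wants to try this without the CSV, below is a small self-contained sketch on a hypothetical four-row frame. The explicit `format`, the `dropna()` call, and the `+ 1` on the range end are assumptions added here, not part of the answer above: with `errors='coerce'` unparseable rows become `NaT`, and as far as I know `.dt.year` then returns floats, which `range` does not accept.

```python
import pandas as pd

# Hypothetical stand-in for the CSV column; "not a date" is a bad row.
file = pd.DataFrame(
    {"Date": ["01/01/1994", "31/12/2020", "31/08/2021", "not a date"]}
)

# Parse to real datetimes; values that do not match become NaT.
file["Date"] = pd.to_datetime(file["Date"], format="%d/%m/%Y", errors="coerce")

# Drop NaT before taking min/max so the years are plain integers.
valid = file["Date"].dropna()
first, last = valid.min(), valid.max()

# range() excludes its end; + 1 is an assumption that 2021 should be included.
years = range(first.year, last.year + 1)

# Format as text only at the very end, for display.
print(first.strftime("%d/%m/%Y"), last.strftime("%d/%m/%Y"), years)
```

If the end year should stay exclusive, as in the answer's output, drop the `+ 1`.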