Date Timestamps and datetime64[ns, UTC] comparison in python pandas
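I'm confused by the result of filtering a pandas DataFrame. I'm trying to get the rows that fall between certain dates, but the resulting DataFrame is empty. I'm sure that data exists for that period.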
df.info()
reports that "opendate" has the type: `opendate 440383 non-null datetime64[ns, UTC]`
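Because the column is timezone-aware, anything it is compared against has to be timezone-aware as well. A minimal sketch of the difference between a naive and an aware bound; the Series values below are made up for illustration and are not the asker's data:

```python
import pandas as pd

# Illustrative tz-aware column; the values are made up for this sketch.
opendate = pd.Series(
    pd.to_datetime(["2020-03-01T10:00:00", "2020-03-05T12:30:00"], utc=True)
)

naive_bound = pd.Timestamp("2020-03-02")            # no timezone attached
aware_bound = pd.Timestamp("2020-03-02", tz="UTC")  # explicitly UTC

try:
    opendate > naive_bound
    naive_comparison_failed = False
except TypeError:
    # pandas refuses ordering comparisons between tz-aware and tz-naive values
    naive_comparison_failed = True

# Comparing against the aware bound works element-wise.
aware_mask = opendate > aware_bound
print(naive_comparison_failed, aware_mask.tolist())
```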
Code snippet:
import pandas as pd
from datetime import timedelta
# pd.datetime was removed in pandas 2.0; pd.Timestamp.now() gives the current local time
current_date = pd.Timestamp.now()
t_delta_week = timedelta(days=7)
t_delta_year = timedelta(days=365)
#CurrentDate
date_start2020 = pd.Timestamp(current_date - t_delta_week, unit='ms')
date_end2020 = pd.Timestamp(current_date, unit='ms')
date_start2020 = date_start2020.tz_localize('utc')
date_end2020 = date_end2020.tz_localize('utc')
#LastYearDate
date_start2019 = pd.Timestamp(current_date - t_delta_year - t_delta_week, unit='ms')
date_end2019 = pd.Timestamp(current_date - t_delta_year, unit='ms')
date_start2019 = date_start2019.tz_localize('utc')
date_end2019 = date_end2019.tz_localize('utc')
df2020_2019['opendate'] = pd.to_datetime(df2020_2019['opendate'], unit='ms')
mask = (df2020_2019['opendate'] > date_start2020) & (df2020_2019['opendate'] <= date_end2020)
df_currYear = df2020_2019.loc[mask]
df_currYear
The returned DataFrame is empty.
Thanks for any help :)
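When a date mask like the one above comes back empty, one quick check is to compare the range the column actually covers with the bounds used in the mask. A small sketch, assuming a hypothetical helper named `describe_window` and made-up sample data:

```python
import pandas as pd

def describe_window(dates: pd.Series, start: pd.Timestamp, end: pd.Timestamp) -> dict:
    """Summarize how a datetime column relates to a (start, end] window."""
    mask = (dates > start) & (dates <= end)
    return {
        "column_min": dates.min(),
        "column_max": dates.max(),
        "rows_in_window": int(mask.sum()),
        # True when even the newest row is not later than the window start
        "data_ends_before_window": bool(dates.max() <= start),
    }

# Made-up sample in which every row is older than the requested week.
sample = pd.Series(
    pd.to_datetime(["2019-11-01T08:00:00", "2019-11-20T09:15:00"], utc=True)
)
end = pd.Timestamp("2020-03-10", tz="UTC")
start = end - pd.Timedelta(days=7)

report = describe_window(sample, start, end)
print(report)
```

If `data_ends_before_window` is true, the filter itself is probably fine and the data simply does not reach the requested period.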
Edit:
Maybe this will help: "opendate" is a generated column, created with the following snippet:
import pandas as pd
fmt = '%Y-%m-%dT%H:%M:%S'
df2020_2019.dropna(subset=['opentime_TS'], inplace=True)
df2020_2019['opendate'] = pd.to_datetime(df2020_2019['opentime_TS'], utc=True, format=fmt, errors='ignore')
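One thing worth checking in a snippet like this: with `errors='ignore'`, a value that does not match the format makes `pd.to_datetime` return the input unchanged, so the column can silently stay as strings (recent pandas releases also deprecate that option). A hedged alternative, shown here on made-up strings, is `errors='coerce'`, which turns unparseable entries into `NaT` so they can be counted:

```python
import pandas as pd

fmt = '%Y-%m-%dT%H:%M:%S'
# Made-up raw values; the last one deliberately does not match fmt.
raw = pd.Series(["2020-03-01T10:00:00", "2020-03-05T12:30:00", "not a date"])

# Unparseable entries become NaT instead of leaving the column unconverted.
parsed = pd.to_datetime(raw, utc=True, format=fmt, errors='coerce')

n_unparsed = int(parsed.isna().sum())  # number of entries that became NaT
print(parsed.dtype, n_unparsed)
```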
I've also added a printout of head() for a sample of the data. For privacy reasons I can't share the actual records from df :)
OK, my bad. I was so focused on the tz TypeErrors that I went blind... I was using the wrong, outdated data source :) The final solution, which works on the correct data:
from datetime import timedelta
from datetime import datetime
import pandas as pd
fmt = '%Y-%m-%dT%H:%M:%S'
df2020_2019.dropna(subset=['opentime_TS'], inplace=True)
df2020_2019['opendate'] = pd.to_datetime(df2020_2019['opentime_TS'], utc=True, format=fmt, errors='ignore')
df2020_2019.info()
current_date = pd.Timestamp.now()
t_delta_week = timedelta(days=7)
t_delta_year = timedelta(days=365)
#CurrentDate
date_start2020 = pd.Timestamp(current_date - t_delta_week, unit='ms')
date_end2020 = pd.Timestamp(current_date, unit='ms')
date_start2020 = date_start2020.tz_localize('utc')
date_end2020 = date_end2020.tz_localize('utc')
#LastYearDate
date_start2019 = pd.Timestamp(current_date - t_delta_year - t_delta_week, unit='ms')
date_end2019 = pd.Timestamp(current_date - t_delta_year, unit='ms')
date_start2019 = date_start2019.tz_localize('utc')
date_end2019 = date_end2019.tz_localize('utc')
df2020_2019['opendate'] = pd.to_datetime(df2020_2019['opendate'], unit='ms')
df_currYear = df2020_2019[df2020_2019["opendate"] > date_start2020]
df_lastYear = df2020_2019[df2020_2019["opendate"].between(date_start2019, date_end2019)]
df_currYear
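The same two windows can also be built from a timestamp that is UTC-aware from the start, which avoids labeling local wall-clock time as UTC through `tz_localize`. This is a sketch under stated assumptions rather than the author's method: `week_windows` is a hypothetical helper and the sample frame is made up.

```python
import pandas as pd

def week_windows(now: pd.Timestamp):
    """Return (start, end) pairs for the last 7 days and the same week 365 days earlier."""
    week = pd.Timedelta(days=7)
    year = pd.Timedelta(days=365)
    current = (now - week, now)
    previous = (now - year - week, now - year)
    return current, previous

# In real use: now = pd.Timestamp.now(tz="UTC"). A fixed value keeps the sketch reproducible.
now = pd.Timestamp("2020-03-10T00:00:00", tz="UTC")
(cur_start, cur_end), (prev_start, prev_end) = week_windows(now)

# Made-up sample rows standing in for the real data.
df = pd.DataFrame({
    "opendate": pd.to_datetime(
        ["2020-03-08T10:00:00", "2020-01-01T00:00:00", "2019-03-09T12:00:00"],
        utc=True,
    )
})

df_curr = df[df["opendate"].between(cur_start, cur_end)]
df_last = df[df["opendate"].between(prev_start, prev_end)]
print(len(df_curr), len(df_last))
```

Note that `between` includes both endpoints by default, which differs slightly from a mask written with `>` and `<=`.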