Cannot compare type 'Timestamp' with type 'str' Pandas Python
I have two dataframes with datetime columns. For the first one:
df["datetime"] = df[["date","time"]].apply(lambda row: ' '.join(row.values.astype(str)), axis=1)
df["datetime"] = pd.to_datetime(df["datetime"], format='%Y-%m-%d %H:%M:%S')
For the other one:
df_labels.columns = ["start_date","start_time","end_date","end_time","mode"]
df_labels["start_datetime"] = df_labels[["start_date","start_time"]].apply(lambda row: ' '.join(row.values.astype(str)), axis=1)
df_labels["end_datetime"] = df_labels[["end_date","end_time"]].apply(lambda row: ' '.join(row.values.astype(str)), axis=1)
df_labels["start_datetime"] = df_labels["start_datetime"].str.replace("/","-")
df_labels["end_datetime"] = df_labels["end_datetime"].str.replace("/","-")
df_labels["start_datetime"] = pd.to_datetime(df_labels["start_datetime"], format='%Y-%m-%d %H:%M:%S')
df_labels["end_datetime"] = pd.to_datetime(df_labels["end_datetime"], format='%Y-%m-%d %H:%M:%S')
All of the code above runs successfully.
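The row-wise `apply` with `' '.join` works, but it can be slow on large frames. A vectorized sketch of the same conversion, assuming the date and time columns hold strings in the formats shown in the samples (the toy values here are illustrative only):

```python
import pandas as pd

# Toy frames mirroring the question's columns (values are illustrative).
df = pd.DataFrame({
    "date": ["2007-08-04", "2007-08-04"],
    "time": ["03:30:32", "03:30:33"],
})
df_labels = pd.DataFrame({
    "start_date": ["2007/06/26"],
    "start_time": ["11:32:29"],
})

# Concatenate the string columns element-wise, then parse once.
df["datetime"] = pd.to_datetime(
    df["date"].astype(str) + " " + df["time"].astype(str),
    format="%Y-%m-%d %H:%M:%S",
)

# The labels use slashes, so the format string can say so directly
# instead of replacing "/" with "-" first.
df_labels["start_datetime"] = pd.to_datetime(
    df_labels["start_date"] + " " + df_labels["start_time"],
    format="%Y/%m/%d %H:%M:%S",
)
```

Both results are datetime columns, so they can be compared with each other directly.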
Sample of df:
         lat        long  u1  alt             d        date      time             datetime  mode
0  39.921712  116.472343   0   13  39298.146204  2007-08-04  03:30:32  2007-08-04 03:30:32
1  39.921705  116.472343   0   13  39298.146215  2007-08-04  03:30:33  2007-08-04 03:30:33
2  39.921695  116.472345   0   13  39298.146227  2007-08-04  03:30:34  2007-08-04 03:30:34
3  39.921683  116.472342   0   13  39298.146238  2007-08-04  03:30:35  2007-08-04 03:30:35
4  39.921672  116.472342   0   13  39298.146250  2007-08-04  03:30:36  2007-08-04 03:30:36
Sample of df_labels:
   start_date  start_time    end_date  end_time   mode       start_datetime         end_datetime
0  2007/06/26    11:32:29  2007/06/26  11:40:29    bus  2007-06-26 11:32:29  2007-06-26 11:40:29
1  2008/03/28    14:52:54  2008/03/28  15:59:59  train  2008-03-28 14:52:54  2008-03-28 15:59:59
2  2008/03/28    16:00:00  2008/03/28  22:02:00  train  2008-03-28 16:00:00  2008-03-28 22:02:00
3  2008/03/29    01:27:50  2008/03/29  15:59:59  train  2008-03-29 01:27:50  2008-03-29 15:59:59
4  2008/03/29    16:00:00  2008/03/30  15:59:59  train  2008-03-29 16:00:00  2008-03-30 15:59:59
However, when I run this:
for index, row in df_labels.iterrows():
    df.loc[(df["datetime"] >= row["start_datetime"]) & (df["datetime"] < row["end_datetime"])] = row["mode"]
I get the following error:
TypeError: Cannot compare type 'Timestamp' with type 'str'
Please advise.
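Consider the case where the datetime values are in dd/mm/yy hh:mm:ss format; the format string passed to pd.to_datetime must match the data: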
df['datetime'] = pd.to_datetime(df['datetime'], format='%d/%m/%y %H:%M:%S')
df_labels["start_datetime"] = pd.to_datetime(df_labels["start_datetime"], format='%d/%m/%y %H:%M:%S')
df_labels["end_datetime"] = pd.to_datetime(df_labels["end_datetime"], format='%d/%m/%y %H:%M:%S')
Check the dtypes:
df.dtypes
df_labels.dtypes
After a correct conversion, the datetime columns should show datetime64[ns].
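Even when every datetime column is already a datetime dtype, the loop in the question can still raise this error. `df.loc[mask] = value` with no column selector assigns the value to every column of the matching rows, including `datetime`, so after the first match that column is likely to hold strings. A minimal sketch of the probable fix, restricting the assignment to the `mode` column (toy data, illustrative values):

```python
import pandas as pd

df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2007-06-26 11:35:00",
        "2007-06-26 12:00:00",
        "2008-03-28 15:00:00",
    ]),
})
df["mode"] = None  # empty label column to be filled in

df_labels = pd.DataFrame({
    "start_datetime": pd.to_datetime(["2007-06-26 11:32:29", "2008-03-28 14:52:54"]),
    "end_datetime": pd.to_datetime(["2007-06-26 11:40:29", "2008-03-28 15:59:59"]),
    "mode": ["bus", "train"],
})

for _, row in df_labels.iterrows():
    in_range = (df["datetime"] >= row["start_datetime"]) & (df["datetime"] < row["end_datetime"])
    # Select the "mode" column explicitly so the other columns stay untouched.
    df.loc[in_range, "mode"] = row["mode"]
```

With the column named in `.loc`, `df["datetime"]` keeps its datetime dtype across iterations, so later comparisons against `Timestamp` values keep working.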
Addendum (efficiency):
import sqlite3
import pandas as pd
conn = sqlite3.connect(':memory:')
# Write both tables to an in-memory SQLite database
df.to_sql('df', conn, index=False)
df_labels.to_sql('df_label', conn, index=False)
qry = '''
    -- BETWEEN is inclusive on both ends, unlike the >= / < test in the loop
    select *
    from df
    inner join
    (select mode df_label_mode, start_datetime, end_datetime from df_label) df_label
    on (df.datetime between df_label.start_datetime and df_label.end_datetime)
'''
df_x = pd.read_sql_query(qry, conn)
df_x.head()
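As a pandas-only alternative to the SQL join, `pd.merge_asof` can match each timestamp to the most recent interval start, after which labels for rows past the interval end are blanked out. This is a sketch that assumes the label intervals do not overlap and that `df` has no `mode` column of its own yet (drop it first if it does); the toy values are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2007-06-26 11:35:00",
        "2007-06-26 12:00:00",
        "2008-03-28 15:00:00",
    ]),
})
df_labels = pd.DataFrame({
    "start_datetime": pd.to_datetime(["2007-06-26 11:32:29", "2008-03-28 14:52:54"]),
    "end_datetime": pd.to_datetime(["2007-06-26 11:40:29", "2008-03-28 15:59:59"]),
    "mode": ["bus", "train"],
})

# merge_asof requires both frames to be sorted on their keys.
# The default direction="backward" picks the last start_datetime <= datetime.
merged = pd.merge_asof(
    df.sort_values("datetime"),
    df_labels.sort_values("start_datetime"),
    left_on="datetime",
    right_on="start_datetime",
)

# Keep the label only where the timestamp falls before the interval end.
merged["mode"] = merged["mode"].where(merged["datetime"] < merged["end_datetime"])
```

This avoids both the Python-level loop and the round trip through SQLite; rows that fall outside every interval end up with a missing `mode`.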
Reference: Converting date columns