Pandas: Fixing datetime.time and datetime.datetime mix
I have the following DataFrame, with a 'Time' column containing mixed datetime types:
time_series_slice = tmp_df['XXX']
time_series_slice['Time types'] = time_series_slice['Time'].apply(lambda row: type(row))
time_series_slice['Time types'].value_counts()
<class 'datetime.datetime'> 97367
<class 'datetime.time'> 25
Name: Time types, dtype: int64
I am having a problem converting the whole 'Time' column to Pandas datetime with the pd.to_datetime() method because of:
TypeError: <class 'datetime.time'> is not convertible to datetime
The approach time_series_slice['Time'].apply(lambda x: pd.Timestamp(x)) does not work either:
TypeError: Cannot convert input [00:00:00] of type <class 'datetime.time'> to Timestamp
I figure that these 25 stupid rows are giving me this headache, but I lack ideas on what to do with them.
Firstly, how do I force Pandas to display only these rows? time_series_slice[isinstance(time_series_slice['Time'], datetime.time)] gives me:
NameError: name 'datetime' is not defined
Secondly, how do I just convert all these values to Pandas datetime and move on? :(
UPDATE:
Adding sample data view:
0 2017-02-08 22:19:08.618000
1 2017-02-08 22:19:12.187000
2 2017-02-08 22:19:13.481000
3 2017-02-08 22:19:16.330000
4 2017-02-08 22:19:16.582000
Name: Time, dtype: object
UPDATE 2: Thanks to Wen-Ben's suggestion, I have isolated the datetime.time rows, and they look like this:
time_series_slice['Time types'] = time_series_slice['Time'].apply(lambda row: type(row).__name__)
time_series_slice[time_series_slice['Time types'] == 'time']['Time']
96367 00:00:00
96368 00:00:00
96464 00:00:00
96465 00:00:00
96466 00:00:00
96467 00:00:00
96593 00:00:00
96862 00:00:00
Name: Time, dtype: object
Would the easiest way be to rewrite them as datetime.datetime objects with all 0s?
If you want to slice out those datetime.time rows:
time_series_slice['Time types'] = time_series_slice['Time'].apply(lambda x: type(x).__name__)
time_series_slice['Time types'].value_counts()
# type(x).__name__ is 'time' for datetime.time values
time_series_slice[time_series_slice['Time types'] == 'time']
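As an alternative to comparing type names as strings, a boolean mask built with isinstance may be a little more robust. The attempt in the question failed for two reasons: the datetime module had not been imported (hence the NameError), and isinstance was applied to the whole Series instead of to each element. The following is only a sketch; the small frame is made-up sample data standing in for time_series_slice, not the asker's actual data.

```python
import datetime

import pandas as pd

# Hypothetical sample data; dtype=object keeps the mixed Python objects as-is
time_series_slice = pd.DataFrame({
    'Time': pd.Series(
        [
            datetime.datetime(2017, 2, 8, 22, 19, 8, 618000),
            datetime.time(0, 0, 0),
            datetime.datetime(2017, 2, 8, 22, 19, 12, 187000),
            datetime.time(0, 0, 0),
        ],
        dtype=object,
    )
})

# Element-wise check: True where the value is a bare datetime.time.
# datetime.datetime is not a subclass of datetime.time, so it yields False.
is_time = time_series_slice['Time'].apply(lambda x: isinstance(x, datetime.time))

# Display only the offending rows
time_rows = time_series_slice[is_time]
print(time_rows)
```

The same mask can be inverted with ~is_time to select only the rows that already hold full datetimes.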
Then use to_datetime to convert:
time_series_slice['Time']=pd.to_datetime(time_series_slice['Time'].astype(str))
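The string round-trip above depends on how to_datetime parses a bare '00:00:00' string, and that behaviour may differ between pandas versions. A more explicit sketch is to handle the datetime.time values yourself before converting. The helper to_datetime_or_nat and its fallback_date parameter are hypothetical names introduced for illustration, and the sample data is made up. Whether the bare times should become NaT or be attached to some known date is an assumption you have to make about your own data.

```python
import datetime

import pandas as pd

# Hypothetical sample data; dtype=object keeps the mixed Python objects as-is
time_series_slice = pd.DataFrame({
    'Time': pd.Series(
        [
            datetime.datetime(2017, 2, 8, 22, 19, 8, 618000),
            datetime.time(0, 0, 0),
            datetime.datetime(2017, 2, 8, 22, 19, 13, 481000),
        ],
        dtype=object,
    )
})


def to_datetime_or_nat(value, fallback_date=None):
    """Leave full datetimes alone; turn a bare datetime.time into NaT,
    or into a datetime on fallback_date if one is given."""
    if isinstance(value, datetime.time):
        if fallback_date is None:
            return pd.NaT
        return datetime.datetime.combine(fallback_date, value)
    return value


# Option 1: mark the bare times as missing
converted = pd.to_datetime(time_series_slice['Time'].apply(to_datetime_or_nat))

# Option 2: attach an assumed date (here 2017-02-08) to the bare times
assumed_date = datetime.date(2017, 2, 8)
with_date = pd.to_datetime(
    time_series_slice['Time'].apply(lambda v: to_datetime_or_nat(v, assumed_date))
)
```

Option 1 keeps the missing date visible as NaT, so the affected rows can still be dropped or filled later; option 2 gives every row a timestamp at the cost of inventing a date.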