Pandas: Fixing datetime.time and datetime.datetime mix

I have the following DataFrame, whose 'Time' column holds a mix of datetime types:

time_series_slice = tmp_df['XXX']
time_series_slice['Time types'] = time_series_slice['Time'].apply(lambda row: type(row))
time_series_slice['Time types'].value_counts()

<class 'datetime.datetime'>    97367
<class 'datetime.time'>           25
Name: Time types, dtype: int64

I cannot convert the whole 'Time' column to pandas datetime with pd.to_datetime() because of:

TypeError: <class 'datetime.time'> is not convertible to datetime

The approach time_series_slice['Time'].apply(lambda x: pd.Timestamp(x)) does not work either:

TypeError: Cannot convert input [00:00:00] of type <class 'datetime.time'> to Timestamp

I figure that these 25 stupid rows are what is giving me this headache, but I am out of ideas on what to do with them.

Firstly, how do I force pandas to display only these rows? time_series_slice[isinstance(time_series_slice['Time'], datetime.time)] gives me:

NameError: name 'datetime' is not defined

Secondly, how do I just convert all these values to pandas datetime and move on? :(

UPDATE:

Adding sample data view:

0    2017-02-08 22:19:08.618000
1    2017-02-08 22:19:12.187000
2    2017-02-08 22:19:13.481000
3    2017-02-08 22:19:16.330000
4    2017-02-08 22:19:16.582000
Name: Time, dtype: object

UPDATE 2: Thanks to Wen-Ben's suggestion, I have filtered out the datetime.time rows, and they look like this:

time_series_slice['Time types'] = time_series_slice['Time'].apply(lambda row: type(row).__name__)
time_series_slice[time_series_slice['Time types'] == 'time']['Time']

96367    00:00:00
96368    00:00:00
96464    00:00:00
96465    00:00:00
96466    00:00:00
96467    00:00:00
96593    00:00:00
96862    00:00:00
Name: Time, dtype: object

Would the easiest way be to rewrite them as datetime.datetime objects with all 0s?

If you want to slice out those datetime.time rows:

# Store each value's type name ('datetime' or 'time') as a string
time_series_slice['Time types'] = time_series_slice['Time'].apply(lambda x: type(x).__name__)

time_series_slice['Time types'].value_counts()

# type(x).__name__ is 'time' for datetime.time objects
time_series_slice[time_series_slice['Time types'] == 'time']
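As a side note on the first attempt in the question: the NameError means the datetime module was never imported, and even with the import, isinstance() applied to the whole column checks the Series object itself and returns a single bool, not a per-row mask. A minimal sketch of building the mask row by row follows; the sample values are made up for illustration and only stand in for the real 'Time' column.

```python
import datetime

import pandas as pd

# Hypothetical sample data standing in for the real 'Time' column.
time_series_slice = pd.DataFrame({
    'Time': pd.Series(
        [
            datetime.datetime(2017, 2, 8, 22, 19, 8, 618000),
            datetime.time(0, 0, 0),
            datetime.datetime(2017, 2, 8, 22, 19, 12, 187000),
        ],
        dtype=object,  # keep the raw Python objects, as in the question
    )
})

# Test each value individually; this yields one boolean per row.
is_time = time_series_slice['Time'].map(lambda v: isinstance(v, datetime.time))

# Boolean indexing keeps only the datetime.time rows.
time_only_rows = time_series_slice[is_time]
print(time_only_rows)
```

This avoids the helper 'Time types' column altogether, which may be preferable if the column is only needed for this one filter.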

Then use to_datetime to convert:

time_series_slice['Time']=pd.to_datetime(time_series_slice['Time'].astype(str))
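One caveat with the string round trip: a bare '00:00:00' string carries no date, so the parser has to supply one, and newer pandas versions may be stricter about strings in mixed formats. If that is a concern, a sketch of an alternative is to attach an explicit date to the datetime.time values before converting. The names to_timestamp and PLACEHOLDER_DATE are hypothetical, and the choice of 1970-01-01 is only an assumption; pick whatever date makes sense for the data.

```python
import datetime

import pandas as pd

# Assumed placeholder date for values that only carry a time of day.
PLACEHOLDER_DATE = datetime.date(1970, 1, 1)


def to_timestamp(value):
    """Convert a datetime.datetime or datetime.time to a pandas Timestamp."""
    if isinstance(value, datetime.datetime):
        return pd.Timestamp(value)
    if isinstance(value, datetime.time):
        # Attach the placeholder date so the value becomes a full datetime.
        return pd.Timestamp(datetime.datetime.combine(PLACEHOLDER_DATE, value))
    return pd.NaT  # anything else becomes a missing value


# Hypothetical sample data standing in for the real 'Time' column.
times = pd.Series(
    [
        datetime.datetime(2017, 2, 8, 22, 19, 8, 618000),
        datetime.time(0, 0, 0),
    ],
    dtype=object,
)

converted = pd.to_datetime(times.apply(to_timestamp))
print(converted)
```

On the real data this would be applied as time_series_slice['Time'].apply(to_timestamp). Whether a placeholder date is acceptable, or whether those rows should become NaT instead, depends on what the 00:00:00 entries actually mean in the source.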
