
Handling Pandas dataframe columns with mixed date formats

I have imported a CSV file with mixed date formats: some date formats recognized by read_csv, plus some values in Excel serial-datetime format (e.g. 41,866.321).

Once the data is imported, the column dtype is shown as object (because of the different kinds of data), and the dates in both formats are stored as strings.

I would like to use the to_datetime method to convert the recognized date strings into datetimes in the dataframe column, leaving the unrecognized Excel-format strings untouched so that I can isolate them and correct them offline. But unless I apply the method row by row (which is far too slow), it fails to do this.

Does anyone have a cleverer way of solving this?

Update: having tinkered around some more, I have found this solution: use coerce=True to force the column conversion, then identify the resulting null values, which I can cross-reference back to the original file. But if there is a better way to do this (e.g. fixing the unrecognized timestamps in place), please let me know.

df1['DateTime']=pd.to_datetime(df1['Time_Date'],coerce=True)  # older pandas; later releases spell this errors='coerce'
nulls=df1['Time_Date'][df1['DateTime'].isnull()]  # original strings that failed to parse

Having tinkered around some more, I have found this solution: use errors='coerce' to force the column conversion, so that any string that cannot be parsed becomes NaT, then identify those null values, which I can cross-reference back to the original file. But if there is a better way to do this (e.g. fixing the unrecognized timestamps in place), please let me know.

import pandas as pd
df1['DateTime'] = pd.to_datetime(df1['Time_Date'], errors='coerce')  # unparseable strings become NaT
nulls = df1['Time_Date'][df1['DateTime'].isnull()]  # original strings that failed to parse
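
On fixing the unrecognized timestamps in place, here is one possible sketch rather than a definitive solution. It assumes the Excel values use the default 1900 date system (so day 0 corresponds to 1899-12-30), that the commas are thousands separators, and that no genuine date string in the column is purely numeric. The sample data and the names raw, serial, is_serial, from_serial and unresolved are made up for illustration.

```python
import pandas as pd

# Hypothetical sample: two ordinary date strings, one Excel serial
# written with a thousands separator, and one value nothing can repair.
df1 = pd.DataFrame({
    "Time_Date": [
        "2014-08-15 07:42:00",
        "2014-08-16 12:00:00",
        "41,866.321",
        "not a date",
    ]
})

raw = df1["Time_Date"].astype(str)

# Values that become numbers once the commas are stripped are treated
# as Excel serials (assumption: real date strings are never all-numeric).
serial = pd.to_numeric(raw.str.replace(",", "", regex=False), errors="coerce")
is_serial = serial.notna()

# Parse the remaining strings; anything unparseable becomes NaT.
parsed = pd.to_datetime(raw.where(~is_serial), errors="coerce")

# Excel serials count days from 1899-12-30 (assumed 1900 date system).
from_serial = pd.to_datetime(serial[is_serial], unit="D", origin="1899-12-30")

# Fill the gaps left by the string parser with the converted serials.
df1["DateTime"] = parsed.fillna(from_serial)

# Whatever is still NaT needs manual correction.
unresolved = df1.loc[df1["DateTime"].isna(), "Time_Date"]
```

With this sample, 41,866.321 lands on 2014-08-15 at about 07:42, and only "not a date" is left in unresolved. All steps are vectorized, so no row-by-row apply is needed.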

