Count of inconsistent date formats in pandas

I have a column of type object that contains 500 rows of dates. I converted the column type to date, and I am trying to get a count of the incorrect values in order to fix them.

Here is a sample of the column. Rows 3 and 5 show examples of the wrong values:

0      2018-06-14
1      2018-11-12
2      2018-10-09
3      2018-24-08
4      2018-11-12
5      11-02-2018
6      2018-12-31

I can fix the dates if I use this code:

dirtyData['date'] = pd.to_datetime(dirtyData['date'],dayfirst=True)

But I would first like to check that the format in every row is %Y-%m-%d and get the count of the inconsistent formats, and only then change the values.

Is it possible to achieve this?

The code below will work. However, as Michael Gardner mentioned, it won't distinguish between days and months if the day is 12 or less, because such a value is still a valid date when the two fields are swapped.

import datetime
import pandas as pd

date_list = ["2018-06-14", "2018-11-12", "2018-10-09", "2018-24-08",
"2018-11-12", "11-02-2018", "2018-12-31"]

series1 = pd.Series(date_list)
print(series1)
# The code above replicates your date series.

count = 0
for item in series1:
    try:
        # Raises ValueError unless the string parses as year-month-day.
        datetime.datetime.strptime(item, "%Y-%m-%d")
    except ValueError:
        # Count every value that does not match the expected format.
        count += 1

print(count)
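
The same count can probably also be obtained without an explicit loop. The following is a minimal sketch rather than a definitive solution: it assumes the column holds strings, and the names parsed, bad_mask, bad_count and bad_rows are made up for illustration. It relies on pd.to_datetime with an explicit format and errors="coerce", which turns values that do not parse into NaT.

```python
import pandas as pd

# Same sample data as in the question.
dates = pd.Series(["2018-06-14", "2018-11-12", "2018-10-09", "2018-24-08",
                   "2018-11-12", "11-02-2018", "2018-12-31"])

# Values that do not parse as year-month-day become NaT instead of raising.
parsed = pd.to_datetime(dates, format="%Y-%m-%d", errors="coerce")

# True where parsing failed. Note: values that were already missing
# (NaN/None) would also show up here, so exclude them first if needed.
bad_mask = parsed.isna()

bad_count = int(bad_mask.sum())  # number of inconsistent rows
bad_rows = dates[bad_mask]       # the offending original strings

print(bad_count)
print(bad_rows)
```

Keeping bad_rows around makes it possible to inspect the offending values before converting the whole column. The same caveat applies as for the loop: a value such as 2018-08-06 parses fine even if the day and month were entered the wrong way round, so it is not counted.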
