numpy datetime and pandas datetime

I'm confused by the interoperation between numpy and pandas date objects (or maybe just by numpy's datetime64 in general).

I was trying to count business days using numpy's built-in functionality like so:

np.busday_count("2016-03-01", "2016-03-31", holidays=[np.datetime64("28/03/2016")])

However, numpy apparently can't deal with the inverted (day-first) date format:

ValueError: Error parsing datetime string "28/03/2016" at position 2

To get around this, I thought I'd just use pandas' to_datetime, which can parse that format. However:

np.busday_count("2016-03-01", "2016-03-31", holidays=[np.datetime64(pd.to_datetime("28/03/2016"))])

ValueError: Cannot safely convert provided holidays input into an array of dates

After searching around for a bit, it seemed that this was caused by the fact that chaining to_datetime and np.datetime64 results in a datetime64[us] object, which the busday_count function apparently cannot accept (is this intended behaviour or a bug?). Thus, my next attempt was:

np.busday_count("2016-03-01", "2016-03-31", holidays=[np.datetime64(pd.Timestamp("28"), "D")])

But:

TypeError: Cannot cast datetime.datetime object from metadata [us] to [D] according to the rule 'same_kind'

And that's where I'm stuck. Why are there so many incompatibilities between all these datetime formats? And how can I get around them?

I've been having a similar issue using np.is_busday().

The unit of the datetime64 dtype is vital to get right. According to the numpy datetime docs, you can specify the unit explicitly as D (days).

This works:

my_holidays=np.array([datetime.datetime.strptime(x,'%m/%d/%y') for x in holidays.Date.values], dtype='datetime64[D]')

day_flags['business_day'] = np.is_busday(days,holidays=my_holidays)

Whereas this throws the same error you got:

my_holidays=np.array([datetime.datetime.strptime(x,'%m/%d/%y') for x in holidays.Date.values], dtype='datetime64')

The only difference is whether the unit of datetime64 is specified:

dtype='datetime64[D]'

vs

dtype='datetime64'

Docs are here:

https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.datetime.html
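To make the difference concrete, here is a minimal sketch of the same idea with made-up data. The holiday strings and the `days` range are hypothetical, not taken from the code above. It casts to day resolution explicitly with astype, which I'd expect to behave the same across numpy versions:

```python
import datetime
import numpy as np

# Hypothetical holiday strings in month/day/year form
raw = ["03/25/16", "03/28/16"]
parsed = [datetime.datetime.strptime(x, "%m/%d/%y") for x in raw]

# With no unit given, numpy infers one from the datetime objects
# (typically microseconds, i.e. datetime64[us])
generic = np.array(parsed, dtype="datetime64")
print(generic.dtype)

# Cast explicitly to day resolution, which the busday functions expect
day_unit = generic.astype("datetime64[D]")
print(day_unit.dtype)

# Thursday 2016-03-24 through Tuesday 2016-03-29
days = np.arange("2016-03-24", "2016-03-30", dtype="datetime64[D]")
flags = np.is_busday(days, holidays=day_unit)
print(flags)

# Passing the finer-than-day array is what triggers the error discussed above
try:
    np.is_busday(days, holidays=generic)
except (ValueError, TypeError) as exc:
    print("holidays rejected:", exc)
```

With these dates only the Thursday and the Tuesday come out as business days, since the Friday and the Monday are listed as holidays and the two days in between are a weekend.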

I had the same issue while using np.busday_count. Later I figured out that the problem was the hours, minutes, seconds, and milliseconds that get added when converting to a datetime object or a numpy datetime object.

I just converted to a date object, which holds only the date and not the hours, minutes, seconds, and milliseconds.
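Applied to the dates from the question, the same trick might look like the sketch below. It assumes pandas is installed and uses dayfirst=True to parse the day-first string. Taking .date() of the Timestamp drops the time component, so numpy gets a plain day-resolution value:

```python
import numpy as np
import pandas as pd

# Parse the day-first string with pandas, then keep only the date part
ts = pd.to_datetime("28/03/2016", dayfirst=True)
holiday = np.datetime64(ts.date())  # datetime.date -> datetime64[D]
print(holiday.dtype)

# Business days from 2016-03-01 up to (not including) 2016-03-31,
# with Monday 2016-03-28 excluded as a holiday
count = np.busday_count("2016-03-01", "2016-03-31", holidays=[holiday])
print(count)  # 21
```

Without the holiday the count would be 22, so the result confirms that the holiday was actually picked up.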

The following was my code:

holidays_list.json file:

{
    "holidays_2019": [
        "04-Mar-2019",
        "21-Mar-2019",
        "17-Apr-2019",
        "19-Apr-2019",
        "29-Apr-2019",
        "01-May-2019",
        "05-Jun-2019",
        "12-Aug-2019",
        "15-Aug-2019",
        "02-Sep-2019",
        "10-Sep-2019",
        "02-Oct-2019",
        "08-Oct-2019",
        "28-Oct-2019",
        "12-Nov-2019",
        "25-Dec-2019"
    ],
    "format": "%d-%b-%Y"
}

Code file:

import json
import datetime
import numpy as np

# Load the holiday strings and their strptime format from the JSON file
with open('holidays_list.json', 'r') as infile:
    data = json.load(infile)

# Parse each string and keep only the date part (no hours, minutes, seconds)
holidays = [datetime.datetime.strptime(x, data['format']).date()
            for x in data['holidays_2019']]

start_date = datetime.datetime.today().date()
end_date = start_date + datetime.timedelta(days=30)
# Count business days in [start_date, end_date), skipping the holidays above
print(np.busday_count(start_date, end_date, holidays=holidays))
