
Why does date_range give a different result from [] indexing on pandas DataFrame dates?

Here is some simple code using date_range and [] indexing with pandas:

import pandas as pd

# aapl_close and aapl_returns come from Quantopian (not shown in the question)
period_start = '2013-01-01'
period_end = '2019-12-24'

print(pd.DataFrame({'close': aapl_close,
                    'returns': aapl_returns},
                   index=pd.date_range(start=period_start, periods=6)))

print(pd.DataFrame({'close': aapl_close,
                    'returns': aapl_returns})[period_start:'20130110'])

date_range gives NaN results:

            close  returns
2013-01-01    NaN      NaN
2013-01-02    NaN      NaN
2013-01-03    NaN      NaN
2013-01-04    NaN      NaN

Indexing gives correct results:

                            close   returns
2013-01-02 00:00:00+00:00  68.732  0.028322
2013-01-03 00:00:00+00:00  68.032 -0.010184
2013-01-04 00:00:00+00:00  66.091 -0.028531

Judging by how date_range displays the dates, I suppose the date format produced by date_range does not match the date format in the pandas DataFrame.

1) Can you please explain why it gives NaN?

2) What would you suggest for getting a specific time range from a pandas DataFrame?

As I'm a beginner in Python and its libraries, I didn't realize that this question relates to the Quantopian library rather than to pandas itself.

I got a solution on their forum. All the times returned by methods on Quantopian are timezone-aware, with a timezone of 'UTC'. By default, the date_range method returns timezone-naive dates, so none of its labels match the UTC labels of the data and every row comes back as NaN. Simply supply the timezone information to the date_range method, like this:

pd.DataFrame({'close': aapl_close,
              'returns': aapl_returns},
             index=pd.date_range(start=period_start, periods=6, tz='UTC'))
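To see the mismatch outside Quantopian, here is a minimal sketch that reproduces it with made-up data. The prices are invented and the UTC business-day index is only an assumption about what the Quantopian series looks like; the point is just that a tz-naive index shares no labels with a tz-aware Series, while a UTC index does.

```python
import pandas as pd

# Synthetic stand-in for the Quantopian price series (values are made up):
# six business days of prices on a UTC-aware index, starting 2013-01-02.
aware_index = pd.date_range(start="2013-01-02", periods=6, freq="B", tz="UTC")
aapl_close = pd.Series([68.7, 68.0, 66.1, 65.7, 65.9, 64.9], index=aware_index)

# A tz-naive index has no labels in common with the tz-aware Series,
# so aligning the Series to it leaves every row as NaN.
naive_index = pd.date_range(start="2013-01-01", periods=6)
naive_frame = pd.DataFrame({"close": aapl_close}, index=naive_index)

# With tz="UTC" the labels match wherever the dates coincide.
utc_index = pd.date_range(start="2013-01-01", periods=6, tz="UTC")
utc_frame = pd.DataFrame({"close": aapl_close}, index=utc_index)

print(aapl_close.index.tz, naive_index.tz)  # the mismatch is visible here
print(naive_frame)
print(utc_frame)
```

Checking `.index.tz` on the data before building a new index is a quick way to spot this kind of problem. Rows that remain NaN in the UTC version are simply dates with no data, such as a holiday or a weekend.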

To get a specific date or time range in pandas, perhaps the easiest way is simple bracket notation. For example, to get the dates between 2013-01-04 and 2013-01-08 (inclusive), simply enter this:

df = pd.DataFrame({'close': aapl_close, 'returns': aapl_returns})
my_selected_dates = df['2013-01-04':'2013-01-08']

For a row slice like this, the bracket notation is really shorthand for using .loc:

my_selected_dates = df.loc['2013-01-04':'2013-01-08']

Both work the same here, but .loc has a bit more flexibility. This notation also works with datetime objects if desired.
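As a rough illustration of slicing with datetime objects, the sketch below uses a made-up UTC-indexed frame rather than the real Quantopian data. It assumes the index is timezone-aware, so the Timestamp bounds are given tz='UTC' to match it; plain date strings are interpreted in the index's own timezone.

```python
import pandas as pd

# Synthetic UTC-indexed frame (values are made up for illustration).
index = pd.date_range(start="2013-01-02", periods=6, freq="B", tz="UTC")
df = pd.DataFrame({"close": [68.7, 68.0, 66.1, 65.7, 65.9, 64.9]}, index=index)

# String bounds: both endpoints are included in the result.
by_string = df.loc["2013-01-04":"2013-01-08"]

# Timestamp bounds carry the timezone explicitly so they match the index.
start = pd.Timestamp("2013-01-04", tz="UTC")
end = pd.Timestamp("2013-01-08", tz="UTC")
by_timestamp = df.loc[start:end]

print(by_string)
print(by_string.equals(by_timestamp))
```

Passing tz-aware bounds avoids mixing naive and aware datetimes, which is the same mismatch that caused the NaN rows in the question.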
