Why does date_range give a result different from indexing [] for DataFrame Pandas dates?
Here is some simple code that uses date_range and [ ] indexing with Pandas:
period_start = '2013-01-01'
period_end = '2019-12-24'
print(pd.DataFrame({'close': aapl_close,
                    'returns': aapl_returns},
                   index=pd.date_range(start=period_start, periods=6)))
print(pd.DataFrame({'close': aapl_close,
                    'returns': aapl_returns})[period_start:'20130110'])
date_range gives NaN results:
close returns
2013-01-01 NaN NaN
2013-01-02 NaN NaN
2013-01-03 NaN NaN
2013-01-04 NaN NaN
Indexing gives correct results:
close returns
2013-01-02 00:00:00+00:00 68.732 0.028322
2013-01-03 00:00:00+00:00 68.032 -0.010184
2013-01-04 00:00:00+00:00 66.091 -0.028531
Based on how the dates are shown by date_range, I suppose the date format of date_range does not match the date format in the Pandas DataFrame.
1) Can you please explain why it gives NaN?
2) What would you suggest for getting a specific time range from a Pandas DataFrame?
As I'm a beginner in Python and its libraries, I didn't realize that this question concerns the Quantopian library rather than Pandas.
I got a solution on their forum. All the times returned by methods on Quantopian are timezone-aware, with a timezone of 'UTC'. By default, the date_range method returns timezone-naive dates. A naive index label never matches a UTC-aware label in the data, so every row is filled with NaN. Simply supply the timezone information to the date_range method, like this:
pd.DataFrame({
    'close': aapl_close,
    'returns': aapl_returns},
    index=pd.date_range(start=period_start, periods=6, tz='UTC'))
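The mismatch can be reproduced without Quantopian. The following is only a minimal sketch: the `close` series is a hypothetical stand-in for `aapl_close`, built by hand with a UTC-aware index and the three closing prices shown in the question. It assumes that plain pandas aligns the series to the supplied index the same way the Quantopian environment does.

```python
import pandas as pd

# Hypothetical stand-in for aapl_close: a Series with a UTC-aware index.
utc_index = pd.date_range(start="2013-01-02", periods=3, tz="UTC")
close = pd.Series([68.732, 68.032, 66.091], index=utc_index)

# A timezone-naive index (the date_range default) and a UTC-aware one.
naive_index = pd.date_range(start="2013-01-01", periods=6)
aware_index = pd.date_range(start="2013-01-01", periods=6, tz="UTC")

# Passing index= makes pandas align the Series to those labels.
naive_frame = pd.DataFrame({"close": close}, index=naive_index)
aware_frame = pd.DataFrame({"close": close}, index=aware_index)

print(close.index.tz)   # timezone of the data; None would mean naive
print(naive_frame)      # no label matches, so every value is NaN
print(aware_frame)      # labels match wherever the data has a row
```

Checking `index.tz` on the data before building a matching `date_range` is a quick way to spot this kind of mismatch.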
To get a specific date or time range in pandas, perhaps the easiest way is simple bracket notation. For example, to get dates between 2013-01-04 and 2013-01-08 (inclusive), simply enter this:
df = pd.DataFrame({'close': aapl_close, 'returns': aapl_returns})
my_selected_dates = df['2013-01-04':'2013-01-08']
This bracket notation is really shorthand for using the loc method:
my_selected_dates = df.loc['2013-01-04':'2013-01-08']
Both work the same, but the loc method has a bit more flexibility. This notation also works with datetimes if desired.
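As a rough illustration of slicing with datetimes instead of strings, here is a small sketch on a made-up frame. The `close` values are placeholders, not real prices, and the UTC-aware index is an assumption chosen to mirror the Quantopian data. The Timestamp bounds carry the same timezone as the index so that both sides of the comparison are timezone-aware.

```python
import pandas as pd

# Made-up data on a UTC-aware daily index (placeholder values, not prices).
index = pd.date_range(start="2013-01-01", periods=10, tz="UTC")
df = pd.DataFrame({"close": range(10)}, index=index)

# String bounds: both endpoints are included in the result.
by_string = df.loc["2013-01-04":"2013-01-08"]

# Timestamp bounds, given the same timezone as the index.
start = pd.Timestamp("2013-01-04", tz="UTC")
end = pd.Timestamp("2013-01-08", tz="UTC")
by_timestamp = df.loc[start:end]

print(by_string)
print(by_timestamp)
```

Both selections should cover the same five days, 2013-01-04 through 2013-01-08.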