Subsetting a datetime DataFrame in pandas (Python)

Question

Basic question, but I keep running into issues here.

I have a df:

df:
            val
date
2012-01-01  4.2      
2012-01-02  3.7
2012-01-03  6.2
2012-01-04  1.2
2012-01-05  2.4
2012-01-06  2.3
2012-01-08  4.5

As you can see, 2012-01-07 does not exist. If I were to write:

exDate = 20120107
df.ix[str(exDate)]

I get a KeyError.

In this case, I want to change my exDate to 20120106 (the largest date below 20120107). What is the easiest way to check an index to see if a date exists and, if it does not, select the closest date before it (and then return it in YYYYmmdd format)?

Also, more generally, what is the easiest way to grab the subset of the index with dates below 20120107, for example? I seem to be doing well with ranges, but I am having a hard time selecting everything above or below a date.

Thanks.

Answer 1

To grab the sub-DataFrame with dates up to 20120107 you could use a label slice. The end point of the slice is included, which makes no difference here because 2012-01-07 is not in the index:

In [11]: df[:'2012-01-07']
Out[11]: 
            val
date           
2012-01-01  4.2
2012-01-02  3.7
2012-01-03  6.2
2012-01-04  1.2
2012-01-05  2.4
2012-01-06  2.3
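
The difference between "on or before" and "strictly before" a date can be sketched with a current pandas version as follows. This is only an illustration: the frame is rebuilt by hand from the question's data, and the names cutoff, on_or_before, strictly_before, on_or_after and strictly_after are made up for the example. It assumes the index is a sorted DatetimeIndex.

```python
import pandas as pd

# Rebuild the frame from the question; 2012-01-07 is deliberately missing.
index = pd.DatetimeIndex(
    ["2012-01-01", "2012-01-02", "2012-01-03", "2012-01-04",
     "2012-01-05", "2012-01-06", "2012-01-08"],
    name="date",
)
df = pd.DataFrame({"val": [4.2, 3.7, 6.2, 1.2, 2.4, 2.3, 4.5]}, index=index)

cutoff = pd.Timestamp("2012-01-07")  # hypothetical cutoff for this example

# Label slices on a sorted DatetimeIndex include the end point when it exists.
on_or_before = df.loc[:cutoff]
on_or_after = df.loc[cutoff:]

# Boolean masks give strict comparisons and do not need a sorted index.
strictly_before = df[df.index < cutoff]
strictly_after = df[df.index > cutoff]

print(on_or_before)
print(strictly_after)
```

Because 2012-01-07 is absent, the slice and the mask select the same rows here; they would differ by one row if the cutoff were a date that exists in the index.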

To pick the last row, use iloc (the original answer used irow, which has since been removed from pandas):

In [12]: df[:'2012-01-07'].iloc[-1]
Out[12]: 
val    2.3
Name: 2012-01-06

So the last valid date is:

In [13]: df[:'2012-01-07'].iloc[-1].name
Out[13]: '2012-01-06'
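
The question also asks for the result back in YYYYmmdd form, which the answer above does not show. A minimal sketch of one way to do that, assuming a sorted DatetimeIndex; the helper last_date_on_or_before is a made-up name for this example, not a pandas function:

```python
import pandas as pd

# Rebuild the frame from the question; 2012-01-07 is deliberately missing.
index = pd.DatetimeIndex(
    ["2012-01-01", "2012-01-02", "2012-01-03", "2012-01-04",
     "2012-01-05", "2012-01-06", "2012-01-08"],
    name="date",
)
df = pd.DataFrame({"val": [4.2, 3.7, 6.2, 1.2, 2.4, 2.3, 4.5]}, index=index)


def last_date_on_or_before(frame, ex_date):
    """Return the latest index date <= ex_date as an int such as 20120106.

    Hypothetical helper. Returns None when every date in the index is
    later than ex_date. The index must be sorted in ascending order.
    """
    target = pd.to_datetime(str(ex_date), format="%Y%m%d")
    # side="right" gives the number of index entries that are <= target.
    pos = frame.index.searchsorted(target, side="right")
    if pos == 0:
        return None
    return int(frame.index[pos - 1].strftime("%Y%m%d"))


exDate = last_date_on_or_before(df, 20120107)
print(exDate)
```

If the date exists in the index it is returned unchanged, so the same call covers both the "exists" and the "does not exist" case from the question.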

Answer 1: accepted, 2013-02-04 20:53:51
