
Get DataFrame rows matching dates

Let's say I have the following DataFrame:

import pandas as pd

df = pd.DataFrame({'item': ['Subway', 'Pasta', 'Chipotle'],
                   'cost': [10, 5, 9],
                   'date': ['2017-12-01', '2017-11-01', '2017-10-01']})
df['date'] = pd.to_datetime(df['date'], format='%Y-%m-%d')

I'm able to get all items in 2017-10 (only one item in this case):

print(df.set_index('date')['2017-10'])

According to the pandas documentation and this SO answer, I should be able to get all items from 2017-10 to 2017-11 (2 items in this case) with the following command, but I'm getting an empty DataFrame:

print(df.set_index('date')['2017-10':'2017-11'])

Any idea what I'm doing wrong here? (I'm using pandas version 0.21.0.)

Moreover, is there an efficient way to get all items in 2017-10 and 2017-12 (skipping 2017-11)? I've come up with the following solution, but I shouldn't have to create new columns like so:

df['month'] = df['date'].dt.month
df['year'] = df['date'].dt.year
print(df[((df.month==10) & (df.year==2017) | (df.month==12) & (df.year==2017))])

I reversed the order of the two slice bounds, like so:

import pandas as pd 

df = pd.DataFrame({'item': ['Subway', 'Pasta', 'Chipotle'],
                   'cost': [10, 5, 9],
                   'date': ['2017-12-01', '2017-11-01', '2017-10-01']})
df['date'] = pd.to_datetime(df['date'], format='%Y-%m-%d')

print(df.set_index('date')['2017-11':'2017-10'])

Your 'date' values run from high to low, so the slice has to run in the same direction. By switching the bounds I got this output:

            cost      item
date                      
2017-11-01     5     Pasta
2017-10-01     9  Chipotle

First use set_index() with a DatetimeIndex. Then you can use the indexing approach you wanted (note the sort_index() call, which puts the dates in ascending order before slicing).

df.set_index(pd.DatetimeIndex(df.date), inplace=True)

df.sort_index().loc['2017-10':'2017-11']

            cost       date      item
date                                 
2017-10-01     9 2017-10-01  Chipotle
2017-11-01     5 2017-11-01     Pasta
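As a self-contained sketch of the same idea, the version below moves 'date' into the index with set_index('date') instead of pd.DatetimeIndex(df.date), so the column is not duplicated. The variable names by_date and oct_to_nov are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'item': ['Subway', 'Pasta', 'Chipotle'],
                   'cost': [10, 5, 9],
                   'date': pd.to_datetime(['2017-12-01', '2017-11-01', '2017-10-01'])})

# Move 'date' into the index and sort it into ascending order.
by_date = df.set_index('date').sort_index()

# Partial-string slice on the sorted index: both bounds are inclusive
# and each bound covers the whole month it names.
oct_to_nov = by_date.loc['2017-10':'2017-11']
print(oct_to_nov)
```

Sorting first means the slice bounds can be written in the natural earliest-to-latest order, regardless of how the original rows were ordered.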

With respect to your second question, you can also access the month property once you have a DatetimeIndex.

df.loc[df.index.month.isin([10,12])]

            cost       date      item
date                                 
2017-12-01    10 2017-12-01    Subway
2017-10-01     9 2017-10-01  Chipotle

(For the second part, to filter by year as well, add & (df.index.year == 2017) inside the .loc brackets. The parentheses are needed because & binds more tightly than ==.)
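A minimal sketch of the combined month-and-year filter follows. The extra 'Tacos' row dated 2018-10-15 is hypothetical, added only so that the year condition has something to exclude; the names in_months, in_year and selected are also illustrative.

```python
import pandas as pd

df = pd.DataFrame({'item': ['Subway', 'Pasta', 'Chipotle', 'Tacos'],
                   'cost': [10, 5, 9, 7],
                   'date': pd.to_datetime(['2017-12-01', '2017-11-01',
                                           '2017-10-01', '2018-10-15'])})
by_date = df.set_index('date')

# Each condition is built separately, so operator precedence cannot bite.
in_months = by_date.index.month.isin([10, 12])
in_year = by_date.index.year == 2017

# Keep rows from October or December of 2017 only.
selected = by_date.loc[in_months & in_year]
print(selected)
```

Naming the two masks before combining them is one way to avoid the parenthesization issue entirely.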

An alternative approach may be to use boolean indexing.

Here you provide conditions that must be true for a row to be returned.

For your second question, this would be:

df_October_and_December = df.ix[((df['date'] >= '2017-10-01') & (df['date'] <= '2017-10-31')) | ((df['date'] >= '2017-12-01') & (df['date'] <= '2017-12-31')) ,:]

The more elegant version of what you want is:

df_October_and_December = df.ix[(df['date'].dt.month.isin([10,12])) ,:]

I tend to use .ix referencing given its flexibility, and refine to .loc or .iloc if the application allows. Note that .ix has been deprecated since pandas 0.20.0 and was removed in pandas 1.0, so on current versions use .loc in its place.
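A minimal sketch of the two boolean-indexing selections written with .loc instead of .ix. The use of Series.between for the explicit date ranges is a substitution for the chained >= and <= comparisons, and the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'item': ['Subway', 'Pasta', 'Chipotle'],
                   'cost': [10, 5, 9],
                   'date': pd.to_datetime(['2017-12-01', '2017-11-01', '2017-10-01'])})
date = df['date']

# Explicit ranges: between() is inclusive on both ends by default.
in_october = date.between(pd.Timestamp('2017-10-01'), pd.Timestamp('2017-10-31'))
in_december = date.between(pd.Timestamp('2017-12-01'), pd.Timestamp('2017-12-31'))
by_range = df.loc[in_october | in_december, :]

# Shorter form: match on the month number (this ignores the year).
by_month = df.loc[date.dt.month.isin([10, 12]), :]

print(by_range)
print(by_month)
```

The two forms agree on this data because every row falls in 2017; with multiple years present, only the explicit-range version restricts the year.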

