
Pythonic way to filter a DataFrame based on dates

In [90]: list_dates = [datetime.date(2014,2,2),datetime.date(2015,2,2), datetime.date(2013,4,5)]

In [91]: df = DataFrame(list_dates, columns=['Date'])

In [92]: df
Out[92]: 
         Date
0  2014-02-02
1  2015-02-02
2  2013-04-05

Now I want to get a new DataFrame that contains only the dates from 2014 and 2013:

In [93]: result = DataFrame([date for date in df.Date if date.year in (2014,2013)])

In [94]: result
Out[94]: 
            0
0  2014-02-02
1  2013-04-05

That works and gives me the DataFrame I want. Why doesn't the following work:

In [95]: result1 = df[df.Date.map(lambda x: x.year) == 2014 or p.Date.map(lambda x: x.year) == 2013]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-95-86f01906c89b> in <module>()
----> 1 result1 = df[df.Date.map(lambda x: x.year) == 2014 or p.Date.map(lambda x: x.year) == 2013]

/home/marcos/anaconda/lib/python2.7/site-packages/pandas/core/generic.pyc in __nonzero__(self)
    690         raise ValueError("The truth value of a {0} is ambiguous. "
    691                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
--> 692                          .format(self.__class__.__name__))
    693 
    694     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Or the following:

In [96]: df['year'] = df.Date.map(lambda x: x.year)

In [97]: result2 = df[df.year in (2014, 2013)]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-97-814358a4edff> in <module>()
----> 1 result2 = df[df.year in (2014, 2013)]

/home/marcos/anaconda/lib/python2.7/site-packages/pandas/core/generic.pyc in __nonzero__(self)
    690         raise ValueError("The truth value of a {0} is ambiguous. "
    691                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
--> 692                          .format(self.__class__.__name__))
    693 
    694     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

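I think the problem is that when I use the `in` operator, I am checking whether the whole Series is in the tuple. How can I make the evaluation element-wise so that I get the desired result?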
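A minimal sketch of why both attempts raise, assuming a recent pandas imported as `pd` (the names `years`, `mask`, and `raised` are illustrative and not from the original post). Python's `or` and `in` each need a single True/False value from the Series, which pandas refuses to provide; the `|` operator instead combines two boolean Series element by element.

```python
import datetime
import pandas as pd

df = pd.DataFrame({"Date": [datetime.date(2014, 2, 2),
                            datetime.date(2015, 2, 2),
                            datetime.date(2013, 4, 5)]})

# One year per row, extracted from the plain datetime.date objects.
years = df["Date"].map(lambda d: d.year)

# `or` asks for the truth value of the whole left-hand Series,
# so pandas raises ValueError before the right-hand side is even evaluated.
try:
    years == 2014 or years == 2013
    raised = False
except ValueError:
    raised = True

# `|` is the element-wise "or". The parentheses are required because
# `|` binds more tightly than `==`.
mask = (years == 2014) | (years == 2013)
result1 = df[mask]
print(result1)
```

The mask is an ordinary boolean Series, so it can be inspected on its own before it is used to index the DataFrame.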

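I'd convert the dates to datetime objects with `to_datetime`, which lets you read the year through the `dt` accessor. We can then call `isin` with the list of years of interest to filter the df: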

In [68]: df['Date'] = pd.to_datetime(df['Date'])

In [69]: df[df['Date'].dt.year.isin([2013,2014])]
Out[69]: 
        Date
0 2014-02-02
2 2013-04-05
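For reference, the same approach as a self-contained script, sketched under the assumption that pandas is imported as `pd`. The helper name `filter_by_years` is hypothetical and only wraps the two steps described in the answer.

```python
import datetime
import pandas as pd


def filter_by_years(frame, column, years):
    """Hypothetical helper: keep the rows whose date falls in one of `years`."""
    # Convert to datetime64 so that the .dt accessor is available.
    dates = pd.to_datetime(frame[column])
    # isin() tests each row's year against the list, element by element.
    return frame[dates.dt.year.isin(years)]


list_dates = [datetime.date(2014, 2, 2),
              datetime.date(2015, 2, 2),
              datetime.date(2013, 4, 5)]
df = pd.DataFrame(list_dates, columns=["Date"])

result = filter_by_years(df, "Date", [2013, 2014])
print(result)
```

Unlike the answer's snippet, this sketch leaves the `Date` column of `df` unconverted and only uses the converted values to build the mask. The original row labels are kept in the result; `result.reset_index(drop=True)` would renumber them from 0 if that is preferred.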

