Sorting and slicing a DataFrame in Pandas

Question

I have a dataframe that looks like this:

    detaildate  detailquantity
0   2012-02-09  7.0
1   2011-05-27  -1.0
2   2011-05-04  -2.0
3   2012-03-19  -2.0
4   2012-03-18  -3.0

I want to first sort the dataframe above by detaildate, and then slice it from the first positive value of detailquantity through to the last row.

The resulting dataframe should look like this:

    detaildate  detailquantity
0   2012-02-09  7.0
4   2012-03-18  -3.0
3   2012-03-19  -2.0

I am trying the code below, but it ends up producing an empty dataframe and I can't figure out why:

df.sort_values(by='detaildate', inplace=True)
df = df[df[df['detailquantity'] > 0].first_valid_index():]

What is wrong with the code above?

Answer 1

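Use Series.cumsum on a boolean mask and keep the rows where the running sum is greater than 0. This also behaves correctly when every value is negative (the result is simply empty):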

df.sort_values(by='detaildate', inplace=True)

df = df[(df['detailquantity'] > 0).cumsum() > 0]
print(df)
   detaildate  detailquantity
0  2012-02-09             7.0
4  2012-03-18            -3.0
3  2012-03-19            -2.0
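
To make the cumsum step easier to follow, here is a minimal sketch that spells out the intermediate values. The frame is rebuilt from the sample in the question. Converting the dates with pd.to_datetime is an assumption about how they are stored, and the names df_sorted, is_positive, seen_positive and all_negative are only for illustration.

```python
import pandas as pd

# Sample data copied from the question; parsing the dates is an assumption.
df = pd.DataFrame({
    "detaildate": pd.to_datetime(
        ["2012-02-09", "2011-05-27", "2011-05-04", "2012-03-19", "2012-03-18"]
    ),
    "detailquantity": [7.0, -1.0, -2.0, -2.0, -3.0],
})

df_sorted = df.sort_values(by="detaildate")

# True where the quantity is positive, in date order.
is_positive = df_sorted["detailquantity"] > 0

# Running count of positives seen so far; it stays at 0 until the first one.
seen_positive = is_positive.cumsum()

# Keep every row from the first positive value onward.
result = df_sorted[seen_positive > 0]
print(result)

# With no positive quantity the running count never leaves 0,
# so the mask is all False and the result is an empty frame.
all_negative = df_sorted.assign(detailquantity=-1.0)
empty = all_negative[(all_negative["detailquantity"] > 0).cumsum() > 0]
print(len(empty))
```

Because the mask is aligned on the index rather than on positions, this variant does not depend on the index being reset after sorting.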

Your own approach can be made to work by first creating a default unique index, but it requires at least one value to match:

df.sort_values(by='detaildate', inplace=True)
df = df.reset_index(drop=True)

df = df.loc[(df['detailquantity'] > 0).idxmax():]
print(df)
   detaildate  detailquantity
2  2012-02-09             7.0
3  2012-03-18            -3.0
4  2012-03-19            -2.0
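
A likely reason the reset matters, sketched under the assumption of a default integer index: first_valid_index() and idxmax() return an index label, while a plain df[start:] slice with integers counts positions. After sorting, labels and positions no longer agree. With the sample data the mismatch happens to keep all rows. On a larger frame where the label exceeds the number of rows, the same slice would come back empty. The sketch also shows the "at least one match" caveat: idxmax() on an all-False mask returns the first label. All variable names are illustrative.

```python
import pandas as pd

# Sample data copied from the question; parsing the dates is an assumption.
df = pd.DataFrame({
    "detaildate": pd.to_datetime(
        ["2012-02-09", "2011-05-27", "2011-05-04", "2012-03-19", "2012-03-18"]
    ),
    "detailquantity": [7.0, -1.0, -2.0, -2.0, -3.0],
})
df_sorted = df.sort_values(by="detaildate")

# The label of the first positive row is 0, but after sorting
# that row sits at position 2.
label = df_sorted[df_sorted["detailquantity"] > 0].first_valid_index()
position = df_sorted.index.get_loc(label)
print(label, position)

# Integer slicing with [] is positional, so this starts at position 0.
by_position = df_sorted[label:]
# .loc slices by label, so this starts at the row labelled 0.
by_label = df_sorted.loc[label:]
print(len(by_position), list(by_label.index))

# Caveat: with no positive value, idxmax() on the all-False mask returns
# the first label, so an unguarded slice keeps every row.
no_positive = df_sorted.assign(detailquantity=-1.0).reset_index(drop=True)
mask = no_positive["detailquantity"] > 0
start = mask.idxmax()
unguarded = no_positive.loc[start:]
# Guard with mask.any() and fall back to an empty frame.
guarded = no_positive.loc[start:] if mask.any() else no_positive.iloc[0:0]
print(len(unguarded), len(guarded))
```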

Another option, using numpy:

df.sort_values(by='detaildate', inplace=True)

df = df.iloc[(df['detailquantity'].values > 0).argmax():]
print(df)
   detaildate  detailquantity
0  2012-02-09             7.0
4  2012-03-18            -3.0
3  2012-03-19            -2.0
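
The numpy variant has the same caveat as idxmax: argmax on an all-False array returns 0, so a frame with no positive value would be returned whole. A minimal sketch of a guarded helper follows. The function name slice_from_first_positive and its parameters are hypothetical, not part of pandas or numpy.

```python
import numpy as np
import pandas as pd


def slice_from_first_positive(frame, column):
    """Return the rows from the first positive value in `column` onward.

    Works on positions, so the index does not need to be reset.
    Returns an empty frame when no value is positive.
    """
    positive = frame[column].to_numpy() > 0
    if not positive.any():
        return frame.iloc[0:0]
    # argmax gives the position of the first True in the boolean array.
    return frame.iloc[np.argmax(positive):]


# Sample data copied from the question; parsing the dates is an assumption.
df = pd.DataFrame({
    "detaildate": pd.to_datetime(
        ["2012-02-09", "2011-05-27", "2011-05-04", "2012-03-19", "2012-03-18"]
    ),
    "detailquantity": [7.0, -1.0, -2.0, -2.0, -3.0],
})
df_sorted = df.sort_values(by="detaildate")

result = slice_from_first_positive(df_sorted, "detailquantity")
print(result)

none_positive = slice_from_first_positive(
    df_sorted.assign(detailquantity=-1.0), "detailquantity"
)
print(len(none_positive))
```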

Solution 1
2 (accepted) 2019-05-15 13:53:32
