Pandas: find idxmax() within a range

Question

I have this time series df:

                    Current
2018-09-01 00:00      -0.01
2018-09-01 00:01      -0.03
2018-09-01 00:02      -0.01
2018-09-01 00:03       0.03
2018-09-01 00:04      -0.02
2018-09-01 00:05      -0.04
2018-09-01 00:06       0.05

I am trying to find the first instance where the Current value is > 0.01. If I use

findValue = (df['Current'] > 0.01).idxmax()

I get back:

2018-09-01 00:03       0.03

However, I want to ignore the first 5 rows, so the returned value should be

 2018-09-01 00:06       0.05

I tried using shift():

findValue = (df['Current'] > 0.01).shift(5).idxmax()

But this doesn't seem right...

Answer 1

You can use iloc to select, by position, all rows except the first 5:

N = 5
findValue = (df['Current'].iloc[N:] > 0.01).idxmax()
print (findValue)
2018-09-01 00:06
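
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The construction of df is my reconstruction of the sample data, and it assumes the index is a DatetimeIndex at one-minute steps, which the question does not state explicitly.

```python
import pandas as pd

# Reconstruction of the question's sample data (assumed DatetimeIndex).
idx = pd.date_range("2018-09-01 00:00", periods=7, freq="min")
df = pd.DataFrame(
    {"Current": [-0.01, -0.03, -0.01, 0.03, -0.02, -0.04, 0.05]}, index=idx
)

N = 5  # number of leading rows to skip

# Without skipping anything, the first match is the row at 00:03.
first_any = (df["Current"] > 0.01).idxmax()

# Slicing by position keeps the original labels, so idxmax on the
# sliced mask still returns a label taken from df.index.
first_after_n = (df["Current"].iloc[N:] > 0.01).idxmax()

print(first_any)
print(first_after_n)
```

Because iloc only drops rows and never relabels them, there is no need to add N back to the result afterwards.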

Another idea is to create a second boolean mask from np.arange and the length of df, and chain the two masks with &:

import numpy as np

m1 = df['Current'] > 0.01
m2 = np.arange(len(df)) >= 5
findValue = (m1 & m2).idxmax()
print (findValue)
2018-09-01 00:06

If you need to select by the values of the DatetimeIndex instead:

findValue = (df['Current'].loc['2018-09-01 00:05':] > 0.01).idxmax()
print (findValue)
2018-09-01 00:06:00

m1 = df['Current'] > 0.01
m2 = df.index >= '2018-09-01 00:05'
findValue = (m1 & m2).idxmax()
print (findValue)
2018-09-01 00:06:00
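
If the cut-off should be relative to the start of the data rather than a hard-coded string, something like the following sketch may help. The five-minute offset and the names start and tail are my own choices for illustration, and df is again a reconstruction of the sample data with an assumed DatetimeIndex.

```python
import pandas as pd

idx = pd.date_range("2018-09-01 00:00", periods=7, freq="min")
df = pd.DataFrame(
    {"Current": [-0.01, -0.03, -0.01, 0.03, -0.02, -0.04, 0.05]}, index=idx
)

# Hypothetical choice: ignore the first five minutes of the recording.
start = df.index[0] + pd.Timedelta(minutes=5)

# Label-based slicing with .loc includes the start label itself,
# so the row at 00:05 is still part of the searched range.
tail = df["Current"].loc[start:]

findValue = (tail > 0.01).idxmax()
print(tail.index[0])
print(findValue)
```

Note the difference from iloc[5:]: here the cut is defined by time, so it keeps working if rows are missing or the sampling is irregular.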

But:

If nothing matches, the mask is all False and idxmax returns the label of the first row, which is itself a False value:

m1 = df['Current'] > 5.01
m2 = np.arange(len(df)) >= 5
findValue = (m1 & m2).idxmax()

print (findValue)
2018-09-01 00:00:00
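
One simple way to avoid being misled by that result is to check the mask with any() before trusting idxmax. This is only a sketch on a reconstruction of the sample data; returning None for "no match" is my own convention, not something from the question.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2018-09-01 00:00", periods=7, freq="min")
df = pd.DataFrame(
    {"Current": [-0.01, -0.03, -0.01, 0.03, -0.02, -0.04, 0.05]}, index=idx
)

# No value exceeds 5.01, so this combined mask is all False.
mask = (df["Current"] > 5.01) & (np.arange(len(df)) >= 5)

# On an all-False mask idxmax falls back to the first label.
misleading = mask.idxmax()

# Guarding with any() separates "no match" from a genuine hit.
findValue = mask.idxmax() if mask.any() else None

print(misleading)
print(findValue)
```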

A possible workaround is to use next with iter, which lets you supply a default for the no-match case:

m1 = df['Current'] > 5.01
m2 = np.arange(len(df)) >= 5
findValue = next(iter(df.index[m1 & m2]), 'no exist')

print (findValue)
no exist
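
The same idea can be wrapped in a small helper. The function name first_label_where and its default argument are hypothetical, made up for this sketch; df is a reconstruction of the sample data with an assumed DatetimeIndex.

```python
import numpy as np
import pandas as pd


def first_label_where(mask, default=None):
    """Return the first index label where mask is True, else default."""
    # Boolean-index the labels, then take the first one if there is any.
    return next(iter(mask.index[mask.to_numpy()]), default)


idx = pd.date_range("2018-09-01 00:00", periods=7, freq="min")
df = pd.DataFrame(
    {"Current": [-0.01, -0.03, -0.01, 0.03, -0.02, -0.04, 0.05]}, index=idx
)
skip = np.arange(len(df)) >= 5  # positional mask: ignore the first 5 rows

hit = first_label_where((df["Current"] > 0.01) & skip)
miss = first_label_where((df["Current"] > 5.01) & skip, default="no exist")

print(hit)
print(miss)
```

Unlike idxmax, this never returns a label for a row that does not satisfy the condition.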

If performance is important, check this nice Q/A by @jpp: "Efficiently return the index of first value satisfying condition in array".

Answer 1: accepted, 2019-01-22 09:57:50
