Find the index of the first element (e.g. "True") in a Series/column

Question

How do I find the index of an element (e.g. "True") in a Series or a column?

For example, I have a column in which I want to identify the first instance where an event occurs. So I write it as

Variable = df["Force"] < event

This creates a boolean Series that is False until the first instance where it becomes True. How do I then find the index of that data point?

Is there a better way?

Answer 1

Use idxmax to find the first instance of the maximum value. In this case, True is the maximum value.

df['Force'].lt(event).idxmax()

Consider the sample df:

df = pd.DataFrame(dict(Force=[5, 4, 3, 2, 1]), list('abcde'))
df

   Force
a      5
b      4
c      3
d      2
e      1

The first instance of Force being less than 3 is at index 'd'.

df['Force'].lt(3).idxmax()
'd'

Be aware that if no value of Force is less than 3, the maximum will be False and idxmax will return the first index label, even though the condition is never met.

Also consider the alternative, argmax:
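One way to guard against that corner case is to check any() before calling idxmax. This is only a minimal sketch; the helper name first_true_label is made up for illustration and is not part of pandas.

```python
import pandas as pd

def first_true_label(mask):
    """Return the label of the first True in a boolean Series, or None.

    Hypothetical helper for illustration, not a pandas API.
    """
    if not mask.any():
        return None  # no element satisfies the condition
    return mask.idxmax()

df = pd.DataFrame(dict(Force=[5, 4, 3, 2, 1]), list('abcde'))

found = first_true_label(df['Force'].lt(3))    # a value below 3 exists
missing = first_true_label(df['Force'].lt(0))  # no value below 0
unguarded = df['Force'].lt(0).idxmax()         # misleading: first label
```

Here found is 'd', missing is None, and unguarded is 'a', which is the misleading result the guard avoids.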

df.Force.lt(3).values.argmax()
3

It returns the position of the first instance of the maximal value. You can then use this to find the corresponding index value:

df.index[df.Force.lt(3).values.argmax()]
'd'

Also, in the future, argmax will be a Series method.
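As a rough sketch of what that looks like, assuming pandas 1.0 or later (where Series.argmax returns the integer position rather than the label), the .values step can be dropped:

```python
import pandas as pd

df = pd.DataFrame(dict(Force=[5, 4, 3, 2, 1]), list('abcde'))

# Assumes pandas >= 1.0: Series.argmax gives the integer position
pos = df.Force.lt(3).argmax()

# Map the position back to the index label
label = df.index[pos]
```

With the sample df, pos is 3 and label is 'd'. The same caveat applies as with idxmax: if nothing is True, position 0 is returned.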

Answer 2

You can also try first_valid_index with where.

df = pd.DataFrame([[5], [4], [3], [2], [1]], columns=["Force"])
df.Force.where(df.Force < 3).first_valid_index()
3

By default, where replaces the values that do not meet the condition with np.nan. Then we find the first valid index of the series.

Or this: select the subset of items you are interested in, here Variable == 1. Then find the first item in its index.
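To make the two steps visible, here is a small sketch that keeps the intermediate Series and also tries a condition that never matches. The variable names are chosen for illustration only.

```python
import pandas as pd

df = pd.DataFrame([[5], [4], [3], [2], [1]], columns=["Force"])

# Step 1: values failing the condition become NaN
masked = df.Force.where(df.Force < 3)

# Step 2: label of the first non-NaN entry
first = masked.first_valid_index()

# When nothing matches, every entry is NaN and the result is None
none_found = df.Force.where(df.Force < 0).first_valid_index()
```

Here masked is NaN for the first three rows, first is 3, and none_found is None, so this approach reports "no match" without extra checks.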

df = pd.DataFrame([[5], [4], [3], [2], [1]], columns=["Force"])
v = (df["Force"] < 3)
v[v == 1].index[0]

Bonus: if you need the index of the first appearance of many kinds of items, you can use drop_duplicates.

df = pd.DataFrame([["yello"], ["yello"], ["blue"], ["red"],  ["blue"], ["red"]], columns=["Force"])  
df.Force.drop_duplicates().reset_index()
    index   Force
0   0       yello
1   2       blue
2   3       red

Some more work...

df.Force.drop_duplicates().reset_index().set_index("Force").to_dict()["index"]
{'blue': 2, 'red': 3, 'yello': 0}

Answer 3

Below is a non-pandas solution which I find easy to adapt:

import pandas as pd

df = pd.DataFrame(dict(Force=[5, 4, 3, 2, 1]), list('abcde'))

next(idx for idx, x in zip(df.index, df.Force) if x < 3)  # d

It works by iterating to the first result of a generator expression.

Pandas appears to perform poorly in comparison:
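Note that next raises StopIteration when the generator yields nothing. A possible variation, sketched here with a hypothetical helper name first_label, passes a default as the second argument to next:

```python
import pandas as pd

df = pd.DataFrame(dict(Force=[5, 4, 3, 2, 1]), list('abcde'))

def first_label(series, predicate, default=None):
    """Label of the first value satisfying predicate, else default.

    Hypothetical helper for illustration only.
    """
    matches = (idx for idx, x in zip(series.index, series) if predicate(x))
    # The second argument to next() is returned if matches is empty
    return next(matches, default)

hit = first_label(df.Force, lambda x: x < 3)
miss = first_label(df.Force, lambda x: x < 0)
```

With the sample df, hit is 'd' and miss is None.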

import numpy as np

df = pd.DataFrame(dict(Force=np.random.randint(0, 100000, 100000)))

n = 99900

%timeit df['Force'].lt(n).idxmin()
# 1000 loops, best of 3: 1.57 ms per loop

%timeit df.Force.where(df.Force > n).first_valid_index()
# 100 loops, best of 3: 1.61 ms per loop

%timeit next(idx for idx, x in zip(df.index, df.Force) if x > n)
# 10000 loops, best of 3: 100 µs per loop

Answer 4

Here is an all-pandas solution that I consider a little neater than some of the other answers. It can also handle the corner case where no value of the input series satisfies the condition.

def first_index_ordered(mask):
    assert mask.index.is_monotonic_increasing
    assert mask.dtype == bool
    idx_min = mask[mask].index.min()
    return None if pd.isna(idx_min) else idx_min

col = "foo"
thr = 42
mask = df[col] < thr
idx_first = first_index_ordered(mask)

The above assumes that mask has a value-ordered, monotonically increasing index. If this is not the case, we have to do a bit more:

def first_index_unordered(mask):
    assert mask.dtype == bool
    index = mask.index
    # This creates a RangeIndex, which is monotonic
    mask = mask.reset_index(drop=True)
    idx_min = mask[mask].index.min()
    return None if pd.isna(idx_min) else index[idx_min]
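To see why the unordered case needs the extra step, consider a small made-up mask whose index is decreasing. This is only an illustrative sketch: taking the minimum label directly gives the smallest label, not the first occurrence.

```python
import pandas as pd

# Illustrative mask with a non-monotonic (decreasing) index
mask = pd.Series([False, True, True], index=[3, 2, 1])

# Minimum over the labels: smallest label, not the first True
smallest_label = mask[mask].index.min()

# Resetting to positions first, then mapping back to the original label
positions = mask.reset_index(drop=True)
first_pos = positions[positions].index.min()
first_label = mask.index[first_pos]
```

Here smallest_label is 1, while the first True actually sits at label 2, which is what first_label holds.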

Of course, we can combine both cases in one function:

def first_index_where(mask):
    if mask.index.is_monotonic_increasing:
        return first_index_ordered(mask)
    else:
        return first_index_unordered(mask)

4 answers

Answer 1
12 2018-02-06 02:03:28

Answer 2
5 2018-02-06 02:27:18

Answer 3
2 2018-02-06 18:04:43

Answer 4
0 2021-04-15 04:55:53
