Find the index of first occurrence in DataFrame

I have a DataFrame which looks like this:

     0     1     2     3     4     5     6  
0    a(A)  b     c     c     d     a     a
1    b     h     w     k     d     c(A)  k
2    g     e(A)  s     g     h     s     f
3    f     d     s     h(A)  c     w     n
4    e     g     s     b     c     e     w

For each column, I want to get the row index of the cell that contains (A):

0   0
1   2
2  NaN
3   3
4  NaN
5   1 
6  NaN

I tried this code, but the result is not what I expected.

df.apply(lambda x: (x.str.contains(r'(A)')==True).idxmax(), axis=0)

The result looks like this:

0   0
1   2
2   0
3   3
4   0
5   1 
6   0

I think it returns the first index if there is no (A) in that column.

How should I fix it?
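
The guess above is correct: on a boolean Series with no True values, every entry ties at False, so idxmax returns the first label. Separately, the pattern r'(A)' is read as a regular expression capture group that matches any "A", not the literal text "(A)". A minimal sketch of both points, rebuilding the frame from the question (the names rows, no_match, first_label and has_match are illustrative only) and passing regex=False to match the literal string:

```python
import pandas as pd

# Rebuild the example frame from the question.
rows = [
    ["a(A)", "b", "c", "c", "d", "a", "a"],
    ["b", "h", "w", "k", "d", "c(A)", "k"],
    ["g", "e(A)", "s", "g", "h", "s", "f"],
    ["f", "d", "s", "h(A)", "c", "w", "n"],
    ["e", "g", "s", "b", "c", "e", "w"],
]
df = pd.DataFrame(rows)

# Column 2 has no "(A)", so the mask is all False.
# regex=False makes the parentheses literal characters.
no_match = df[2].str.contains("(A)", regex=False)

# idxmax returns the label of the first maximum; when every value is
# False, the first maximum is simply the first row.
first_label = no_match.idxmax()
has_match = bool(no_match.any())
print(first_label, has_match)
```

So the 0 in columns 2, 4 and 6 does not mean "found in row 0"; it has to be told apart from a real match with a separate any() check.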

Use Series.where to replace the default 0 that DataFrame.idxmax returns for columns without a match with a missing value:

mask = df.apply(lambda x: x.str.contains('A'))
s1 = mask.idxmax().where(mask.any())
print (s1)
0    0.0
1    2.0
2    NaN
3    3.0
4    NaN
5    1.0
6    NaN
dtype: float64
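
For reference, the same idea as a self-contained sketch. The helper name first_match_index and the choice of regex=False (to match the literal "(A)" rather than any "A") are assumptions added here, not part of the answer above:

```python
import pandas as pd

def first_match_index(frame: pd.DataFrame, needle: str) -> pd.Series:
    """Per column, the first row label whose cell contains `needle`, else NaN."""
    # Boolean frame: True where the cell contains the literal substring.
    mask = frame.apply(lambda col: col.str.contains(needle, regex=False))
    # idxmax yields the first True per column (or the first row if none);
    # where() keeps it only for columns that actually have a True.
    return mask.idxmax().where(mask.any())

rows = [
    ["a(A)", "b", "c", "c", "d", "a", "a"],
    ["b", "h", "w", "k", "d", "c(A)", "k"],
    ["g", "e(A)", "s", "g", "h", "s", "f"],
    ["f", "d", "s", "h(A)", "c", "w", "n"],
    ["e", "g", "s", "b", "c", "e", "w"],
]
df = pd.DataFrame(rows)

result = first_match_index(df, "(A)")
# Optional: nullable integers instead of floats, with <NA> for no match.
as_int = result.astype("Int64")
print(result)
print(as_int)
```

The result is float because NaN forces an upcast; the nullable Int64 conversion is one way to keep integer row labels while still marking columns without a match.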

You could keep doing what you're doing, but explicitly check whether each column contains any match:

In [51]: pred = df.applymap(lambda x: '(A)' in x)

In [52]: pred.idxmax() * np.where(pred.any(), 1, np.nan)
Out[52]:
0    0.0
1    2.0
2    NaN
3    3.0
4    NaN
5    1.0
6    NaN
dtype: float64
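
One caveat with the multiplication trick: it relies on the row labels being numeric, since a label is multiplied by 1 or NaN. With string labels that multiplication would likely raise a TypeError. A sketch under the assumption of hypothetical string row labels "v" to "z" (not in the question), using where to select instead of multiply:

```python
import pandas as pd

rows = [
    ["a(A)", "b", "c", "c", "d", "a", "a"],
    ["b", "h", "w", "k", "d", "c(A)", "k"],
    ["g", "e(A)", "s", "g", "h", "s", "f"],
    ["f", "d", "s", "h(A)", "c", "w", "n"],
    ["e", "g", "s", "b", "c", "e", "w"],
]
# Hypothetical string row labels, used only to illustrate the caveat.
df = pd.DataFrame(rows, index=list("vwxyz"))

# Element-wise literal substring test, written with apply + map.
pred = df.apply(lambda col: col.map(lambda cell: "(A)" in cell))

# Keep the first matching label where the column has a match;
# columns without any match become missing.
result = pred.idxmax().where(pred.any())
print(result)
```

Selecting with where leaves the labels untouched, so it does not depend on the index type.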

Or alternatively, use DataFrame.where directly:

In [211]: pred.where(pred).idxmax()
Out[211]:
0    0.0
1    2.0
2    NaN
3    3.0
4    NaN
5    1.0
6    NaN
dtype: float64

A slightly cheatier one-liner is to pass the identity function to DataFrame.where:

In [78]: df.apply(lambda x: x.str.contains('A')).where(lambda x: x).idxmax()
Out[78]:
0    0.0
1    2.0
2    NaN
3    3.0
4    NaN
5    1.0
6    NaN

Add an if condition at the end of the lambda passed to apply:

>>> df.apply(lambda x: x.str.contains('A').idxmax() if 'A' in x[x.str.contains('A').idxmax()] else np.nan)
0    0.0
1    2.0
2    NaN
3    3.0
4    NaN
5    1.0
6    NaN
dtype: float64
