Find the index of first occurrence in DataFrame

Question

I have a dataframe which looks like this:

     0     1     2     3     4     5     6  
0    a(A)  b     c     c     d     a     a
1    b     h     w     k     d     c(A)  k
2    g     e(A)  s     g     h     s     f
3    f     d     s     h(A)  c     w     n
4    e     g     s     b     c     e     w

I want to get the index of the cell which contains (A) in each column.

I tried this code, but the result doesn't match what I expect.

df.apply(lambda x: (x.str.contains(r'(A)')==True).idxmax(), axis=0)

Result looks like this:

0    0
1    2
2    0
3    3
4    0
5    1
6    0
dtype: int64

I think it returns the first index if there is no (A) in that column.

How should I fix it?

Answer 1

Use Series.where to replace the default 0 that DataFrame.idxmax returns for columns without a match with a missing value:

mask = df.apply(lambda x: x.str.contains('A'))
s1 = mask.idxmax().where(mask.any())
print (s1)
0    0.0
1    2.0
2    NaN
3    3.0
4    NaN
5    1.0
6    NaN
dtype: float64
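
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the frame from the question first. The variable names naive and first_hit are mine, and regex=False is my addition so that "(A)" is matched as literal text; the answer itself matches a plain 'A'.

```python
import pandas as pd

# Rebuild the frame from the question.
df = pd.DataFrame([
    ["a(A)", "b", "c", "c", "d", "a", "a"],
    ["b", "h", "w", "k", "d", "c(A)", "k"],
    ["g", "e(A)", "s", "g", "h", "s", "f"],
    ["f", "d", "s", "h(A)", "c", "w", "n"],
    ["e", "g", "s", "b", "c", "e", "w"],
])

# regex=False treats "(A)" as a literal substring, not a capture group.
mask = df.apply(lambda col: col.str.contains("(A)", regex=False))

# idxmax alone reports the first label for an all-False column ...
naive = mask.idxmax()

# ... so keep the label only where the column really has a match.
first_hit = naive.where(mask.any())
print(first_hit)
```

Because Series.where only masks values, this should also work when the row labels are strings or dates, where multiplying by NaN would not.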

Answer 2

You could keep your approach but explicitly check whether each column contains any match:

In [51]: pred = df.applymap(lambda x: '(A)' in x)

In [52]: pred.idxmax() * np.where(pred.any(), 1, np.nan)
Out[52]:
0    0.0
1    2.0
2    NaN
3    3.0
4    NaN
5    1.0
6    NaN
dtype: float64

Alternatively, use DataFrame.where directly:

In [211]: pred.where(pred).idxmax()
Out[211]:
0    0.0
1    2.0
2    NaN
3    3.0
4    NaN
5    1.0
6    NaN
dtype: float64

A slightly cheatier one-liner passes an identity callable to DataFrame.where:

In [78]: df.apply(lambda x: x.str.contains('A')).where(lambda x: x).idxmax()
Out[78]:
0    0.0
1    2.0
2    NaN
3    3.0
4    NaN
5    1.0
6    NaN
dtype: float64
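
Note that the snippets above test for different things: '(A)' in x looks for the literal text, while str.contains('A') matches any cell containing an A. With the data in the question both give the same answer, but they can diverge. A small sketch of the difference, using a made-up Series s that is not part of the question:

```python
import re
import warnings

import pandas as pd

# Hypothetical sample: one cell with "(A)", one with a bare "A".
s = pd.Series(["a(A)", "A", "b"])

# As a regex, "(A)" is a capture group that matches a bare "A" too.
# pandas may warn about the match group, so silence that here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    as_regex = s.str.contains(r"(A)")

# Two ways to require the literal text "(A)".
literal = s.str.contains("(A)", regex=False)
escaped = s.str.contains(re.escape("(A)"))

print(as_regex.tolist(), literal.tolist(), escaped.tolist())
```

Either literal form can be dropped into the approaches above in place of str.contains('A').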

Answer 3

Add an if condition at the end of the apply:

>>> df.apply(lambda x: x.str.contains('A').idxmax() if 'A' in x[x.str.contains('A').idxmax()] else np.nan)
0    0.0
1    2.0
2    NaN
3    3.0
4    NaN
5    1.0
6    NaN
dtype: float64
>>>
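
The one-liner above evaluates str.contains twice per column. A possible variant, sketched with a hypothetical helper first_match_label and a cut-down three-column frame rather than the full data from the question, computes the mask once and checks any() before calling idxmax:

```python
import numpy as np
import pandas as pd

def first_match_label(col, needle="(A)"):
    """Return the label of the first cell containing needle, or NaN."""
    # regex=False is an assumption here: match the literal text "(A)".
    hits = col.str.contains(needle, regex=False)
    return hits.idxmax() if hits.any() else np.nan

# Reduced example frame: the last column has no match.
df = pd.DataFrame({
    0: ["a(A)", "b", "g"],
    1: ["b", "h", "e(A)"],
    2: ["c", "w", "s"],
})

result = df.apply(first_match_label)
print(result)
```

Column 2 has no match, so its entry comes back as NaN instead of the first row label.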

3 answers

solution1
3 ACCEPTED 2019-07-17 07:14:18

solution2
3 2019-07-17 07:22:52

solution3
1 2019-07-17 07:18:05