
Count how many cells are between the last value in the dataframe and the end of the row

I'm using the pandas library in Python.

I have a data frame:

    0   1   2   3   4
0   0   0   0   1   0
1   0   0   0   0   1
2   0   0   1   0   0
3   1   0   0   0   0
4   0   0   1   0   0
5   0   1   0   0   0
6   1   0   0   1   1

Is it possible to create a new column that counts, for each row, the number of cells holding zero between the last value above zero and the end of the row? The expected result is shown below:

    0   1   2   3   4   Value
0   0   0   0   1   0   1
1   0   0   0   0   1   0
2   0   0   1   0   0   2
3   1   0   0   0   0   4
4   0   0   1   0   0   2
5   0   1   0   0   0   3
6   1   0   0   1   1   0

Use:

df['new'] = df.iloc[:, ::-1].cumsum(axis=1).eq(0).sum(axis=1)
print (df)
   0  1  2  3  4  new
0  0  0  0  1  0    1
1  0  0  0  0  1    0
2  0  0  1  0  0    2
3  1  0  0  0  0    4
4  0  0  1  0  0    2
5  0  1  0  0  0    3
6  1  0  0  1  1    0
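
To reproduce the result above from scratch, the frame from the question can be rebuilt by hand. This is only a minimal sketch: it assumes integer column labels 0 to 4 and values limited to 0 and 1, as in the example data.

```python
import pandas as pd

# Rebuild the 7x5 frame from the question; the column labels default to 0..4.
df = pd.DataFrame(
    [
        [0, 0, 0, 1, 0],
        [0, 0, 0, 0, 1],
        [0, 0, 1, 0, 0],
        [1, 0, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 1, 0, 0, 0],
        [1, 0, 0, 1, 1],
    ]
)

# Reverse the columns, accumulate along each row, and count the positions
# where the running total is still 0 (the trailing zeros of the original row).
df['new'] = df.iloc[:, ::-1].cumsum(axis=1).eq(0).sum(axis=1)
print(df)
```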

Details:

First reverse the column order with DataFrame.iloc and slicing:

print (df.iloc[:, ::-1])
   4  3  2  1  0
0  0  1  0  0  0
1  1  0  0  0  0
2  0  0  1  0  0
3  0  0  0  0  1
4  0  0  1  0  0
5  0  0  0  1  0
6  1  1  0  0  1

Then take the cumulative sum along each row with DataFrame.cumsum:

print (df.iloc[:, ::-1].cumsum(axis=1))
   4  3  2  1  0
0  0  1  1  1  1
1  1  1  1  1  1
2  0  0  1  1  1
3  0  0  0  0  1
4  0  0  1  1  1
5  0  0  0  1  1
6  1  2  2  2  3

Compare against 0 with DataFrame.eq; the True cells are the ones that come before the first nonzero value in the reversed order:

print (df.iloc[:, ::-1].cumsum(axis=1).eq(0))
       4      3      2      1      0
0   True  False  False  False  False
1  False  False  False  False  False
2   True   True  False  False  False
3   True   True   True   True  False
4   True   True  False  False  False
5   True   True   True  False  False
6  False  False  False  False  False

Finally, count the True values per row with sum:

print (df.iloc[:, ::-1].cumsum(axis=1).eq(0).sum(axis=1))
0    1
1    0
2    2
3    4
4    2
5    3
6    0
dtype: int64
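
The cumulative-sum steps rely on the values being non-negative: a negative entry could cancel a positive one and bring the running total back to 0. A hedged variant, assuming "above zero" should be read literally, first converts the frame to a 0/1 mask of positive cells. The helper name `count_trailing_nonpositive` and the sample data are made up for illustration.

```python
import pandas as pd

def count_trailing_nonpositive(frame: pd.DataFrame) -> pd.Series:
    """Count the cells after the last strictly positive value in each row."""
    # 1 where the cell is above zero, 0 otherwise, so nothing can cancel out.
    positive = frame.gt(0).astype(int)
    # Same steps as above: reverse columns, accumulate, count leading zeros.
    return positive.iloc[:, ::-1].cumsum(axis=1).eq(0).sum(axis=1)

# Hypothetical sample: a non-binary row, an all-zero row, and a negative value.
sample = pd.DataFrame([[2, 0, 0], [0, 0, 0], [1, -1, 0], [0, 3, 5]])
counts = count_trailing_nonpositive(sample)
print(counts.tolist())  # [2, 3, 2, 0]
```

Note that a row with no positive value at all yields the full row length (3 here), since every cell counts as trailing.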

Using argmax:

df['Value'] = df.apply(lambda x: (x.iloc[::-1] == 1).argmax(), axis=1)

Or, vectorized on the underlying NumPy array:

df['Value'] = (df.iloc[:, ::-1] == 1).to_numpy().argmax(axis=1)

   0  1  2  3  4  Value
0  0  0  0  1  0      1
1  0  0  0  0  1      0
2  0  0  1  0  0      2
3  1  0  0  0  0      4
4  0  0  1  0  0      2
5  0  1  0  0  0      3
6  1  0  0  1  1      0
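
One caveat with argmax: for a row that contains no match it returns 0, which is indistinguishable from a row whose last cell is nonzero. The sketch below is one possible way to handle that, assuming such rows should report the full row length; the function name `trailing_zeros_argmax` and the sample frame are illustrative only.

```python
import numpy as np
import pandas as pd

def trailing_zeros_argmax(frame: pd.DataFrame) -> np.ndarray:
    """Trailing-zero count per row via argmax, with all-zero rows handled."""
    # Boolean mask of positive cells, with the columns reversed.
    mask = frame.to_numpy()[:, ::-1] > 0
    # Position of the first True in each reversed row (0 if there is none).
    first_hit = mask.argmax(axis=1)
    # Rows without any positive value get the number of columns instead.
    return np.where(mask.any(axis=1), first_hit, mask.shape[1])

# Hypothetical sample with an all-zero row in the middle.
sample = pd.DataFrame([[0, 0, 0, 1, 0], [0, 0, 0, 0, 0], [1, 0, 0, 1, 1]])
values = trailing_zeros_argmax(sample)
print(values.tolist())  # [1, 5, 0]
```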
