
How to get the row index of the first cell with currency formatting in the last column of a dataframe using Python Pandas

I have a DataFrame:

import pandas as pd

s1 = pd.Series(['a', 'b', 'c'])
s2 = pd.Series(['e', '$200', 'f'])
s3 = pd.Series(['e', '$300', '$400'])
s4 = pd.Series(['f', '$500', '$600'])
    
df = pd.DataFrame([list(s1), list(s2), list(s3), list(s4)], columns=['A', 'B', 'C'])
df

   A     B     C
0  a     b     c
1  e  $200     f
2  e  $300  $400
3  f  $500  $600

I want to scan the cells in the last column and find the first one with currency formatting. In this example that cell is df['C'][2], so the row index I want returned is 2.

IIUC, you could do the following:

df.iloc[:, -1].str.match(r'^\$\d+').idxmax()

Output

2

It works as follows:

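  • df.iloc[:, -1] selects the last column.
  • .str.match(r'^\$\d+') builds a boolean Series that is True where the cell starts with a dollar sign followed by one or more digits. (str.match already anchors at the start of the string, so the ^ is optional.)
  • .idxmax() returns the index label of the maximum value. Since True counts as 1 and False as 0, the maximum is True, and when there are several, idxmax returns the first one. Note that if no cell matches, every value is False and idxmax still returns the first index label, so check .any() first if that case can occur. See the pandas documentation for Series.idxmax for more.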
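If the last column might contain no currency-formatted cell at all, the one-liner above would silently return the first row. Below is a minimal sketch of a guarded version; the helper name first_currency_row and its pattern parameter are made up for illustration and are not part of pandas, and the astype(str) call is an assumption to cope with non-string cells.

```python
import pandas as pd

df = pd.DataFrame(
    [["a", "b", "c"],
     ["e", "$200", "f"],
     ["e", "$300", "$400"],
     ["f", "$500", "$600"]],
    columns=["A", "B", "C"],
)

def first_currency_row(frame, pattern=r"\$\d+"):
    """Hypothetical helper: index label of the first last-column cell
    that matches `pattern`, or None when nothing matches."""
    # astype(str) is an assumption so that non-string cells (e.g. NaN)
    # become plain strings and simply fail to match.
    mask = frame.iloc[:, -1].astype(str).str.match(pattern)
    if not mask.any():
        # Without this check, idxmax would return the first label.
        return None
    return mask.idxmax()

result = first_currency_row(df)
print(result)  # prints 2
```

Returning None is just one possible convention; raising an exception would work equally well if a missing match should be treated as an error.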

