
Most efficient way to return Column name in a pandas df

I have a pandas df that contains 4 different columns. In every row there's one value that's of importance, and I want to return the name of the column where that value appears. So for the df below I want to return the column name wherever the value 2 appears.

import pandas as pd

d = {
    'A': [2, 0, 0, 2],
    'B': [0, 0, 2, 0],
    'C': [0, 2, 0, 0],
    'D': [0, 0, 0, 0],
}

df = pd.DataFrame(data=d)

Output:

   A  B  C  D
0  2  0  0  0
1  0  0  2  0
2  0  2  0  0
3  2  0  0  0

So the result would be A, C, B, A.

I'm doing this via

m = (df == 2).idxmax(axis=1)[0]

and then changing the row index one row at a time, which isn't very efficient.
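For what it's worth, the trailing [0] is what limits the result to the first row; idxmax(axis=1) already works row-wise on the whole frame. A minimal sketch, assuming the same df as above. The names mask and labels are illustrative, and the where call is an added guard (not part of the original attempt) for rows that contain no 2, since idxmax would otherwise report the first column for them:

```python
import pandas as pd

df = pd.DataFrame({
    'A': [2, 0, 0, 2],
    'B': [0, 0, 2, 0],
    'C': [0, 2, 0, 0],
    'D': [0, 0, 0, 0],
})

# Boolean frame: True wherever the value of interest appears
mask = df.eq(2)

# idxmax(axis=1) returns the label of the first True in each row,
# so no per-row indexing is needed
labels = mask.idxmax(axis=1)

# Guard (assumption): blank out rows that contain no 2 at all
labels = labels.where(mask.any(axis=1))

print(labels.tolist())
```

The result is already a Series indexed like df, so it can be joined or converted to a list as needed.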

I'm also hoping to produce the output as a pandas Series.

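Use DataFrame.dot. astype(bool) turns every nonzero entry into True, and the dot product with the column labels then concatenates the labels of the True entries in each row: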

df.astype(bool).dot(df.columns).str.cat(sep=',')

Or,

','.join(df.astype(bool).dot(df.columns))

'A,C,B,A'

Or, as a list:

df.astype(bool).dot(df.columns).tolist()
['A', 'C', 'B', 'A']

...or a Series:

df.astype(bool).dot(df.columns)

0    A
1    C
2    B
3    A
dtype: object
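Note that astype(bool) flags every nonzero entry, not only 2, so the approach above relies on all other cells being 0. If that does not hold, or if raw speed matters on a large frame, a NumPy-based variant may be worth trying. The following is only a sketch under stated assumptions: first_match_column is a hypothetical helper name, only the first matching column per row is reported, and rows without a match are filled with None.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [2, 0, 0, 2],
    'B': [0, 0, 2, 0],
    'C': [0, 2, 0, 0],
    'D': [0, 0, 0, 0],
})

def first_match_column(frame, value):
    """Hypothetical helper: label of the first column equal to `value` in each row."""
    # Compare on the underlying NumPy array to skip pandas overhead
    hits = frame.to_numpy() == value
    # argmax on booleans gives the position of the first True per row
    pos = hits.argmax(axis=1)
    names = frame.columns.to_numpy()[pos]
    # Rows with no match get None instead of a misleading first column
    names = np.where(hits.any(axis=1), names, None)
    return pd.Series(names, index=frame.index)

result = first_match_column(df, 2)
print(result.tolist())
```

Whether this is actually faster than the dot version depends on the frame's size and dtypes, so it is worth timing both on real data.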


 