How to fetch a column header if a particular condition is met based on row and column value of the dataframe?

I have a dataframe like this:

col1    x   y   z
A      yes  no  yes
B      no   no  yes
C      no   yes no
D      yes  no  yes
E      no   no  yes
F      yes  yes no

I would like to get the column headers in which a given row has the value yes. For example, if my criterion is to find every yes for A, I should get [x, z], i.e. the columns where A is yes.

For B the result should be [z], and for C it should be [y].

How can I do this?

First set col1 as the index so that rows can be selected with loc, then compare the selected row with the value, and finally convert the matching index labels to a list:

df = df.set_index('col1')

def get_val(df, idx, val):
    a = df.loc[idx].eq(val)
    return a.index[a].tolist()

print (get_val(df, 'A', 'yes'))
['x', 'z']

print (get_val(df, 'B', 'yes'))
['z']

print (get_val(df, 'C', 'yes'))
['y']

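You could also use eq with dot, which concatenates the names of the matching columns into one string per row (here col1 is still a regular column):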

In [499]: df.eq('yes').dot(df.columns)[df.col1.eq('A')]
Out[499]:
0    xz
dtype: object

In [500]: df.eq('yes').dot(df.columns)[df.col1.eq('B')]
Out[500]:
1    z
dtype: object

In [501]: df.eq('yes').dot(df.columns)[df.col1.eq('C')]
Out[501]:
2    y
dtype: object
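Note that the dot approach returns a concatenated string such as 'xz' rather than a list, which becomes ambiguous once column names are longer than one character. A minimal sketch of one way to get a list for every row at once, assuming col1 is first set as the index (the names mask and yes_columns are only illustrative):

```python
from io import StringIO

import pandas as pd

data = StringIO('''col1    x   y   z
A      yes  no  yes
B      no   no  yes
C      no   yes no
D      yes  no  yes
E      no   no  yes
F      yes  yes no''')

df = pd.read_csv(data, sep=r'\s+').set_index('col1')

# Boolean mask: True wherever a cell equals 'yes'
mask = df.eq('yes')

# For each row label, keep the column names whose mask entry is True
yes_columns = {
    label: mask.columns[row].tolist()
    for label, row in zip(mask.index, mask.to_numpy())
}

print(yes_columns['A'])
print(yes_columns['F'])
```

A row with no yes at all would simply map to an empty list here.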

Here is another approach that wraps the lookup in a function:

df.set_index('col1', inplace=True)

def find_yes(df, x):
    return df.columns[df.loc[x] == 'yes'].tolist()

Full example:

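from io import StringIO

import pandas as pd

data = '''\
col1    x   y   z
A      yes  no  yes
B      no   no  yes
C      no   yes no
D      yes  no  yes
E      no   no  yes
F      yes  yes no'''

# pd.compat.StringIO is not available in current pandas releases; use io.StringIO
fileobj = StringIO(data)
# Raw string avoids an invalid-escape warning for the regex separator
df = pd.read_csv(fileobj, sep=r'\s+')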

df.set_index('col1', inplace=True)

def find_yes(df, x):
    return df.columns[df.loc[x] == 'yes'].tolist()

print(find_yes(df, 'A'))
print(find_yes(df, 'B'))

Returns:

['x', 'z']
['z']

One more option: use melt, then groupby:

from io import StringIO

import pandas as pd

data = StringIO('''col1    x   y   z
A      yes  no  yes
B      no   no  yes
C      no   yes no
D      yes  no  yes
E      no   no  yes
F      yes  yes no''')

df = pd.read_csv(data, sep=r'\s+')

m = df.melt(id_vars='col1')
matches = m[m['value'] == 'yes'].groupby('col1')\
                                .agg({'variable': list})

This gives the following DataFrame:

     variable
col1         
A      [x, z]
B         [z]
C         [y]
D      [x, z]
E         [z]
F      [x, y]
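To answer the original question for a single row from this grouped result, the variable column can be turned into a plain dict and queried by label. The sketch below assumes the same sample data; lookup and the label 'G' are hypothetical names used only for illustration. Because the yes filter runs before groupby, a row with no yes values would not appear in matches at all, so dict.get with a default is used:

```python
from io import StringIO

import pandas as pd

data = StringIO('''col1    x   y   z
A      yes  no  yes
B      no   no  yes
C      no   yes no
D      yes  no  yes
E      no   no  yes
F      yes  yes no''')

df = pd.read_csv(data, sep=r'\s+')

# Long format: one row per (col1, variable, value) triple
m = df.melt(id_vars='col1')
matches = m[m['value'] == 'yes'].groupby('col1').agg({'variable': list})

# Map each col1 label to its list of 'yes' columns
lookup = matches['variable'].to_dict()

a_cols = lookup.get('A', [])   # label present in the data
missing = lookup.get('G', [])  # hypothetical label absent from the data

print(a_cols)
print(missing)
```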
