Get a list of column headers based on a per-row condition in a Pandas DataFrame

Question

I was wondering whether it is possible to get a list of column headers based on a condition. For example, if the condition is to find, for each row, the column headers whose cell holds a "MATCH" value, the output would be either a list of strings or a list of lists containing the header names, like so:

["a, c", "b, d", "a, b, c, d", "a, d"]
or 
[["a", "c"], ["b", "d"], ["a", "b", "c", "d"], ["a", "d"]]

Thank you for any help!

Answer 1

You could try with np.where:

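import pandas as pd
import numpy as np

df = pd.DataFrame({
    'a': ['match', 'mismatch', 'match'],
    'b': ['match', 'match', 'mismatch'],
    'c': ['mismatch', 'mismatch', 'match'],
})

print(df)

# Keep the column name where the cell equals 'match', else an empty string,
# then concatenate the strings across each row.
arr = np.where(df.eq('match'), df.columns, '').sum(axis=1)

print(arr)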

Output:

df
          a         b         c
0     match     match  mismatch
1  mismatch     match  mismatch
2     match  mismatch     match

arr
['ab' 'b' 'ac']
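
To see why the row-wise sum produces these strings, it can help to look at the intermediate array before `.sum(axis=1)` is applied. The snippet below is only an illustrative sketch; the names `mask` and `picked` are not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': ['match', 'mismatch', 'match'],
    'b': ['match', 'match', 'mismatch'],
    'c': ['mismatch', 'mismatch', 'match'],
})

# Boolean DataFrame: True where the cell equals 'match'.
mask = df.eq('match')

# df.columns is broadcast across the rows, so each True cell is replaced
# by its column name and each False cell by an empty string.
picked = np.where(mask, df.columns, '')

print(picked.tolist())
# [['a', 'b', ''], ['', 'b', ''], ['a', '', 'c']]
```

Summing along `axis=1` then concatenates the strings in each row, which is how `'a' + 'b' + ''` becomes `'ab'`.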

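And then, to get the desired lists, you could try the following. Note that both options split the concatenated string character by character, so they assume single-character column names: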

# first option: one comma-separated string per row
arr = np.where(df.eq('match'), df.columns, '').sum(axis=1)
arr = list(map(', '.join, arr))
print(arr)

# second option: one list of column names per row
arr = np.where(df.eq('match'), df.columns, '').sum(axis=1)
arr = list(map(list, arr))
print(arr)

Output:

#first option
['a, b', 'b', 'a, c']

#second option
[['a', 'b'], ['b'], ['a', 'c']]
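
Because the approach above concatenates column names and then splits the result into characters, it would break names longer than one character (a column called 'alpha' would come back as 'a', 'l', 'p', 'h', 'a'). If your headers are longer, a possible alternative is to index `df.columns` with each row of the boolean mask. This is a minimal sketch, not the original answer's method; the column names 'alpha', 'beta', 'gamma' and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical frame with multi-character column names.
df = pd.DataFrame({
    'alpha': ['match', 'mismatch', 'match'],
    'beta': ['match', 'match', 'mismatch'],
    'gamma': ['mismatch', 'mismatch', 'match'],
})

# One boolean row per DataFrame row: True where the cell equals 'match'.
mask = df.eq('match').to_numpy()

# Select the column labels flagged True in each row.
as_lists = [df.columns[row].tolist() for row in mask]

# Join each list into a comma-separated string.
as_strings = [', '.join(names) for names in as_lists]

print(as_lists)    # [['alpha', 'beta'], ['beta'], ['alpha', 'gamma']]
print(as_strings)  # ['alpha, beta', 'beta', 'alpha, gamma']
```

This uses a Python-level loop over the rows, so on very large frames it may be slower than the fully vectorized np.where version.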
