Python: How to iterate through a range of columns in a data frame, check for specific values, and store the column names in a list

Question

I am trying to iterate through a range of columns in a data frame and check for specific values in every row. The values should be matched against my list. If a row contains a value that matches my list, then the name of the column where the first match occurs should be appended to my new list. How can I achieve this? I tried the following for loop but couldn't get it right.

I've looked at a few examples but couldn't find what I was looking for.

iterating through a column in dataframe and creating a list with name of the column + str

How to get the column name for a specific values in every row of a dataframe


import pandas as pd

random = {
        'col1': ['45c','5v','27','k22','wh','u5','36'],
        'col2': ['abc','bca','cab','bac','cab','aab','ccb'],
        'col3': ['xyz','zxy','yxz','zzy','yyx','xyx','zzz'],
        'col4': ['52','75c','k22','d2','3n','4b','cc'],
        'col5': ['tuv','vut','tut','vtu','uvt','uut','vvt'],
        'col6': ['la3','pl','5v','45c','3s','k22','9i']
        }

df = pd.DataFrame(random)

"""
Only 1 value from this list should match with the values in each row of the df
i.e if '45c' is in row 3, then it's a match. place the name of column where '45c' is found in the df in the new list
"""
list = ['45c','5v','d2','3n','k22',]

"""
empty list that should be populated with df column names if there is a single match
"""
rand = []
for row in df.iloc[:,2:5]:
    for x in row:
        if df[x] in list:
            rand.append(df[row][x].columns)
            break

print(rand)

#this is what my df looks like when I print it
  col1 col2 col3 col4 col5 col6
0  45c  abc  xyz   52  tuv  la3
1   5v  bca  zxy  75c  vut   pl
2   27  cab  yxz  k22  tut   5v
3  k22  bac  zzy   d2  vtu  45c
4   wh  cab  yyx   3n  uvt   3s
5   u5  aab  xyx   4b  uut  k22
6   36  ccb  zzz   cc  vvt   9i

The output I was hoping to get is as follows:

rand = ['col1','col4','col1','col6']

Answer 1

First compare all values against the list with DataFrame.isin, then get the column of the first matching value in each row with DataFrame.idxmax. Because idxmax returns the first column when a row has no match at all, add a condition with DataFrame.any to test for that case:

import numpy as np

L = ['45c','5v','d2','3n','k22']
m = df.isin(L)
# m.any(axis=1) is True only for rows that contain at least one match
out = np.where(m.any(axis=1), m.idxmax(axis=1), 'no match').tolist()
print(out)
['col1', 'col1', 'col4', 'col1', 'col4', 'col6', 'no match']

If you need only the matched values:

out1 = m.idxmax(axis=1)[m.any(axis=1)].tolist()
print(out1)
['col1', 'col1', 'col4', 'col1', 'col4', 'col6']

Detail:

print (m)
    col1   col2   col3   col4   col5   col6
0   True  False  False  False  False  False
1   True  False  False  False  False  False
2  False  False  False   True  False   True
3   True  False  False   True  False   True
4  False  False  False   True  False  False
5  False  False  False  False  False   True
6  False  False  False  False  False  False
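
To see why the DataFrame.any check matters, here is a minimal sketch on a tiny made-up frame (the names `demo`, `wanted`, `mask`, `first_hit` and `has_hit` are illustrative and not part of the original answer). On a row where every value is False, idxmax still returns the first column label, so the row has to be filtered out separately.

```python
import pandas as pd

# Hypothetical two-row frame: row 0 contains a match, row 1 does not.
demo = pd.DataFrame({'a': ['x', 'q'], 'b': ['y', 'r']})
wanted = ['y']

mask = demo.isin(wanted)          # boolean frame, True where a value is in wanted
first_hit = mask.idxmax(axis=1)   # label of the first True per row; first column if none
has_hit = mask.any(axis=1)        # True only for rows with at least one match

print(first_hit.tolist())           # row 1 reports 'a' even though nothing matched
print(has_hit.tolist())
print(first_hit[has_hit].tolist())  # only rows with a real match remain
```

Filtering `first_hit` with `has_hit` is what separates a real match in the first column from a row with no match at all.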

A loop solution is possible, but not recommended:

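rand = []
for i, row in df.iterrows():
    # row.items() yields (column name, value) pairs for this row
    for col, x in row.items():
        if x in L:
            rand.append(col)
            break  # keep only the first matching column per row
print(rand)
['col1', 'col1', 'col4', 'col1', 'col4', 'col6']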
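
The question's own attempt limited the search to a slice of columns with `df.iloc[:, 2:5]`. Below is a minimal sketch of the same isin / idxmax idea applied to such a slice, assuming positional slicing is what is wanted. The frame is rebuilt so the block runs on its own, and `subset`, `mask` and `cols` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['45c', '5v', '27', 'k22', 'wh', 'u5', '36'],
    'col2': ['abc', 'bca', 'cab', 'bac', 'cab', 'aab', 'ccb'],
    'col3': ['xyz', 'zxy', 'yxz', 'zzy', 'yyx', 'xyx', 'zzz'],
    'col4': ['52', '75c', 'k22', 'd2', '3n', '4b', 'cc'],
    'col5': ['tuv', 'vut', 'tut', 'vtu', 'uvt', 'uut', 'vvt'],
    'col6': ['la3', 'pl', '5v', '45c', '3s', 'k22', '9i'],
})
L = ['45c', '5v', 'd2', '3n', 'k22']

subset = df.iloc[:, 2:5]   # positional slice: col3, col4, col5 (end is exclusive)
mask = subset.isin(L)      # True where a value in the slice is in L

# First matching column per row, keeping only rows that have a match
cols = mask.idxmax(axis=1)[mask.any(axis=1)].tolist()
print(cols)  # ['col4', 'col4', 'col4']
```

Within this slice only col4 holds values from the list (rows 2, 3 and 4), so matches in col1 and col6 are ignored. Selecting by label with `df.loc[:, 'col3':'col5']` would cover the same three columns, since label slices include the end label.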

Solution 1 (score 2, accepted), 2019-08-29 14:27:02
