
Python: How to iterate through a range of columns in a dataframe, check for specific values and store column name in a list

I am trying to iterate through a range of columns in a data frame and check for specific values in every row. The values should match against my list. If a row contains a value from my list, the name of the column where the first match occurs should be appended to my new list. How can I achieve this? I tried the following for loop but couldn't get it right.

I've looked at a few examples but couldn't find what I was looking for.

Iterating through a column in dataframe and creating a list with name of the column + str

How to get the column name for a specific values in every row of a dataframe


import pandas as pd

random = {
        'col1': ['45c','5v','27','k22','wh','u5','36'],
        'col2': ['abc','bca','cab','bac','cab','aab','ccb'],
        'col3': ['xyz','zxy','yxz','zzy','yyx','xyx','zzz'],
        'col4': ['52','75c','k22','d2','3n','4b','cc'],
        'col5': ['tuv','vut','tut','vtu','uvt','uut','vvt'],
        'col6': ['la3','pl','5v','45c','3s','k22','9i']
        }

df = pd.DataFrame(random)

"""
Only 1 value from this list should match with the values in each row of the df
i.e if '45c' is in row 3, then it's a match. place the name of column where '45c' is found in the df in the new list
"""
list = ['45c','5v','d2','3n','k22',]

"""
empty list that should be populated with df column names if there is a single match
"""
rand = []
for row in df.iloc[:,2:5]:
    for x in row:
        if df[x] in list:
            rand.append(df[row][x].columns)
            break

print(rand)

#this is what my df looks like when I print it
  col1 col2 col3 col4 col5 col6
0  45c  abc  xyz   52  tuv  la3
1   5v  bca  zxy  75c  vut   pl
2   27  cab  yxz  k22  tut   5v
3  k22  bac  zzy   d2  vtu  45c
4   wh  cab  yyx   3n  uvt   3s
5   u5  aab  xyx   4b  uut  k22
6   36  ccb  zzz   cc  vvt   9i

The output I was hoping to get is as follows:

rand = ['col1','col4','col1','col6']

First compare all values with DataFrame.isin, then get the column of the first matched value with DataFrame.idxmax. Because idxmax returns the first column when a row has no match at all, add a condition with DataFrame.any to test for that case:

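import numpy as np

L = ['45c','5v','d2','3n','k22']
# Boolean mask: True where a cell holds one of the values in L
m = df.isin(L)
# Column of the first True per row, or 'no match' if the row has none
out = np.where(m.any(axis=1), m.idxmax(axis=1), 'no match').tolist()
print(out)
['col1', 'col1', 'col4', 'col1', 'col4', 'col6', 'no match']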

If you need only the matched values:

out1 = m.idxmax(axis=1)[m.any(axis=1)].tolist()
print(out1)
['col1', 'col1', 'col4', 'col1', 'col4', 'col6']
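
The question asks about a range of columns rather than the whole frame. A minimal sketch of that variant, assuming the positional slice `2:5` from the question's own loop (the name `subset` is only illustrative), applies the same mask to the sliced frame:

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['45c', '5v', '27', 'k22', 'wh', 'u5', '36'],
    'col2': ['abc', 'bca', 'cab', 'bac', 'cab', 'aab', 'ccb'],
    'col3': ['xyz', 'zxy', 'yxz', 'zzy', 'yyx', 'xyx', 'zzz'],
    'col4': ['52', '75c', 'k22', 'd2', '3n', '4b', 'cc'],
    'col5': ['tuv', 'vut', 'tut', 'vtu', 'uvt', 'uut', 'vvt'],
    'col6': ['la3', 'pl', '5v', '45c', '3s', 'k22', '9i'],
})
L = ['45c', '5v', 'd2', '3n', 'k22']

# Assumption: only columns at positions 2, 3 and 4 (col3..col5) are of interest
subset = df.iloc[:, 2:5]
m_sub = subset.isin(L)

# Keep the first matching column name only for rows that have a match
out_sub = m_sub.idxmax(axis=1)[m_sub.any(axis=1)].tolist()
print(out_sub)
```

With this data only `col4` contains listed values inside that slice, so every returned name is `col4`; a label-based slice such as `df.loc[:, 'col3':'col5']` would select the same columns.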

Detail:

print (m)
    col1   col2   col3   col4   col5   col6
0   True  False  False  False  False  False
1   True  False  False  False  False  False
2  False  False  False   True  False   True
3   True  False  False   True  False   True
4  False  False  False   True  False  False
5  False  False  False  False  False   True
6  False  False  False  False  False  False
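
The reason for the extra DataFrame.any check can be seen on a tiny mask. This is only an illustrative sketch with a made-up two-row frame (`mask`, `first_true` and `has_match` are hypothetical names): on a row with no True at all, idxmax still returns the first column label, so the result has to be filtered separately.

```python
import pandas as pd

# Hypothetical mask: row 0 has a match in col2, row 1 has no match at all
mask = pd.DataFrame({'col1': [False, False], 'col2': [True, False]})

first_true = mask.idxmax(axis=1)   # label of the first maximum in each row
has_match = mask.any(axis=1)       # True only for rows containing a match

print(first_true.tolist())   # row 1 yields 'col1' although nothing matched
print(has_match.tolist())
```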

A loop solution is possible, but not recommended:

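rand = []
for i, row in df.iterrows():
    # row.items() yields (column name, value) pairs for this row
    for col, x in row.items():
        if x in L:
            rand.append(col)
            break  # keep only the first matching column per row
print(rand)
['col1', 'col1', 'col4', 'col1', 'col4', 'col6']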
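
If the loop form should also report rows without any match, one possible sketch uses the built-in `next` with a default value. The `'no match'` placeholder and the name `rand_all` are illustrative, and the values are held in a set here purely as an assumption to make membership tests cheap:

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['45c', '5v', '27', 'k22', 'wh', 'u5', '36'],
    'col2': ['abc', 'bca', 'cab', 'bac', 'cab', 'aab', 'ccb'],
    'col3': ['xyz', 'zxy', 'yxz', 'zzy', 'yyx', 'xyx', 'zzz'],
    'col4': ['52', '75c', 'k22', 'd2', '3n', '4b', 'cc'],
    'col5': ['tuv', 'vut', 'tut', 'vtu', 'uvt', 'uut', 'vvt'],
    'col6': ['la3', 'pl', '5v', '45c', '3s', 'k22', '9i'],
})
wanted = {'45c', '5v', 'd2', '3n', 'k22'}

rand_all = []
for _, row in df.iterrows():
    # First column whose value is wanted, or the placeholder if there is none
    first = next((col for col, x in row.items() if x in wanted), 'no match')
    rand_all.append(first)

print(rand_all)
```

This is still a row-by-row loop, so the vectorized isin/idxmax approach remains the better fit for large frames.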

