Map multiple columns from pandas dataframe to a dictionary and conditionally return a value to a new column

I have a pandas dataframe with multiple columns and a dictionary whose keys correspond to the column names. I want to check the column values against the dictionary values and return either a 'yes' or 'no' based on whether the column value meets a "greater than or equal to" condition.

Example:

import pandas as pd
dfdict = {'col1': [1,2,3], 'col2':[2,3,4], 'col3': [3.2, 4.2, 7.7]}
checkdict = {'col1': 2, 'col2': 3, 'col3': 1.5}
df = pd.DataFrame(dfdict)

For each column and each row, check whether the value is greater than or equal to the corresponding value in the dictionary. If any column in a row meets the condition, write "yes" to a newly created column for that row; otherwise write "no".

What I've tried:

def checkcond(element):
    if not math.isnan(element):
        x = checkdict[element]
        return 1 if element >= x else 0
    else:
        pass

df['test'] = df.applymap(checkcond)

But of course this doesn't work, because applymap passes each cell value to the checkcond function rather than the column name, so the function cannot look up the matching dictionary entry.

I also tried:

df['test'] = pd.np.where(df[['col1', 'col2', 'col3']].ge(0).any(1, skipna=True), 'Y', 'N')

But that only takes a single value for the "ge" condition, whereas I want to compare each row value against the dictionary value for its column.

Any suggestions would be appreciated!

Convert your dictionary to a Series and perform a simple comparison:

df.ge(pd.Series(checkdict)).replace({True: 'yes', False: 'no'})

Output:

  col1 col2 col3
0   no   no  yes
1  yes  yes  yes
2  yes  yes  yes
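
The comparison works because pandas aligns the index of the Series with the column labels of the DataFrame, so each column is checked against its own threshold. A minimal self-contained sketch of that alignment, reusing the data from the question (the names thresholds, mask and expected are introduced here only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [2, 3, 4], 'col3': [3.2, 4.2, 7.7]})
checkdict = {'col1': 2, 'col2': 3, 'col3': 1.5}

# The Series index (col1, col2, col3) is matched against the DataFrame
# columns, so every column is compared with its own threshold.
thresholds = pd.Series(checkdict)
mask = df.ge(thresholds)  # boolean DataFrame with the same shape as df

# Equivalent explicit per-column comparison, useful for checking the result.
expected = pd.DataFrame({col: df[col] >= limit for col, limit in checkdict.items()})
```

The boolean mask can then be relabelled with replace, as in the one-liner above, or reduced per row.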

To get the aggregation per row:

df['any'] = df.ge(pd.Series(checkdict)).any(axis=1).map({True: 'yes', False: 'no'})

Output:

   col1  col2  col3  any
0     1     2   3.2  yes
1     2     3   4.2  yes
2     3     4   7.7  yes
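
Since the question also guards against missing values, here is a hedged sketch of the same per-row check on a frame that contains NaN. It relies on the fact that a comparison with NaN evaluates to False, so a missing cell never produces a 'yes' on its own. The sample data, the column name test and the use of np.where instead of map are assumptions made for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

checkdict = {'col1': 2, 'col2': 3, 'col3': 1.5}

# Hypothetical data with missing values (not from the original question).
df = pd.DataFrame({
    'col1': [1, np.nan, 3],
    'col2': [2, 1, np.nan],
    'col3': [np.nan, 1.0, 7.7],
})

# Compare only the columns that have a threshold; NaN >= x is False.
cols = list(checkdict)
meets_any = df[cols].ge(pd.Series(checkdict)).any(axis=1)

# Turn the boolean row flags into 'yes' / 'no' labels.
df['test'] = np.where(meets_any, 'yes', 'no')
```

Selecting df[cols] before comparing keeps the check restricted to the columns named in the dictionary, which matters once helper columns such as test have been added to the frame.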
