
Mapping dict keys to column of pandas dataframe if they're close

I'm working with probabilities that correspond to certain categories, and I would like to map them to the categories of interest in a new column of a pandas DataFrame.

I would normally use pandas.Series.map for such a task, but the probabilities were truncated when they were processed in another language, so an exact lookup no longer works.

Is it possible to combine pd.Series.map and np.isclose so that the following example works as needed? Any alternative approaches would be appreciated too!

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [1, 2, 3],
    'prob': np.round([0.6**(1/30.), 0.9**(1/10.), 0.8**(1/20.)], decimals = 4)
    })

prob_dict = {
    0.9**(1/10.): 'catA', 
    0.6**(1/30.): 'catB', 
    0.8**(1/20.): 'catC'}

df['cat'] = df.prob.map(prob_dict)

>> df
>>    a      prob  cat
>> 0  1  0.983117  NaN
>> 1  2  0.989519  NaN
>> 2  3  0.988905  NaN
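
The NaN values come from exact float comparison: a dict lookup only succeeds when the rounded value is bit-for-bit equal to the key. A minimal sketch of the mismatch for the first row (the variable names `truncated` and `exact` are illustrative only, not part of the code above):

```python
import numpy as np

exact = 0.6 ** (1 / 30.)                 # the key stored in prob_dict
truncated = np.round(exact, decimals=4)  # the value stored in df.prob

# Exact equality is what a dict lookup (and Series.map) relies on
print(truncated == exact)

# A tolerance-based comparison treats the two as the same probability
print(np.isclose(truncated, exact, atol=1e-4))
```

The first comparison is False and the second is True, which is why a tolerance-based lookup is needed here.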

My required output is:

>> df
>>    a      prob  cat
>> 0  1  0.983117  catB
>> 1  2  0.989519  catA
>> 2  3  0.988905  catC

You have your keys and values mixed up.

prob_dict = {v: k for k, v in prob_dict.items()}

df['cat'] = df.prob.map(prob_dict)
print(df)

   a      prob   cat
0  1  0.983117  catB
1  2  0.989519  catA
2  3  0.988905  catC

You can use np.isclose with an absolute tolerance (here atol=0.0001) after reshaping the values of the prob column into a 2-D column vector, so that they broadcast against the dictionary keys.

Each value in the column is compared against every key of the dictionary (the probabilities), and the result is True wherever a close match is found.

cond = np.isclose(df.prob.values[:, None], list(prob_dict.keys()), atol=10**-4)
indi = np.argwhere(cond)[:, 1]     # Get all column indices fulfilling the above condition
df['cat'] = np.array(list(prob_dict.values()))[indi]  # Look up the category for each matched key index
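
Note that the snippet above assumes every row matches exactly one key; a row with no match (or with two matches) would make the assignment fail with a length mismatch. As a hedged sketch of one way to guard against that, the hypothetical helper `map_close` below (not a pandas or NumPy function) takes the first matching key per row and leaves unmatched rows as NaN. It also sets rtol=0 so that atol is the only tolerance in play, which is an assumption about the desired behaviour. The fourth row is made up to show the no-match case.

```python
import numpy as np
import pandas as pd


def map_close(values, mapping, atol=1e-4):
    """Map each value to the label of the first key within `atol`; NaN if none."""
    keys = np.array(list(mapping.keys()), dtype=float)
    labels = np.array(list(mapping.values()), dtype=object)
    vals = np.asarray(values, dtype=float)
    # Broadcast (n, 1) against (m,) to get an (n, m) boolean match matrix
    cond = np.isclose(vals[:, None], keys, rtol=0, atol=atol)
    matched = cond.any(axis=1)    # rows with at least one close key
    first = cond.argmax(axis=1)   # index of the first close key per row
    out = np.full(len(vals), np.nan, dtype=object)
    out[matched] = labels[first[matched]]
    return out


df = pd.DataFrame({
    'a': [1, 2, 3, 4],
    'prob': [0.9831, 0.9895, 0.9889, 0.5],  # last value matches no key
})

prob_dict = {
    0.9**(1/10.): 'catA',
    0.6**(1/30.): 'catB',
    0.8**(1/20.): 'catC'}

df['cat'] = map_close(df['prob'], prob_dict)
print(df)
```

The tolerance should stay smaller than half the gap between neighbouring keys, otherwise a value could fall within reach of two categories and only the first would be reported.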
