How to create a new column in a dataframe using a multikey dictionary

I have a multikey dictionary that I would like to use to create a new column in a dataframe. Doing so with a single-key dictionary is quite easy, but I am stumped as to the correct syntax for passing two values to the dictionary.

I have been able to use a single-key dictionary with map, get, and apply (map example):

    import pandas as pd
    df = pd.DataFrame(data = {'Col1': [1, 2, 3, 4], 'Col2': ['A', 'B', 'C', 'D']})

    single_dict = {1: 'This', 2: 'is', 3: 'pretty', 4: 'easy'}

    df['newcol_a'] = df['Col1'].map(single_dict)

    print(df)

which returns the expected output:

       Col1 Col2 newcol_a
    0     1    A     This
    1     2    B       is
    2     3    C   pretty
    3     4    D     easy

But when I create a multikey dictionary such as:

    dbl_dict = {1: {'A': 'THIS', 'B': 'blah', 'C': 'blah', 'D': 'blah'},
                2: {'A': 'blah', 'B': 'HAS' , 'C': 'blah', 'D': 'blah'},
                3: {'A': 'blah', 'B': 'blah', 'C': 'ME'  , 'D': 'blah'},
                4: {'A': 'blah', 'B': 'blah', 'C': 'blah', 'D': 'STUMPED'},}

I am able to call it using 'get':

    dbl_dict.get(1, {}).get('A', 'Other')
    Out[5]: 'THIS'

But I cannot figure out the syntax (I tried about 40 different things, such as df['newcol_b'] = df[['Col1', 'Col2']].map(dbl_dict)) to get the desired result:

       Col1 Col2 newcol_b
    0     1    A     THIS
    1     2    B      HAS
    2     3    C       ME
    3     4    D  STUMPED

map does not know how to handle a nested dict: it looks up a single key per element. If you insist on using this dict, you can use apply on the entire dataframe, but you have to write a custom mapping function:

import pandas as pd

df = pd.DataFrame(data={'Col1': [1, 2, 3, 4], 'Col2': ['A', 'B', 'C', 'D']})
dbl_dict = {1: {'A': 'THIS', 'B': 'blah', 'C': 'blah', 'D': 'blah'},
            2: {'A': 'blah', 'B': 'HAS', 'C': 'blah', 'D': 'blah'},
            3: {'A': 'blah', 'B': 'blah', 'C': 'ME', 'D': 'blah'},
            4: {'A': 'blah', 'B': 'blah', 'C': 'blah', 'D': 'STUMPED'}}

df['new_col'] = df.apply(lambda s: dbl_dict.get(s['Col1'], {}).get(s['Col2']), axis=1)

df is now:

   Col1 Col2  new_col
0     1    A     THIS
1     2    B      HAS
2     3    C       ME
3     4    D  STUMPED

A solution with loc (or at) might be possible (and if so, would probably be faster). I need to look into that.
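Short of a loc/at solution, one way to avoid the row-wise apply is to flatten the nested dict into a dict keyed by (Col1, Col2) tuples and look up each row's pair directly. This is only a sketch, not the loc-based approach mentioned above: `flat_dict` is a name introduced here for illustration, and whether it is actually faster on your data has not been measured.

```python
import pandas as pd

df = pd.DataFrame({'Col1': [1, 2, 3, 4], 'Col2': ['A', 'B', 'C', 'D']})
dbl_dict = {1: {'A': 'THIS', 'B': 'blah', 'C': 'blah', 'D': 'blah'},
            2: {'A': 'blah', 'B': 'HAS', 'C': 'blah', 'D': 'blah'},
            3: {'A': 'blah', 'B': 'blah', 'C': 'ME', 'D': 'blah'},
            4: {'A': 'blah', 'B': 'blah', 'C': 'blah', 'D': 'STUMPED'}}

# flat_dict (hypothetical helper name) maps (outer key, inner key) -> value
flat_dict = {(outer, inner): value
             for outer, inner_dict in dbl_dict.items()
             for inner, value in inner_dict.items()}

# zip pairs the two columns row by row; .get returns None for unknown pairs
df['new_col'] = [flat_dict.get(pair) for pair in zip(df['Col1'], df['Col2'])]
print(df)
```

The flattening is done once up front, so each row costs a single dictionary lookup instead of a Python function call through apply.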

The easiest option, in my opinion, is to build a new DataFrame from your nested dictionary and unstack it into a Series, which you can then join with your original DataFrame, like so:

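# Outer keys become columns and inner keys the index; unstack() gives a Series keyed by (Col1, Col2)
s = pd.DataFrame(dbl_dict).unstack().rename_axis(['Col1', 'Col2']).rename('new_column')
print(s)
# Left-join the lookup Series onto the original frame using both key columns
df = df.join(s, on=['Col1', 'Col2'])
print(df)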
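For reference, here is a self-contained sketch of the join approach that also shows what happens to a key pair that is missing from the dictionary. The extra row (5, 'A') is made up for illustration and is not part of the original data; the result is kept in a separate `result` variable, another name assumed here.

```python
import pandas as pd

# The last row (5, 'A') is a hypothetical pair with no entry in dbl_dict
df = pd.DataFrame({'Col1': [1, 2, 3, 4, 5], 'Col2': ['A', 'B', 'C', 'D', 'A']})
dbl_dict = {1: {'A': 'THIS', 'B': 'blah', 'C': 'blah', 'D': 'blah'},
            2: {'A': 'blah', 'B': 'HAS', 'C': 'blah', 'D': 'blah'},
            3: {'A': 'blah', 'B': 'blah', 'C': 'ME', 'D': 'blah'},
            4: {'A': 'blah', 'B': 'blah', 'C': 'blah', 'D': 'STUMPED'}}

# Series with a two-level index named after the columns it will be joined on
s = (pd.DataFrame(dbl_dict)
     .unstack()
     .rename_axis(['Col1', 'Col2'])
     .rename('new_column'))

# join() is a left join by default, so every row of df is kept
result = df.join(s, on=['Col1', 'Col2'])
print(result)
```

Because the join keeps all rows of df, the unmatched pair ends up with a missing value in new_column rather than raising an error.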

I've created a tiny (2-line) custom function for you to use, which seems to solve the case. Of course, it can be improved to catch errors and handle specific cases.

import pandas as pd

data = {'col_1': [1, 2, 3, 4], 'col_2': ['A', 'B', 'C', 'D']}
df = pd.DataFrame(data)
dbl_dict = {1: {'A': 'THIS', 'B': 'blah', 'C': 'blah', 'D': 'blah'},
            2: {'A': 'blah', 'B': 'HAS' , 'C': 'blah', 'D': 'blah'},
            3: {'A': 'blah', 'B': 'blah', 'C': 'ME'  , 'D': 'blah'},
            4: {'A': 'blah', 'B': 'blah', 'C': 'blah', 'D': 'STUMPED'},}

def maperino(dict_name, key_1, key_2):
    # Pair the key columns positionally with zip, so the lookup does not
    # depend on the dataframe having a default 0..n-1 index
    return [dict_name[k1][k2] for k1, k2 in zip(key_1, key_2)]

df['col_3'] = maperino(dbl_dict, df['col_1'], df['col_2'])
print(df)

Output:

   col_1 col_2    col_3
0      1     A     THIS
1      2     B      HAS
2      3     C       ME
3      4     D  STUMPED
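As one possible improvement along those lines, the lookup can fall back to a default instead of raising KeyError when a key is missing. This is a sketch only: `maperino_safe`, its `default` parameter, and the small sample data below (including the unknown key 9) are assumptions made for illustration.

```python
import pandas as pd

def maperino_safe(dict_name, key_1, key_2, default=None):
    """Hypothetical variant of maperino that tolerates missing keys."""
    if len(key_1) != len(key_2):
        raise ValueError('key_1 and key_2 must have the same length')
    # A missing outer key yields {}, so the inner .get falls back to default
    return [dict_name.get(k1, {}).get(k2, default)
            for k1, k2 in zip(key_1, key_2)]

# Made-up sample: the outer key 9 does not exist in dbl_dict
df = pd.DataFrame({'col_1': [1, 2, 9], 'col_2': ['A', 'B', 'C']})
dbl_dict = {1: {'A': 'THIS', 'B': 'blah'},
            2: {'A': 'blah', 'B': 'HAS'}}

df['col_3'] = maperino_safe(dbl_dict, df['col_1'], df['col_2'], default='Other')
print(df)
```

The chained .get calls mirror the `dbl_dict.get(1, {}).get('A', 'Other')` pattern from the question.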
