How to create a conditional column that refers to the current columns of a DataFrame and a dictionary, without using Python loops?

Question

I have a DataFrame:

import pandas as pd

df = pd.DataFrame({"type":  ["A" ,"A1" ,"A" ,"A1","B" ],
                  "group":  ["g1", "g2","g2","g2","g1"]})

And I have a dictionary:

 dic ={"AlphaA": {"A":  {"g1":"A_GRP1",  "g2":"A_GRP2"},
                  "A1": {"g1":"A1_GRP1", "g2":"A1_GRP2"}},
       "AlphaB": {"B":  {"g1":"B_GRP1",  "g2":"B_GRP2"}},
      }

I have to create a column named "value", whose entries are looked up in the dictionary using the "type" and "group" columns of the DataFrame.

Conditions to be applied:

If type is "A" or "A1", it should refer to the dictionary key "AlphaA", get the value for the respective group, and assign it to the new column.
If type is "B", it should refer to the dictionary key "AlphaB" and get the value for the respective group.

Example for row one:
type is "A", hence it refers to the dictionary key "AlphaA"
group is "g1"
therefore:

dic["AlphaA"]["A"]["g1"]          # would be the answer

Required output:

 final_df = pd.DataFrame({"type" :  ["A" ,"A1" ,"A" ,"A1","B" ],
                          "group":  ["g1", "g2","g2","g2","g1"],
                          "value":  ["A_GRP1", "A1_GRP2", "A_GRP2",
                                     "A1_GRP2", "B_GRP1"]})

I was able to achieve this using loops, but it takes a lot of time,
so I am looking for a faster technique.
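
The loop itself is not shown in the question. A minimal sketch of what such a baseline might look like is given here, mainly as a reference to check the faster answers against. The helper name lookup_value and the variable baseline are hypothetical, not taken from the question.

```python
import pandas as pd

df = pd.DataFrame({"type":  ["A", "A1", "A", "A1", "B"],
                   "group": ["g1", "g2", "g2", "g2", "g1"]})

dic = {"AlphaA": {"A":  {"g1": "A_GRP1",  "g2": "A_GRP2"},
                  "A1": {"g1": "A1_GRP1", "g2": "A1_GRP2"}},
       "AlphaB": {"B":  {"g1": "B_GRP1",  "g2": "B_GRP2"}}}


def lookup_value(type_, group):
    """Hypothetical helper: find the outer key holding type_, then index by group."""
    for inner in dic.values():
        if type_ in inner:
            return inner[type_][group]
    return None  # type_ is not present under any outer key


# Row-by-row Python loop: simple, but slow on large frames
baseline = [lookup_value(t, g) for t, g in zip(df["type"], df["group"])]
```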

Answer 1

Assuming dic is the input dictionary, you can merge the inner dictionaries into a single dictionary (with the help of ChainMap), convert it to a DataFrame, unstack it to a Series, and merge:

from collections import ChainMap
s = pd.DataFrame(dict(ChainMap(*dic.values()))).unstack()

# without ChainMap
# d = {k: v for d in dic.values() for k,v in d.items()}
# pd.DataFrame(d).unstack()

out = df.merge(s.rename('value'), left_on=['type', 'group'], right_index=True)

Output:

  type group    value
0    A    g1   A_GRP1
1   A1    g2  A1_GRP2
3   A1    g2  A1_GRP2
2    A    g2   A_GRP2
4    B    g1   B_GRP1
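
A self-contained sketch of the same approach, using the df and dic from the question, with the intermediate steps spelled out. Passing how="left" is an addition here and not part of the answer. It is one way to keep the rows of df in their original order.

```python
from collections import ChainMap

import pandas as pd

df = pd.DataFrame({"type":  ["A", "A1", "A", "A1", "B"],
                   "group": ["g1", "g2", "g2", "g2", "g1"]})

dic = {"AlphaA": {"A":  {"g1": "A_GRP1",  "g2": "A_GRP2"},
                  "A1": {"g1": "A1_GRP1", "g2": "A1_GRP2"}},
       "AlphaB": {"B":  {"g1": "B_GRP1",  "g2": "B_GRP2"}}}

# Drop the outer level: {"A": {...}, "A1": {...}, "B": {...}}
flat = dict(ChainMap(*dic.values()))

# Columns are types and rows are groups; unstack() turns this into a
# Series indexed by (type, group)
s = pd.DataFrame(flat).unstack().rename("value")

# Left join on the two key columns against the Series' MultiIndex
out = df.merge(s, left_on=["type", "group"], right_index=True, how="left")
```

Here s can be inspected on its own, for example s.loc[("A1", "g2")], to confirm that the lookup table holds what you expect before merging.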

Answer 2

Use DataFrame.join with a Series created from the dictionary by a dict comprehension:

# dic is the nested dictionary from the question; keys become (type, group) tuples
d1 = {(k1, k2): v2 for k, v in dic.items() for k1, v1 in v.items() for k2, v2 in v1.items()}
df = df.join(pd.Series(d1).rename('value'), on=['type','group'])
print (df)
  type group    value
0    A    g1   A_GRP1
1   A1    g2  A1_GRP2
2    A    g2   A_GRP2
3   A1    g2  A1_GRP2
4    B    g1   B_GRP1
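
One thing worth knowing about this approach is how it treats pairs that are absent from the dictionary. The sketch below uses a made-up frame df2 (not from the question) containing a type "C" that dic does not have. Since join defaults to a left join, the row is kept and its value is NaN.

```python
import pandas as pd

dic = {"AlphaA": {"A":  {"g1": "A_GRP1",  "g2": "A_GRP2"},
                  "A1": {"g1": "A1_GRP1", "g2": "A1_GRP2"}},
       "AlphaB": {"B":  {"g1": "B_GRP1",  "g2": "B_GRP2"}}}

# Flatten to {(type, group): value}; the outer keys are not needed
d1 = {(t, g): val
      for types in dic.values()
      for t, groups in types.items()
      for g, val in groups.items()}

# Hypothetical input: ("C", "g1") has no entry in dic
df2 = pd.DataFrame({"type": ["A", "C"], "group": ["g2", "g1"]})

# Tuple keys give the Series a two-level MultiIndex to join on
res = df2.join(pd.Series(d1).rename("value"), on=["type", "group"])
```

If missing pairs should be an error instead, checking res["value"].isna().any() after the join is one simple guard.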

Answer 3

You can remove the outer key of the original dictionary and then use apply on the rows:

# dic is the nested dictionary from the question
d = {k: v for vs in dic.values() for k, v in vs.items()}
df['value'] = (df.assign(value=df['type'].map(d))
               .apply(lambda row: row['value'][row['group']], axis=1)
               )

print(d)

{'A': {'g1': 'A_GRP1', 'g2': 'A_GRP2'}, 'A1': {'g1': 'A1_GRP1', 'g2': 'A1_GRP2'}, 'B': {'g1': 'B_GRP1', 'g2': 'B_GRP2'}}

print(df)

  type group    value
0    A    g1   A_GRP1
1   A1    g2  A1_GRP2
2    A    g2   A_GRP2
3   A1    g2  A1_GRP2
4    B    g1   B_GRP1
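
Note that apply with axis=1 still calls a Python function once per row. A possible alternative, not from any of the answers, is to build a lookup Series keyed by (type, group) and reindex it with the frame's key columns. The names lookup and keys are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({"type":  ["A", "A1", "A", "A1", "B"],
                   "group": ["g1", "g2", "g2", "g2", "g1"]})

dic = {"AlphaA": {"A":  {"g1": "A_GRP1",  "g2": "A_GRP2"},
                  "A1": {"g1": "A1_GRP1", "g2": "A1_GRP2"}},
       "AlphaB": {"B":  {"g1": "B_GRP1",  "g2": "B_GRP2"}}}

# Lookup table: Series with a (type, group) MultiIndex
lookup = pd.Series({(t, g): val
                    for types in dic.values()
                    for t, groups in types.items()
                    for g, val in groups.items()})

# One (type, group) key per row of df, in row order
keys = pd.MultiIndex.from_frame(df[["type", "group"]])

# reindex aligns the lookup to the keys; to_numpy() drops the MultiIndex
# so the values are assigned positionally
df["value"] = lookup.reindex(keys).to_numpy()
```

As with the join-based answer, pairs missing from the dictionary come back as NaN rather than raising a KeyError.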

Answer 1: score 3, accepted, 2022-05-31 09:35:57
Answer 2: score 0, 2022-05-31 09:35:39
Answer 3: score 0, 2022-05-31 09:41:49
