How can I create a conditional column from existing DataFrame columns and a dictionary without using a loop in Python?
I have a DataFrame:
import pandas as pd
df = pd.DataFrame({"type": ["A" ,"A1" ,"A" ,"A1","B" ],
"group": ["g1", "g2","g2","g2","g1"]})
And I have a dictionary:
dic ={"AlphaA": {"A": {"g1":"A_GRP1", "g2":"A_GRP2"},
"A1": {"g1":"A1_GRP1", "g2":"A1_GRP2"}},
"AlphaB": {"B": {"g1":"B_GRP1", "g2":"B_GRP2"}},
}
I have to create a column named "value", whose entries are looked up in the dictionary using the DataFrame's columns.
Conditions to be applied:
Example for row one:
type is "A", hence it refers to the dictionary key "AlphaA"
group is "g1"
therefore:
dic["AlphaA"]["A"]["g1"]  # would be the answer
Required output:
final_df = pd.DataFrame({"type" : ["A" ,"A1" ,"A" ,"A1","B" ],
"group": ["g1", "g2","g2","g2","g1"],
"value": ["A_GRP1", "A1_GRP2", "A_GRP2",
"A1_GRP2", "B_GRP1"]})
I was able to achieve this using loops, but it takes a lot of time,
so I am looking for a faster technique.
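For reference, the lookup rule described above can be written as a plain row-by-row loop. This is only a sketch of the kind of loop-based baseline the question mentions, not the asker's actual code; the helper name `lookup_value` is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"type": ["A", "A1", "A", "A1", "B"],
                   "group": ["g1", "g2", "g2", "g2", "g1"]})

dic = {"AlphaA": {"A": {"g1": "A_GRP1", "g2": "A_GRP2"},
                  "A1": {"g1": "A1_GRP1", "g2": "A1_GRP2"}},
       "AlphaB": {"B": {"g1": "B_GRP1", "g2": "B_GRP2"}}}


def lookup_value(type_, group):
    """Hypothetical helper: find the outer entry that contains type_, then index by group."""
    for inner in dic.values():
        if type_ in inner:
            return inner[type_].get(group)
    return None  # no outer entry contains this type


# Explicit loop over the rows: this is the slow pattern the answers below avoid.
values = []
for t, g in zip(df["type"], df["group"]):
    values.append(lookup_value(t, g))

df["value"] = values
print(df)
```

Every row triggers a Python-level search through the outer keys, which is why this pattern becomes slow on large frames.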
Assuming dic is the input dictionary, you can merge the dictionary values into a single dictionary (with the help of ChainMap), convert it to a DataFrame, unstack it to a Series, and merge:
from collections import ChainMap
s = pd.DataFrame(dict(ChainMap(*dic.values()))).unstack()
# without ChainMap
# d = {k: v for d in dic.values() for k,v in d.items()}
# pd.DataFrame(d).unstack()
out = df.merge(s.rename('value'), left_on=['type', 'group'], right_index=True)
Output:
  type group    value
0    A    g1   A_GRP1
1   A1    g2  A1_GRP2
3   A1    g2  A1_GRP2
2    A    g2   A_GRP2
4    B    g1   B_GRP1
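Note that the rows in the output above are not in the original order (index 3 appears before index 2). If the original row order matters, one option is to request a left merge explicitly. The following is a minimal sketch under that assumption; the name `out_left` is introduced here for illustration only.

```python
from collections import ChainMap

import pandas as pd

df = pd.DataFrame({"type": ["A", "A1", "A", "A1", "B"],
                   "group": ["g1", "g2", "g2", "g2", "g1"]})

dic = {"AlphaA": {"A": {"g1": "A_GRP1", "g2": "A_GRP2"},
                  "A1": {"g1": "A1_GRP1", "g2": "A1_GRP2"}},
       "AlphaB": {"B": {"g1": "B_GRP1", "g2": "B_GRP2"}}}

# Drop the outer level, then reshape to a Series indexed by (type, group).
s = pd.DataFrame(dict(ChainMap(*dic.values()))).unstack().rename("value")

# how="left" keeps every row of df, in df's own order;
# pairs that are missing from the dictionary would get NaN.
out_left = df.merge(s, left_on=["type", "group"], right_index=True, how="left")
print(out_left)
```

A left merge also avoids silently dropping rows whose (type, group) pair is absent from the dictionary, which the default inner merge would do.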
Use DataFrame.join with a Series created from the dictionary by a dict comprehension:
# flatten the nested dictionary to {(type, group): value}
d1 = {(k1, k2): v2 for v in dic.values() for k1, v1 in v.items() for k2, v2 in v1.items()}
df = df.join(pd.Series(d1).rename('value'), on=['type','group'])
print(df)
  type group    value
0    A    g1   A_GRP1
1   A1    g2  A1_GRP2
2    A    g2   A_GRP2
3   A1    g2  A1_GRP2
4    B    g1   B_GRP1
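Because DataFrame.join performs a left join by default, a (type, group) pair that does not exist in the dictionary should produce NaN rather than an error. A small sketch to illustrate this; the frame `df2` and its "C" row are made up for the example and are not part of the question's data.

```python
import pandas as pd

dic = {"AlphaA": {"A": {"g1": "A_GRP1", "g2": "A_GRP2"},
                  "A1": {"g1": "A1_GRP1", "g2": "A1_GRP2"}},
       "AlphaB": {"B": {"g1": "B_GRP1", "g2": "B_GRP2"}}}

# Flatten to {(type, group): value}; tuple keys become a two-level MultiIndex.
flat = {(t, g): v
        for inner in dic.values()
        for t, groups in inner.items()
        for g, v in groups.items()}

# Hypothetical frame: the pair ("C", "g1") is absent from dic.
df2 = pd.DataFrame({"type": ["A", "C"], "group": ["g2", "g1"]})

res = df2.join(pd.Series(flat, name="value"), on=["type", "group"])
print(res)
```

The unmatched row is kept with a missing value, so it can be detected afterwards with `res["value"].isna()`.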
You can remove the outer key of the original dictionary and then use apply on the rows:
d = {k:v for vs in dic.values() for k, v in vs.items()}
df['value'] = (df.assign(value=df['type'].map(d))
.apply(lambda row: row['value'][row['group']], axis=1)
)
print(d)
{'A': {'g1': 'A_GRP1', 'g2': 'A_GRP2'}, 'A1': {'g1': 'A1_GRP1', 'g2': 'A1_GRP2'}, 'B': {'g1': 'B_GRP1', 'g2': 'B_GRP2'}}
print(df)
  type group    value
0    A    g1   A_GRP1
1   A1    g2  A1_GRP2
2    A    g2   A_GRP2
3   A1    g2  A1_GRP2
4    B    g1   B_GRP1
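Since apply with axis=1 still calls a Python function once per row, the same flattened dictionary can also be indexed directly in a list comprehension, which is often quicker for this kind of lookup. This is a sketch of that variation, not part of the original answer, and it assumes every (type, group) pair exists in the dictionary (a missing pair raises KeyError).

```python
import pandas as pd

df = pd.DataFrame({"type": ["A", "A1", "A", "A1", "B"],
                   "group": ["g1", "g2", "g2", "g2", "g1"]})

dic = {"AlphaA": {"A": {"g1": "A_GRP1", "g2": "A_GRP2"},
                  "A1": {"g1": "A1_GRP1", "g2": "A1_GRP2"}},
       "AlphaB": {"B": {"g1": "B_GRP1", "g2": "B_GRP2"}}}

# Remove the outer key: {type: {group: value}}
d = {k: v for vs in dic.values() for k, v in vs.items()}

# Two direct dictionary lookups per row, without building a row object.
df["value"] = [d[t][g] for t, g in zip(df["type"], df["group"])]
print(df)
```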