I have a DataFrame:
import pandas as pd
df = pd.DataFrame({"type": ["A" ,"A1" ,"A" ,"A1","B" ],
"group": ["g1", "g2","g2","g2","g1"]})
And I have a dictionary:
dic ={"AlphaA": {"A": {"g1":"A_GRP1", "g2":"A_GRP2"},
"A1": {"g1":"A1_GRP1", "g2":"A1_GRP2"}},
"AlphaB": {"B": {"g1":"B_GRP1", "g2":"B_GRP2"}},
}
I have to create a column named "value", whose entries are looked up in the dictionary using each row's type and group.
Conditions to be applied:
Example for row one:
type is "A", hence it refers to dictionary key "AlphaA"
group is "g1"
therefore:
dic["AlphaA"]["A"]["g1"] # would be the answer
Required Output
final_df = pd.DataFrame({"type" : ["A" ,"A1" ,"A" ,"A1","B" ],
"group": ["g1", "g2","g2","g2","g1"],
"value": ["A_GRP1", "A1_GRP2", "A_GRP2",
"A1_GRP2", "B_GRP1"]})
I was able to achieve this using loops, but it is taking a lot of time,
hence I am looking for a faster technique.
Assuming dic is the input dictionary, you can merge the dictionary values into a single dictionary (with the help of ChainMap), convert it to a DataFrame, unstack it to a Series, and merge:
from collections import ChainMap
s = pd.DataFrame(dict(ChainMap(*dic.values()))).unstack()
# without ChainMap
# d = {k: v for d in dic.values() for k,v in d.items()}
# pd.DataFrame(d).unstack()
out = df.merge(s.rename('value'), left_on=['type', 'group'], right_index=True)
output:
type group value
0 A g1 A_GRP1
1 A1 g2 A1_GRP2
3 A1 g2 A1_GRP2
2 A g2 A_GRP2
4 B g1 B_GRP1
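Note that the rows in the output above are not in the original order (index 3 comes before index 2). If the input order matters, one option is a left merge, which keeps the left frame's key order. This is a minimal sketch that reuses the question's df and dic; the how="left" argument is the only change to the approach, and calling sort_index() on the result would be another way to restore the order.

```python
import pandas as pd

df = pd.DataFrame({"type": ["A", "A1", "A", "A1", "B"],
                   "group": ["g1", "g2", "g2", "g2", "g1"]})
dic = {"AlphaA": {"A": {"g1": "A_GRP1", "g2": "A_GRP2"},
                  "A1": {"g1": "A1_GRP1", "g2": "A1_GRP2"}},
       "AlphaB": {"B": {"g1": "B_GRP1", "g2": "B_GRP2"}}}

# Drop the outer "Alpha*" level: {type: {group: value}}
flat = {k: v for inner in dic.values() for k, v in inner.items()}

# Columns are types, rows are groups; unstack() yields a Series
# indexed by (type, group)
s = pd.DataFrame(flat).unstack().rename("value")

# how="left" keeps every row of df, in df's own order
out = df.merge(s, left_on=["type", "group"], right_index=True, how="left")
print(out)
```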
Use DataFrame.join with a Series created from the dictionary by a dict comprehension:
d1 = {(k1, k2): v2 for k, v in dic.items() for k1, v1 in v.items() for k2, v2 in v1.items()}
df = df.join(pd.Series(d1).rename('value'), on=['type','group'])
print (df)
type group value
0 A g1 A_GRP1
1 A1 g2 A1_GRP2
2 A g2 A_GRP2
3 A1 g2 A1_GRP2
4 B g1 B_GRP1
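Because join defaults to a left join, a (type, group) pair that is missing from the dictionary ends up as NaN instead of raising an error. The sketch below illustrates this with a hypothetical unmapped type "C" and a hypothetical placeholder "UNKNOWN"; neither is part of the question's data.

```python
import pandas as pd

dic = {"AlphaA": {"A": {"g1": "A_GRP1", "g2": "A_GRP2"},
                  "A1": {"g1": "A1_GRP1", "g2": "A1_GRP2"}},
       "AlphaB": {"B": {"g1": "B_GRP1", "g2": "B_GRP2"}}}

# "C" is a made-up type with no entry in dic (assumption for illustration)
df = pd.DataFrame({"type": ["A", "C"], "group": ["g1", "g1"]})

# Flatten to {(type, group): value}; tuple keys become a MultiIndex
d1 = {(k1, k2): v2
      for v in dic.values()
      for k1, v1 in v.items()
      for k2, v2 in v1.items()}

out = df.join(pd.Series(d1, name="value"), on=["type", "group"])

# Unmatched rows are NaN after the join; replace them with a placeholder
out["value"] = out["value"].fillna("UNKNOWN")
print(out)
```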
You can remove the outer key of the original dictionary and then use apply on the rows:
d = {k:v for vs in dic.values() for k, v in vs.items()}
df['value'] = (df.assign(value=df['type'].map(d))
.apply(lambda row: row['value'][row['group']], axis=1)
)
print(d)
{'A': {'g1': 'A_GRP1', 'g2': 'A_GRP2'}, 'A1': {'g1': 'A1_GRP1', 'g2': 'A1_GRP2'}, 'B': {'g1': 'B_GRP1', 'g2': 'B_GRP2'}}
print(df)
type group value
0 A g1 A_GRP1
1 A1 g2 A1_GRP2
2 A g2 A_GRP2
3 A1 g2 A1_GRP2
4 B g1 B_GRP1
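Since apply with axis=1 calls a Python function once per row, it may not be much faster than an explicit loop. A possible alternative, sketched here as a suggestion and not benchmarked, is to flatten the dictionary to (type, group) keys and look the pairs up in a list comprehension. The names flat and the use of dict.get (which returns None for missing pairs) are choices made for this sketch.

```python
import pandas as pd

df = pd.DataFrame({"type": ["A", "A1", "A", "A1", "B"],
                   "group": ["g1", "g2", "g2", "g2", "g1"]})
dic = {"AlphaA": {"A": {"g1": "A_GRP1", "g2": "A_GRP2"},
                  "A1": {"g1": "A1_GRP1", "g2": "A1_GRP2"}},
       "AlphaB": {"B": {"g1": "B_GRP1", "g2": "B_GRP2"}}}

# Flatten to a single-level dict keyed by (type, group)
flat = {(t, g): val
        for inner in dic.values()
        for t, groups in inner.items()
        for g, val in groups.items()}

# One dictionary lookup per row; .get returns None for unknown pairs
df["value"] = [flat.get(key) for key in zip(df["type"], df["group"])]
print(df)
```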