
How can I create a conditional column from existing DataFrame columns and a dictionary without using a loop in Python?

I have a DataFrame:

import pandas as pd

df = pd.DataFrame({"type":  ["A" ,"A1" ,"A" ,"A1","B" ],
                  "group":  ["g1", "g2","g2","g2","g1"]})

And I have a dictionary:

 dic ={"AlphaA": {"A":  {"g1":"A_GRP1",  "g2":"A_GRP2"},
                  "A1": {"g1":"A1_GRP1", "g2":"A1_GRP2"}},
       "AlphaB": {"B":  {"g1":"B_GRP1",  "g2":"B_GRP2"}},
      }

I have to create a column named "value", whose values are looked up in the dictionary using the "type" and "group" columns of the DataFrame.

Conditions to be applied:

  1. If type is "A" or "A1", it should refer to the dictionary key AlphaA, get the value for the respective group, and assign it to the new column.
  2. If type is "B", it should refer to the dictionary key AlphaB and get the value for the respective group.

Example for row one:
type is "A", hence referring to the dictionary key "AlphaA"
group is "g1"
therefore:

dic["AlphaA"]["A"]["g1"]          # would be the answer

Required Output

 final_df = pd.DataFrame({"type" :  ["A" ,"A1" ,"A" ,"A1","B" ],
                          "group":  ["g1", "g2","g2","g2","g1"],
                          "value":  ["A_GRP1", "A1_GRP2", "A_GRP2",
                                     "A1_GRP2", "B_GRP1"]})

I was able to achieve this using loops, but it takes a lot of time,
so I am looking for a faster technique.
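
For reference, the row-by-row lookup described above might look like the sketch below. The helper name lookup_value is hypothetical and not from the original post; it only illustrates the rule that the faster approaches have to reproduce.

```python
import pandas as pd

df = pd.DataFrame({"type":  ["A", "A1", "A", "A1", "B"],
                   "group": ["g1", "g2", "g2", "g2", "g1"]})

dic = {"AlphaA": {"A":  {"g1": "A_GRP1",  "g2": "A_GRP2"},
                  "A1": {"g1": "A1_GRP1", "g2": "A1_GRP2"}},
       "AlphaB": {"B":  {"g1": "B_GRP1",  "g2": "B_GRP2"}}}


def lookup_value(row_type, row_group):
    """Hypothetical helper: find the sub-dictionary holding row_type, then read the group."""
    for types in dic.values():          # the "AlphaA" or "AlphaB" sub-dictionary
        if row_type in types:
            return types[row_type].get(row_group)
    return None                         # type not present under any outer key


# One Python-level call per row: correct, but slow on large frames.
df["value"] = [lookup_value(t, g) for t, g in zip(df["type"], df["group"])]
```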

Assuming dic is the input dictionary, you can merge its values into a single dictionary (with the help of ChainMap), convert that to a DataFrame, unstack it to a Series indexed by (type, group), and merge:

from collections import ChainMap
s = pd.DataFrame(dict(ChainMap(*dic.values()))).unstack()

# without ChainMap
# d = {k: v for inner in dic.values() for k, v in inner.items()}
# s = pd.DataFrame(d).unstack()

out = df.merge(s.rename('value'), left_on=['type', 'group'], right_index=True)

output:

  type group    value
0    A    g1   A_GRP1
1   A1    g2  A1_GRP2
3   A1    g2  A1_GRP2
2    A    g2   A_GRP2
4    B    g1   B_GRP1
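
Note that the rows in the output above are not in their original order. If the order matters, one option (a sketch, assuming a left merge suits your data) is to pass how='left', which keeps the rows of df in place and leaves NaN where a (type, group) pair has no entry in the dictionary:

```python
from collections import ChainMap

import pandas as pd

df = pd.DataFrame({"type":  ["A", "A1", "A", "A1", "B"],
                   "group": ["g1", "g2", "g2", "g2", "g1"]})

dic = {"AlphaA": {"A":  {"g1": "A_GRP1",  "g2": "A_GRP2"},
                  "A1": {"g1": "A1_GRP1", "g2": "A1_GRP2"}},
       "AlphaB": {"B":  {"g1": "B_GRP1",  "g2": "B_GRP2"}}}

# Series indexed by (type, group), built as in the answer above
s = pd.DataFrame(dict(ChainMap(*dic.values()))).unstack()

# how='left' keeps every row of df, in df's order
out = df.merge(s.rename("value"), left_on=["type", "group"],
               right_index=True, how="left")
print(out)
```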

Use DataFrame.join with a Series created from the dictionary by a dict comprehension:

d1 = {(k1, k2): v2 for v in dic.values() for k1, v1 in v.items() for k2, v2 in v1.items()}
df = df.join(pd.Series(d1).rename('value'), on=['type','group'])
print (df)
  type group    value
0    A    g1   A_GRP1
1   A1    g2  A1_GRP2
2    A    g2   A_GRP2
3   A1    g2  A1_GRP2
4    B    g1   B_GRP1
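
Because DataFrame.join defaults to a left join, rows whose (type, group) pair is absent from the dictionary are kept, with NaN in the new column. The sketch below illustrates this with an extra row of type "C", which is an assumption added for illustration and not part of the original data:

```python
import pandas as pd

dic = {"AlphaA": {"A":  {"g1": "A_GRP1",  "g2": "A_GRP2"},
                  "A1": {"g1": "A1_GRP1", "g2": "A1_GRP2"}},
       "AlphaB": {"B":  {"g1": "B_GRP1",  "g2": "B_GRP2"}}}

# "C" is a made-up type with no entry in dic
df_extra = pd.DataFrame({"type":  ["A", "B", "C"],
                         "group": ["g1", "g2", "g1"]})

# Flatten to {(type, group): value}; tuple keys become a MultiIndex
d1 = {(k1, k2): v2
      for v in dic.values()
      for k1, v1 in v.items()
      for k2, v2 in v1.items()}

out = df_extra.join(pd.Series(d1).rename("value"), on=["type", "group"])

# Rows without a match can be found (and then filled or dropped) via isna()
missing = out[out["value"].isna()]
print(missing)
```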

You can remove the outer keys of the original dictionary and then use apply on the rows:

d = {k: v for vs in dic.values() for k, v in vs.items()}
df['value'] = (df.assign(value=df['type'].map(d))
               .apply(lambda row: row['value'][row['group']], axis=1)
               )
print(d)

{'A': {'g1': 'A_GRP1', 'g2': 'A_GRP2'}, 'A1': {'g1': 'A1_GRP1', 'g2': 'A1_GRP2'}, 'B': {'g1': 'B_GRP1', 'g2': 'B_GRP2'}}

print(df)

  type group    value
0    A    g1   A_GRP1
1   A1    g2  A1_GRP2
2    A    g2   A_GRP2
3   A1    g2  A1_GRP2
4    B    g1   B_GRP1
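
Keep in mind that apply with axis=1 calls the lambda once per row, so on large frames it may not be much faster than an explicit loop. As a possible alternative, here is a sketch that flattens the dictionary into a Series keyed by (type, group) and aligns it to the rows with reindex. The variable names flat and keys are chosen here for illustration:

```python
import pandas as pd

df = pd.DataFrame({"type":  ["A", "A1", "A", "A1", "B"],
                   "group": ["g1", "g2", "g2", "g2", "g1"]})

dic = {"AlphaA": {"A":  {"g1": "A_GRP1",  "g2": "A_GRP2"},
                  "A1": {"g1": "A1_GRP1", "g2": "A1_GRP2"}},
       "AlphaB": {"B":  {"g1": "B_GRP1",  "g2": "B_GRP2"}}}

# Series with a (type, group) MultiIndex, one entry per dictionary leaf
flat = pd.Series({(t, g): v
                  for types in dic.values()
                  for t, groups in types.items()
                  for g, v in groups.items()})

# One key per row of df; reindex looks all of them up in a single call
keys = pd.MultiIndex.from_frame(df[["type", "group"]])
df["value"] = flat.reindex(keys).to_numpy()
print(df)
```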


 