简体   繁体   中英

merging pandas dataframe on multyindex columns

I was tring to apply a function on my dataframe adding information regarding the average performance of each id i have. the problem is that applying this function using apply takes a lot of time (i stopped the process after 20 minuts) and i need to do it a few times over. so i thought to try taking out the information to a new data frame and then merge the two together. The problem is my new dataframe has multyindex and i'm trying to merge on that index. can't seem to find the way to do it. any suggestions?

the new data frame look like this:

                              compliance_detail
    inspector_name  compliance  
    Addison, Michael    0.0 0.431203
                        1.0 0.034050
    Ahmad, Muna I       0.0 0.731405
                        1.0 0.052342
    Anderson, Trevis    0.0 0.166667

and i want to merge on the inspector's name and compliance since i have this two in my original data frame.

i want the merged data frame to have for every inspector his rating for complaince==0 and complaince==1. so it will look something like this:

    ticket_id   inspector_name  positive_complaince negetive_complaince
    0   22056   Addison, Michael 0.034050            0.431203
    1   22057   Ahmad, Muna I    0.052342            0.731405

You can use unstack to reshape the data frame. Also,based on your given output, I assume you want to to drop inspector_name where one of the compliance (positive or negative) is not available.

## sample data
df = pd.DataFrame({'inspector_name':['Addison, Michael','Addison, Michael','Ahmad, Muna I','Ahmad, Muna I','Anderson, Trevis'],
                   'compliance': [0.0,1.0,0.0,1.0,0.0],
                   'compliance_detail':[0.4312, 0.0340, 0.7314,0.052,0.1666]})

df = df.set_index(['inspector_name','compliance'])

## solution

df = df.unstack().dropna().reset_index()
df.columns = ['inspector_name','negative_complaince','positive_complaince']
df

    inspector_name  negative_complaince     positive_complaince
0   Addison, Michael       0.4312                  0.034
1   Ahmad, Muna I          0.7314                  0.052

ok i found the answer. not sure if its the best way but it works. first i take the new data frame and split it into two data frames, one with compliance==0 and the other with compliance==1. then i take the index of each data frame convert it into a list and save it in a parameter. now i take my basic data frame and merge it with the two new dataframes i created. the "right_on" parameter in the merge function is now equal to the name of the list saved on the side for the two dataframes. and it works :) hope this could help someone.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM