Creating a column variable taking the mean of a variable conditional on two other variables

I have a data frame that shows the mean 'dwdime' for each of the given conditions:

DIMExCand_means = DIMExCand.groupby(['cycle', 'coded_state', 'party.orig', 'comtype']).mean()

I have created a pivot table from DIMExCand_means with the following command:

DIMExCand_master = pd.pivot_table(DIMExCand_means,index=["Cycle","State"])

However, some data gets lost in the process. I would like to add columns to the 'DIMExCand_master' dataframe that hold the mean 'dwdime' score for each possible combination of 'party.orig' and 'comtype', so that I end up with one entry per 'cycle'-'coded_state' pair.

Let's try:

DIMExCand_means = DIMExCand_means.reset_index()
DIMExCand_master = DIMExCand_master.reset_index()

pd.merge(DIMExCand_means, DIMExCand_master, left_on=['cycle','coded_state'], right_on=['Cycle','State'])
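For anyone who wants to see what this merge returns, here is a minimal sketch on made-up data. The toy values, the short names `means` and `master`, and the `dwdime_overall` column are my own assumptions for illustration; only the column names 'cycle', 'coded_state', 'party.orig', 'comtype' and 'dwdime' come from the question.

```python
import pandas as pd

# Toy stand-in for DIMExCand; the values are invented for illustration only.
DIMExCand = pd.DataFrame({
    "cycle": [2000, 2000, 2000, 2002],
    "coded_state": ["CA", "CA", "CA", "CA"],
    "party.orig": ["D", "D", "R", "D"],
    "comtype": ["H", "H", "H", "S"],
    "dwdime": [-1.0, -0.5, 1.0, -0.2],
})

# Mean dwdime per (cycle, state, party, committee type), as flat columns.
means = (
    DIMExCand
    .groupby(["cycle", "coded_state", "party.orig", "comtype"])["dwdime"]
    .mean()
    .reset_index()
)

# Hypothetical master table: one row per cycle-state, keyed by 'Cycle'/'State'.
master = (
    DIMExCand
    .groupby(["cycle", "coded_state"])["dwdime"]
    .mean()
    .reset_index()
    .rename(columns={"cycle": "Cycle", "coded_state": "State",
                     "dwdime": "dwdime_overall"})
)

# Same merge as above: the keys have different names on each side.
merged = pd.merge(means, master,
                  left_on=["cycle", "coded_state"],
                  right_on=["Cycle", "State"])
print(merged)
```

Note that the merged frame keeps one row per 'party.orig'/'comtype' combination and carries both sets of key columns, so it is still in long format rather than one row per cycle-state.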

Thanks!

I ended up going with:

DIMExCand_dime = pd.pivot_table(DIMExCand, values='dwdime', index=["Cycle", "State"], columns='ID', aggfunc=np.mean)
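If there is no ready-made 'ID' column, a similar wide table can probably be built by passing both grouping variables as `columns`. This is only a sketch on made-up data: the toy values and the `dwdime_<party>_<comtype>` naming are my own assumptions, and it uses the lowercase 'cycle'/'coded_state' names from the groupby rather than 'Cycle'/'State'.

```python
import pandas as pd

# Toy stand-in for DIMExCand; the values are invented for illustration only.
DIMExCand = pd.DataFrame({
    "cycle": [2000, 2000, 2000, 2002],
    "coded_state": ["CA", "CA", "CA", "CA"],
    "party.orig": ["D", "D", "R", "D"],
    "comtype": ["H", "H", "H", "S"],
    "dwdime": [-1.0, -0.5, 1.0, -0.2],
})

# One row per cycle-state, one column per party/comtype combination.
wide = pd.pivot_table(
    DIMExCand,
    values="dwdime",
    index=["cycle", "coded_state"],
    columns=["party.orig", "comtype"],
    aggfunc="mean",
)

# Flatten the two-level column index into names such as "dwdime_D_H".
wide.columns = [f"dwdime_{party}_{ctype}" for party, ctype in wide.columns]
wide = wide.reset_index()
print(wide)
```

Combinations that never occur within a given cycle-state show up as NaN in that row.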
