Creating a column variable taking the mean of a variable conditional on two other variables
I have a data frame that shows the mean 'dwdime' for each of the given conditions:
DIMExCand_means = DIMExCand.groupby(['cycle', 'coded_state', 'party.orig', 'comtype']).mean()
I have created a pivot table from DIMExCand_means with the following command:
DIMExCand_master = pd.pivot_table(DIMExCand_means,index=["Cycle","State"])
However, some data gets lost in the process. I would like to add columns to the 'DIMExCand_master' dataframe that contain the mean 'dwdime' score for each possible combination of 'party.orig' and 'comtype', as this will allow me to have one entry per 'cycle'-'coded_state' pair.
Let's try:
DIMExCand_means = DIMExCand_means.reset_index()
DIMExCand_master = DIMExCand_master.reset_index()
pd.merge(DIMExCand_means, DIMExCand_master, left_on=['cycle','coded_state'], right_on=['Cycle','State'])
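To see what this merge produces, here is a minimal sketch on toy data. The values, the 'dwdime_overall' column, and the way the master table is built are assumptions made for illustration, since the real data is not shown. The merged result stays in long format, with one row per party/committee-type combination rather than one row per cycle and state.

```python
import pandas as pd

# Toy data; the values are invented for illustration only.
DIMExCand = pd.DataFrame({
    "cycle": [2000, 2000, 2000, 2002],
    "coded_state": ["CA", "CA", "CA", "NY"],
    "party.orig": ["D", "D", "R", "D"],
    "comtype": ["H", "H", "S", "H"],
    "dwdime": [-0.5, -0.3, 0.6, -0.2],
})

# Long format: one row per (cycle, state, party, committee type).
DIMExCand_means = (
    DIMExCand.groupby(["cycle", "coded_state", "party.orig", "comtype"])
    .mean()
    .reset_index()
)

# Stand-in for the master table (an assumption): one row per (cycle, state),
# with the key columns renamed to match the names used in the question.
DIMExCand_master = (
    DIMExCand.groupby(["cycle", "coded_state"])["dwdime"]
    .mean()
    .rename("dwdime_overall")  # hypothetical column name
    .reset_index()
    .rename(columns={"cycle": "Cycle", "coded_state": "State"})
)

merged = pd.merge(
    DIMExCand_means,
    DIMExCand_master,
    left_on=["cycle", "coded_state"],
    right_on=["Cycle", "State"],
)
print(merged)
```

Because the left table has several rows per cycle and state, the merge repeats each master row for every matching combination.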
Thanks!
I ended up going with:
DIMExCand_dime = pd.pivot_table(DIMExCand, values='dwdime', index=["Cycle","State"], columns='ID', aggfunc=np.mean)
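A minimal sketch of that pivot on toy data follows. How the 'ID' column was built is not shown in the thread, so the construction here (joining 'party.orig' and 'comtype' with an underscore) is an assumption, as are the data values. The sketch passes the aggregation as the string "mean", which pandas also accepts.

```python
import pandas as pd

# Toy data; the values are invented for illustration only.
DIMExCand = pd.DataFrame({
    "Cycle": [2000, 2000, 2000, 2002],
    "State": ["CA", "CA", "CA", "NY"],
    "party.orig": ["D", "D", "R", "D"],
    "comtype": ["H", "H", "S", "H"],
    "dwdime": [-0.5, -0.3, 0.6, -0.2],
})

# Assumption: 'ID' labels each party/committee-type combination.
DIMExCand["ID"] = DIMExCand["party.orig"] + "_" + DIMExCand["comtype"]

# One row per (Cycle, State), one column per ID, cells hold the mean dwdime.
DIMExCand_dime = pd.pivot_table(
    DIMExCand,
    values="dwdime",
    index=["Cycle", "State"],
    columns="ID",
    aggfunc="mean",
)
print(DIMExCand_dime)
```

Combinations that never occur for a given cycle and state show up as NaN, so no rows are dropped for lacking a particular party/committee-type pair.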