
Creating a column variable taking the mean of a variable conditional on two other variables

I have a data frame that shows the mean 'dwdime' for each of the given conditions:

DIMExCand_means = DIMExCand.groupby(['cycle', 'coded_state', 'party.orig', 'comtype']).mean()

I have created a pivot table from DIMExCand_means with the following command and output:

DIMExCand_master = pd.pivot_table(DIMExCand_means,index=["Cycle","State"])

However, some data gets lost in the process. I would like to add columns to the 'DIMExCand_master' dataframe that include the mean 'dwdime' score for each possible combination of 'party.orig' and 'comptype', as this will allow me to have one entry per 'cycle'-'coded_state' pair.

Let's try:

DIMExCand_means = DIMExCand_means.reset_index()
DIMExCand_master = DIMExCand_master.reset_index()

pd.merge(DIMExCand_means, DIMExCand_master, left_on=['cycle','coded_state'], right_on=['Cycle','State'])
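
For illustration, here is a minimal sketch of that merge on made-up data. All values are invented, and the way the master frame gets its 'Cycle' and 'State' columns (a rename of the lowercase keys) is an assumption, since the post doesn't show how they arose.

```python
import pandas as pd

# Toy stand-in for DIMExCand; every value here is invented for illustration.
DIMExCand = pd.DataFrame({
    "cycle": [2010, 2010, 2010, 2012],
    "coded_state": ["CA", "CA", "CA", "TX"],
    "party.orig": ["D", "D", "R", "R"],
    "comtype": ["H", "H", "S", "H"],
    "dwdime": [-1.0, -0.5, 0.8, 1.2],
})

# Mean 'dwdime' per cycle/state/party/committee-type combination.
keys = ["cycle", "coded_state", "party.orig", "comtype"]
DIMExCand_means = DIMExCand.groupby(keys)["dwdime"].mean().reset_index()

# Hypothetical master frame: one row per cycle/state with capitalized key names.
DIMExCand_master = (
    DIMExCand.groupby(["cycle", "coded_state"])["dwdime"].mean()
    .reset_index()
    .rename(columns={"cycle": "Cycle", "coded_state": "State",
                     "dwdime": "dwdime_overall"})
)

# Join the per-combination means onto the per-cycle/state rows.
merged = pd.merge(
    DIMExCand_means, DIMExCand_master,
    left_on=["cycle", "coded_state"], right_on=["Cycle", "State"],
)
print(merged)
```

On this toy data the merged frame keeps one row per party/committee-type combination, so it stays in long form rather than one row per cycle-state pair.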

Thanks!

I ended up going with:

DIMExCand_dime = pd.pivot_table(DIMExCand, values='dwdime', index=["Cycle","State"], columns='ID', aggfunc=np.mean)
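
A minimal sketch of this pivot on made-up data may help show the resulting shape. The post doesn't say how 'ID' was built; here it is assumed to be a string combining 'party.orig' and 'comtype'. The 'Cycle' and 'State' columns and all values are likewise invented, and the string 'mean' is used in place of np.mean, which should be equivalent here.

```python
import pandas as pd

# Toy stand-in for DIMExCand; every value here is invented for illustration.
DIMExCand = pd.DataFrame({
    "Cycle": [2010, 2010, 2010, 2012],
    "State": ["CA", "CA", "CA", "TX"],
    "party.orig": ["D", "D", "R", "R"],
    "comtype": ["H", "H", "S", "H"],
    "dwdime": [-1.0, -0.5, 0.8, 1.2],
})

# Assumed construction of 'ID': one label per party/committee-type combination.
DIMExCand["ID"] = DIMExCand["party.orig"] + "_" + DIMExCand["comtype"]

# One row per Cycle/State, one column per ID, cells hold the mean 'dwdime'.
DIMExCand_dime = pd.pivot_table(
    DIMExCand, values="dwdime", index=["Cycle", "State"],
    columns="ID", aggfunc="mean",
)
print(DIMExCand_dime)
```

Combinations that never occur for a given Cycle/State pair show up as NaN. Passing columns=['party.orig', 'comtype'] directly would give the same layout with a two-level column index instead of a combined 'ID' label.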

