
How to change the values of several dummy variables in R?

I have 167 dummy variables, among other variables, in my data frame. To create new data for prediction, I want to set the first dummy variable to 1 and all the other dummy variables to 0. The dummy variables are called district_code2, district_code3, district_code4 and so on, so I want district_code2 to take the value 1 and all the others to take the value 0.

I created these dummy variables from a factor and added them to my data using model.matrix, as in:

 dummies <- data.frame(model.matrix(~district_code, data = data_wht_81_09))
 # drop the intercept column
 dummies1 <- dummies[, -1]

I need the dummies in my data because, after running the regression, I do not want the coefficients on all the dummies to enter my prediction. I want to plot the fitted value against one variable while holding all the others at their means. For the district dummies this amounts to adding a constant to all the fitted values, so I want to set all the other dummy variables to 0. Maybe there is a more efficient way to do this. Below is a sample of the object dummies1.

 dput(head(dummies1,4))
 structure(list(district_code2 = c(0, 0, 0, 0), district_code3 = c(0, 
 0, 0, 0), district_code4 = c(0, 0, 0, 0), district_code5 = c(0, 
 0, 0, 0), district_code6 = c(0, 0, 0, 0), district_code7 = c(0,0, 0, 0), 

I am only displaying the first 6 variables. How can I do this? Many thanks in advance.

There is rarely any need to manipulate dummy variables yourself (R does that behind the scenes when you use factors), but if it is absolutely needed, you can identify the columns whose names start with district_code and change their values; the other columns are left as is.

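# Sample data: six district dummies plus one unrelated column x
d <- data.frame(
  district_code2 = c(0, 0, 0, 0),
  district_code3 = c(0, 0, 0, 0),
  district_code4 = c(0, 0, 0, 0),
  district_code5 = c(0, 0, 0, 0),
  district_code6 = c(0, 0, 0, 0),
  district_code7 = c(0, 0, 0, 0),
  x = 1:4
)
library(stringr)
# Set every district dummy to 0; columns such as x are untouched
d[, str_detect(names(d), "^district_code[0-9]+")] <- 0
# Set the first dummy to 1 by name rather than by column position
d$district_code2 <- 1
d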
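
If the district is kept as a factor in the model formula, the dummies never need to be edited by hand. The following is only a minimal sketch of that idea on simulated data: the data frame dat, the columns y, x and z, and the five district levels are all assumptions made up for illustration, not the asker's data.

```r
# Hypothetical toy data (assumption): 5 districts instead of 168,
# one variable of interest x and one other covariate z.
set.seed(1)
dat <- data.frame(
  district_code = factor(sample(1:5, 200, replace = TRUE)),
  x = rnorm(200),
  z = rnorm(200)
)
dat$y <- 1 + 2 * dat$x - dat$z + as.integer(dat$district_code) + rnorm(200)

# lm() expands the factor into dummy variables internally.
fit <- lm(y ~ x + z + district_code, data = dat)

# New data: vary x, hold z at its mean, and fix the district at level "2".
# The factor must carry the same levels as in the fitted data.
newdata <- data.frame(
  x = seq(min(dat$x), max(dat$x), length.out = 50),
  z = mean(dat$z),
  district_code = factor("2", levels = levels(dat$district_code))
)
newdata$fit <- predict(fit, newdata = newdata)
head(newdata)
```

Fixing the factor at level "2" plays the same role as setting district_code2 to 1 and every other dummy to 0, and plot(newdata$x, newdata$fit, type = "l") would then draw the fitted line against x.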
