
R: how do I output the factor level from a for loop rather than the index?

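I have a data frame, and I am running a Monte Carlo simulation with for loops to generate a simulated distribution. While I test the simulation code, I only access the first observation in the data frame: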

Male.MC <-c()
for (j in 1:100){
    for (i in 1:1)  {
        # u2 <- Male.DistF$Male.stddev_u2[i] * rnorm(1, mean = 0, sd = 1)
        u2 <- Male.DistF$RndmEffct[i] * rnorm(1, mean = 0, sd = 1)
        mc_bca <- Male.DistF$lmefits[i] + u2
        temp <- Lambda.Value*mc_bca+1
        ginv_a <- temp^(1/Lambda.Value)
        d2ginv_a <- max(0,(1-Lambda.Value)*temp^(1/Lambda.Value-2))
        mc_amount <- ginv_a + d2ginv_a * Male.DistF$Male.var[i]^2 / 2
        z <- c(RespondentID <- Male.DistF$RespondentID[i],
               Male.DistF$AgeFactor[i], Male.DistF$SampleWeight[i],
               Male.DistF$Male.var[i], Male.DistF$lmefits[i], u2, mc_amount)
        Male.MC <- as.data.frame(rbind(Male.MC,z))
    }
}
colnames(Male.MC) <- c("RespondentID", "AgeFactor", 
                       "SampleWeight", "VarByAge", 
                       "lmefits", "u2", "mc_amount")

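The code works beautifully, except that Male.DistF$RespondentID is a factor, and instead of the factor level I get the factor index in the output. In this case I get 1, because the RespondentIDs are in ascending order in the Male.DistF data frame. I have the same problem with AgeFactor: I get the index rather than the factor level.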

head(Male.MC)
  RespondentID AgeFactor SampleWeight  VarByAge  lmefits         u2 mc_amount
z            1         3    0.4952835 0.4189871 15.22634  0.2334501 11582.681
2            1         3    0.4952835 0.4189871 15.22634  0.3205741 11984.220
3            1         3    0.4952835 0.4189871 15.22634 -0.5674165  8420.678
4            1         3    0.4952835 0.4189871 15.22634 -0.5426489  8505.421
5            1         3    0.4952835 0.4189871 15.22634  0.4878695 12790.565
6            1         3    0.4952835 0.4189871 15.22634  0.1556925 11234.583

How do I get the Male.MC data frame to contain the factor levels for these two variables? I have tried:

z <- c(RespondentID <- as.character(Male.DistF$RespondentID[i]), 
       Male.DistF$AgeFactor[i], Male.DistF$SampleWeight[i], 
       Male.DistF$Male.var[i], Male.DistF$lmefits[i], u2, mc_amount)

z <- c((as.character(Male.DistF$RespondentID[i])), 
       Male.DistF$AgeFactor[i], Male.DistF$SampleWeight[i], 
       Male.DistF$Male.var[i], Male.DistF$lmefits[i], u2, mc_amount)

to fix the RespondentID output, but I have done something wrong with that syntax, and it now tries to convert all of the output to factors:

There were 50 or more warnings (use warnings() to see the first 50)
str(Male.MC)
'data.frame':   100 obs. of  7 variables:
$ RespondentID: Factor w/ 1 level "100020": 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "names")= chr  "z" "" "" "" ...
$ AgeFactor   : Factor w/ 1 level "3": 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "names")= chr  "z" "" "" "" ...
$ SampleWeight: Factor w/ 1 level "0.495283471": 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "names")= chr  "z" "" "" "" ...
$ VarByAge    : Factor w/ 1 level "0.418987052181831": 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "names")= chr  "z" "" "" "" ...
$ lmefits     : Factor w/ 1 level "15.2263403968895": 1 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "names")= chr  "z" "" "" "" ...
$ u2          : Factor w/ 1 level "-0.100954008424162": 1 NA NA NA NA NA NA NA NA NA ...
..- attr(*, "names")= chr  "z" "" "" "" ...
$ mc_amount   : Factor w/ 1 level "10151.4582133747": 1 NA NA NA NA NA NA NA NA NA ...
..- attr(*, "names")= chr  "z" "" "" "" ...

For testing, here are the first couple of rows of the input data frame Male.DistF:

     AgeFactor RespondentID SampleWeight IntakeAmt   RndmEffct NutrientID Gender Age BodyWeight  IntakeDay BoxCoxXY  lmefits      lmeres   TotWts   GrpWts NumSubjects TotSubjects  Male.var
1725     9to13       100020    0.4952835 12145.852  0.30288536        267      1  12       51.6 Day1Intake 15.61196 15.22634  0.27138449 2291.827 763.0604         525        2249 0.4189871
203     14to18       100419    0.3632839  9591.953  0.02703093        267      1  14       46.3 Day1Intake 15.01444 15.31373 -0.18039624 2291.827 472.3106         561        2249 0.3365423

Lambda.Value is 0.1. The structure of Male.DistF is:

str(Male.DistF)
'data.frame':   2249 obs. of  18 variables:
$ AgeFactor   : Ord.factor w/ 4 levels "1to3"<"4to8"<..: 3 4 3 4 2 2 3 1 1 3 ...
$ RespondentID: Factor w/ 2249 levels "100020","100419",..: 1 2 3 4 5 6 7 8 9 10 ...
$ SampleWeight: num  0.495 0.363 0.495 1.326 2.12 ...
$ IntakeAmt   : num  12146 9592 7839 11113 7150 ...
$ RndmEffct   : num  0.3029 0.027 0.0772 0.4667 -0.1593 ...
$ NutrientID  : int  267 267 267 267 267 267 267 267 267 267 ...
$ Gender      : int  1 1 1 1 1 1 1 1 1 1 ...
$ Age         : int  12 14 11 15 6 5 10 2 2 9 ...
$ BodyWeight  : num  51.6 46.3 46.1 63.2 28.4 18 38.2 14.4 14.6 32.1 ...
$ IntakeDay   : Factor w/ 2 levels "Day1Intake","Day2Intake": 1 1 1 1 1 1 1 1 1 1 ...
$ BoxCoxXY    : num  15.6 15 14.5 15.4 14.3 ...
$ lmefits     : num  15.2 15.3 15 15.8 14.3 ...
$ lmeres      : num  0.271 -0.18 -0.342 -0.424 -0.053 ...
$ TotWts      : num  2292 2292 2292 2292 2292 ...
$ GrpWts      : num  763 472 763 472 779 ...
$ NumSubjects : int  525 561 525 561 613 613 525 550 550 525 ...
$ TotSubjects : int  2249 2249 2249 2249 2249 2249 2249 2249 2249 2249 ...
$ Male.var    : num  0.419 0.337 0.419 0.337 0.267 ...

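As you can see from my Male.DistF data, for the 100 replicates of the first observation I want the Male.MC data frame to show 100020 as the RespondentID (rather than 1) and 9to13 as the AgeFactor (rather than 3). What is wrong with my output instruction, and how do I fix it? In particular, I don't understand why my attempt to use as.character went astray and affected the whole output. I would also welcome suggestions for speeding up the loop. All I am doing is constructing 100 sets of values for each observation in the Male.DistF data frame.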

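You could try replacing the line

z <- c(...

which creates the new row as a vector, and therefore forces all elements to have the same type, with a one-row data.frame, which keeps the type of each column: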

z <- data.frame(
  RespondentID = Male.DistF$RespondentID[i], 
  AgeFactor    = Male.DistF$AgeFactor[i], 
  SampleWeight = Male.DistF$SampleWeight[i], 
  VarByAge     = Male.DistF$Male.var[i], 
  lmefits      = Male.DistF$lmefits[i], 
  u2           = u2, 
  mc_amount    = mc_amount
)
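
The coercion behind both symptoms in the question can be seen in isolation. The following is a minimal sketch, not the asker's data: `df` is a made-up two-row stand-in for Male.DistF, and `z_codes`, `z_chars` and `z_row` are names introduced here for illustration only.

```r
# Hypothetical stand-in for two rows of Male.DistF (not the real data).
df <- data.frame(
  RespondentID = factor(c("100020", "100419")),
  SampleWeight = c(0.4952835, 0.3632839)
)
i <- 2

# c() builds an atomic vector, so the factor is reduced to its integer code
# and the level label is lost.
z_codes <- c(df$RespondentID[i], df$SampleWeight[i])

# With as.character(), c() has to coerce every element to character instead,
# so the numeric values become strings as well.
z_chars <- c(as.character(df$RespondentID[i]), df$SampleWeight[i])

# A one-row data.frame stores each column separately and keeps its type.
z_row <- data.frame(
  RespondentID = df$RespondentID[i],
  SampleWeight = df$SampleWeight[i]
)

str(z_codes)
str(z_chars)
str(z_row)
```

This would also account for the str() output in the question: rbind() on character vectors gives a character matrix, and in R versions before 4.0.0, where stringsAsFactors defaulted to TRUE, as.data.frame() turns each character column into a factor. Later rows then carry values that are not among the existing levels, which is the usual source of NA entries and "invalid factor level" warnings.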
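
On the request to speed up the loop: since every step in the loop body is elementwise, one possible approach is to drop both loops and work on whole columns. The sketch below is an assumption-laden illustration rather than a tested replacement: the two-row Male.DistF is rebuilt by hand from the rows printed in the question, the AgeFactor level order is inferred from the str() output, and `n_rep` and `idx` are names introduced here.

```r
set.seed(1)  # only so that the sketch is reproducible

# Hypothetical two-row stand-in for Male.DistF, using values shown in the question.
Male.DistF <- data.frame(
  RespondentID = factor(c("100020", "100419")),
  AgeFactor    = factor(c("9to13", "14to18"),
                        levels = c("1to3", "4to8", "9to13", "14to18"),
                        ordered = TRUE),  # level order assumed from str()
  SampleWeight = c(0.4952835, 0.3632839),
  RndmEffct    = c(0.30288536, 0.02703093),
  lmefits      = c(15.22634, 15.31373),
  Male.var     = c(0.4189871, 0.3365423)
)
Lambda.Value <- 0.1
n_rep <- 100  # replicates per observation

# Repeat each row index n_rep times, then compute every replicate at once.
idx <- rep(seq_len(nrow(Male.DistF)), each = n_rep)

u2        <- Male.DistF$RndmEffct[idx] * rnorm(length(idx), mean = 0, sd = 1)
mc_bca    <- Male.DistF$lmefits[idx] + u2
temp      <- Lambda.Value * mc_bca + 1
ginv_a    <- temp^(1 / Lambda.Value)
# pmax() is the elementwise counterpart of the scalar max() used in the loop.
d2ginv_a  <- pmax(0, (1 - Lambda.Value) * temp^(1 / Lambda.Value - 2))
mc_amount <- ginv_a + d2ginv_a * Male.DistF$Male.var[idx]^2 / 2

# Indexing the factor columns with idx keeps them as factors.
Male.MC <- data.frame(
  RespondentID = Male.DistF$RespondentID[idx],
  AgeFactor    = Male.DistF$AgeFactor[idx],
  SampleWeight = Male.DistF$SampleWeight[idx],
  VarByAge     = Male.DistF$Male.var[idx],
  lmefits      = Male.DistF$lmefits[idx],
  u2           = u2,
  mc_amount    = mc_amount
)

head(Male.MC)
```

Two differences from the original loop are worth noting: the rows come out grouped by observation rather than by replicate, and the random draws are consumed in a different order, so individual values will not match a looped run even with the same seed. Building the data frame once also avoids growing Male.MC with rbind() on every iteration.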
