How can I change the values in one column based on a comparison against another column?

Question

Here's an example of my data:

     essay ns0_nns1 A_pred B_pred A_pred01 B_pred01
 1      1        1  0.558  0.370       NA       NA
 2      2        0  0.293  0.654       NA       NA
 3      3        0  0.545  0.849       NA       NA
 4      4        0  0.432  0.698       NA       NA
 5      5        1  0.651  0.404       NA       NA
 6      6        0  0.657  0.502       NA       NA
 7      7        1  0.884  0.658       NA       NA
 8      8        1  0.736  0.348       NA       NA
 9      9        0  0.532  0.791       NA       NA
 10    10        0  0.180  0.789       NA       NA

I need to go through each row: if A_pred is <= 0.5, the corresponding row in A_pred01 should be assigned 0; otherwise it should be assigned 1.

I thought I could do this with a for loop, so I came up with:

    for(i in dat$A_pred){
        if(i<=0.5){
            dat$A_pred01[i]=0
        } else {
            dat$A_pred01[i]=1}
     }

This didn't work, though. I guess what I need to know is: can I somehow have a placeholder for A_pred01 that corresponds to i, so that each A_pred01 value is changed as the for loop goes along? I hope what I'm asking makes sense, thanks.

Answer 1

If you would like to fix the loop, make i iterate over the row indices ( 1 2 3 4 5 ... ) instead of over the values of the column. Your original code didn't work because i was a value like 0.558 . So when dat$A_pred01[i] ran, it was really dat$A_pred01[0.558] , which isn't what you intended. R truncates a fractional index toward zero, and since every value in A_pred is below 1, each assignment targeted index 0 and changed nothing.

for (i in seq_len(nrow(dat))) {
    if (dat$A_pred[i] <= 0.5) {
        dat$A_pred01[i] <- 0
    } else {
        dat$A_pred01[i] <- 1
    }
}
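To see why the original loop left A_pred01 untouched, here is a small sketch on a toy vector. The names x, vals and after_value_loop are made up for illustration. It relies on R truncating fractional subscripts toward zero, and on assignment to index 0 being a no-op.

```r
# Toy stand-ins for dat$A_pred01 and dat$A_pred (hypothetical names)
x <- c(NA, NA, NA)
vals <- c(0.558, 0.293, 0.545)

# Looping over the values, as in the question: each index truncates to 0
for (i in vals) {
    x[i] <- if (i <= 0.5) 0 else 1
}
after_value_loop <- x
print(after_value_loop)  # still all NA: nothing was assigned

# Looping over positions instead fills every element
for (i in seq_along(vals)) {
    x[i] <- if (vals[i] <= 0.5) 0 else 1
}
print(x)  # 1 0 1
```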

Vectorized

You can also avoid the loop altogether with:

dat$A_pred01 <- as.integer(dat$A_pred > 0.5)

The expression dat$A_pred > 0.5 is a logical vector indicating whether each element satisfies the condition ( TRUE FALSE TRUE ... ). We then coerce it to 1s and 0s with as.integer .

#    essay ns0_nns1 A_pred B_pred A_pred01 B_pred01
# 1      1        1  0.558  0.370        1       NA
# 2      2        0  0.293  0.654        0       NA
# 3      3        0  0.545  0.849        1       NA
# 4      4        0  0.432  0.698        0       NA
# 5      5        1  0.651  0.404        1       NA
# 6      6        0  0.657  0.502        1       NA
# 7      7        1  0.884  0.658        1       NA
# 8      8        1  0.736  0.348        1       NA
# 9      9        0  0.532  0.791        1       NA
# 10    10        0  0.180  0.789        0       NA
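The intermediate steps can be inspected on their own, and the same idea extends to several columns at once. The sketch below rebuilds the first four rows of the example data. Applying the identical 0.5 cutoff to B_pred is an assumption, since the question only states the rule for A_pred, and the helper names above, src and dst are made up.

```r
# First four rows of the example data from the question
dat <- data.frame(
    A_pred = c(0.558, 0.293, 0.545, 0.432),
    B_pred = c(0.370, 0.654, 0.849, 0.698),
    A_pred01 = NA,
    B_pred01 = NA
)

# Step 1: the comparison alone yields a logical vector
above <- dat$A_pred > 0.5
print(above)  # TRUE FALSE TRUE FALSE

# Step 2: coercing turns TRUE/FALSE into 1/0
dat$A_pred01 <- as.integer(above)

# Assumed: B_pred uses the same cutoff, so handle both columns in one go
src <- c("A_pred", "B_pred")
dst <- c("A_pred01", "B_pred01")
dat[dst] <- lapply(dat[src], function(col) as.integer(col > 0.5))
print(dat)
```

One difference from the loop is worth knowing: a missing value in A_pred makes if() stop with an error, whereas the vectorized comparison simply yields NA for that row.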

data.table

As your data sets get larger, you may want to bring data.table into your workflow. Here is the same operation in its syntax:

library(data.table)
setDT(dat)[, A_pred01 := as.integer(A_pred > 0.5)]

Bonus

Instead of as.integer(dat$A_pred > 0.5) , try the shorter +(dat$A_pred > 0.5) .
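As a quick check that the shortcut is equivalent, the sketch below compares both forms on a small vector. The names p, short and long are made up for illustration. Unary + works here because arithmetic on a logical vector promotes it to integer.

```r
p <- c(0.558, 0.293, 0.545)  # hypothetical sample values

short <- +(p > 0.5)
long <- as.integer(p > 0.5)

print(typeof(short))           # "integer"
print(identical(short, long))  # TRUE
```

The as.integer form states the intent more explicitly, which may be preferable in shared code.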

Solution 1
2 2015-09-20 04:58:33
