
How to change values in a column while checking them against another column?

Here's an example of my data:

     essay ns0_nns1 A_pred B_pred A_pred01 B_pred01
 1      1        1  0.558  0.370       NA       NA
 2      2        0  0.293  0.654       NA       NA
 3      3        0  0.545  0.849       NA       NA
 4      4        0  0.432  0.698       NA       NA
 5      5        1  0.651  0.404       NA       NA
 6      6        0  0.657  0.502       NA       NA
 7      7        1  0.884  0.658       NA       NA
 8      8        1  0.736  0.348       NA       NA
 9      9        0  0.532  0.791       NA       NA
 10    10        0  0.180  0.789       NA       NA

I need to go through the rows: if A_pred is <= 0.5, the corresponding row in A_pred01 should be assigned 0; otherwise it should be assigned 1.

I thought I could do this with a for loop, so I came up with:

    for(i in dat$A_pred){
        if(i<=0.5){
            dat$A_pred01[i]=0
        } else {
            dat$A_pred01[i]=1}
     }

This didn't work, though. What I need to know is: can I somehow have a placeholder for A_pred01 that corresponds to i, so that each A_pred01 value is updated as the for loop goes along? I hope what I'm asking makes sense. Thanks.

If you would like to fix the loop, make i iterate over the row indices (1, 2, 3, 4, 5, ...) instead of over the values of the column. Your original code didn't work because i was a value like 0.558, so dat$A_pred01[i] became dat$A_pred01[0.558], which wasn't what you intended. R truncates a fractional index toward zero, so this pointed at index 0 and the assignment changed no element.

for (i in seq_len(nrow(dat))) {
    # i is a row index here, so both columns are addressed at the same row
    if (dat$A_pred[i] <= 0.5) {
        dat$A_pred01[i] <- 0
    } else {
        dat$A_pred01[i] <- 1
    }
}
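
To see why the original loop left the column untouched, here is a minimal sketch on a throwaway vector (the name x is made up for illustration). It relies on R truncating a fractional index toward zero, and on index 0 selecting nothing:

```r
# Hypothetical stand-in for the A_pred01 column: three missing values.
x <- rep(NA_real_, 3)

# 0.558 is truncated to 0, and index 0 selects no element,
# so this assignment changes nothing.
x[0.558] <- 0
print(x)

# Reading with the same index returns a zero-length vector.
print(length(x[0.558]))

# A proper row index updates the intended element.
x[1] <- 0
print(x)
```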

Vectorized

You can also avoid the loop altogether with:

dat$A_pred01 <- as.integer(dat$A_pred > 0.5)

The expression dat$A_pred > 0.5 is a logical vector indicating whether each element satisfies the condition (TRUE FALSE TRUE ...). We then coerce it to 1s and 0s with as.integer.

#    essay ns0_nns1 A_pred B_pred A_pred01 B_pred01
# 1      1        1  0.558  0.370        1       NA
# 2      2        0  0.293  0.654        0       NA
# 3      3        0  0.545  0.849        1       NA
# 4      4        0  0.432  0.698        0       NA
# 5      5        1  0.651  0.404        1       NA
# 6      6        0  0.657  0.502        1       NA
# 7      7        1  0.884  0.658        1       NA
# 8      8        1  0.736  0.348        1       NA
# 9      9        0  0.532  0.791        1       NA
# 10    10        0  0.180  0.789        0       NA
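
As a self-contained sketch of the vectorized approach, the snippet below rebuilds the prediction columns from the question's table and fills both 0/1 columns. Two things here are assumptions rather than part of the original answer: that B_pred01 should follow the same 0.5 rule as A_pred01, and the helper name pred_cols. The ifelse() line is an alternative spelling that states the rule exactly as the question phrases it.

```r
# Rebuild the example columns (values copied from the question's table).
dat <- data.frame(
  essay  = 1:10,
  A_pred = c(0.558, 0.293, 0.545, 0.432, 0.651, 0.657, 0.884, 0.736, 0.532, 0.180),
  B_pred = c(0.370, 0.654, 0.849, 0.698, 0.404, 0.502, 0.658, 0.348, 0.791, 0.789)
)

# ifelse() spells out the rule: 0 when the value is <= 0.5, otherwise 1.
dat$A_pred01 <- ifelse(dat$A_pred <= 0.5, 0L, 1L)

# Assumption: B_pred01 uses the same threshold as A_pred01.
# pred_cols is a hypothetical helper name for the columns to convert.
pred_cols <- c("A_pred", "B_pred")
dat[paste0(pred_cols, "01")] <- lapply(dat[pred_cols], function(x) as.integer(x > 0.5))

print(dat)
```

Both spellings return NA for a missing prediction, so rows without a value stay NA instead of silently becoming 0 or 1.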

data.table

As your data sets get larger, you may want to bring data.table into your workflow. Here is the same operation in that syntax:

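library(data.table)
# setDT() converts dat to a data.table by reference; := then updates the column in place.
# Inside [ ], columns are referred to by name, so the dat$ prefix is not needed.
setDT(dat)[, A_pred01 := as.integer(A_pred > 0.5)]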

Bonus

Instead of as.integer(dat$A_pred > 0.5), try the shorter +(dat$A_pred > 0.5). The unary plus coerces the logical vector to integer.
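
A quick check of that equivalence, as a sketch on three sample values taken from the A_pred column (the names x and flags are made up for illustration):

```r
# First three A_pred values from the question.
x <- c(0.558, 0.293, 0.545)

# The comparison yields a logical vector.
flags <- x > 0.5

# Unary plus turns TRUE/FALSE into integer 1/0, matching as.integer().
print(class(+flags))
print(identical(+flags, as.integer(flags)))
```

The shorter form saves typing, but as.integer() makes the intent more obvious to someone reading the code later.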
