
R data table: Assign a value to column based on reference column

I would like to assign a value into a column from a larger table, using another column as a reference.

Example data:

library(data.table)
# sample() is random, so the values printed below are just one possible draw
dt <- data.table(N = 1:5, GPa1 = sample(0:5, 5), GPa2 = sample(5:15, 5),
                 GPb1 = sample(0:20, 5), GPb2 = sample(0:10, 5),
                 id = c("b", "a", "b", "b", "a"))

   N GPa1 GPa2 GPb1 GPb2 id
1: 1    4   10    7    0  b
2: 2    5   15   19    7  a
3: 3    1    5   20    5  b
4: 4    0   13    3    4  b
5: 5    3    7    8    1  a

The idea is to get new columns Val1 and Val2. Any GP column ending in 1 is eligible for Val1 and any ending in 2 is eligible for Val2. The value to be inserted into the column is determined, row by row, by the id column.

So for Val1 you'd draw on the GPb1 column for row 1, then GPa1, GPb1, GPb1 again and finally GPa1.

The final result would be:

   N GPa1 GPa2 GPb1 GPb2 id Val1 Val2
1: 1    4   10    7    0  b   7    0
2: 2    5   15   19    7  a   5   15
3: 3    1    5   20    5  b  20    5
4: 4    0   13    3    4  b   3    4
5: 5    3    7    8    1  a   3    7

I did get the answer, but only in quite a few lines after melting the table and so on, and I'm sure there must be an elegant way to do this in data.table. I was initially frustrated by the fact that paste0 doesn't work in data.table:

# returns the string "GPb1", not the value stored in that column
dt[1,paste0("GP",id,"1")]

but:

# The following gives a vector that is correct for Val1 (and works for 2)
diag(as.matrix(dt[,.SD,.SDcols=dt[,paste0("GP",id,"1")]]))

# I think the answer lies in `set`, but I've not had any luck with this attempt.
for (i in 1:nrow(dt)) set(dt, i=dt[i,.SD,.SDcols=dt[,paste0("GP",id,"2")]], j=i, value=0)
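The diag() approach above builds a full n-by-n matrix only to read its diagonal. A lighter variant of the same idea, sketched here, indexes the matrix with (row, column) pairs instead. The fixed copy of the example table and the helper name pick_by_id are assumptions made for illustration; pick_by_id is not a data.table function.

```r
library(data.table)

# Fixed copy of the example table from the question (no random sampling).
dt <- data.table(
  N    = 1:5,
  GPa1 = c(4L, 5L, 1L, 0L, 3L),
  GPa2 = c(10L, 15L, 5L, 13L, 7L),
  GPb1 = c(7L, 19L, 20L, 3L, 8L),
  GPb2 = c(0L, 7L, 5L, 4L, 1L),
  id   = c("b", "a", "b", "b", "a")
)

# All GP columns as a plain matrix, one matrix column per GP column.
gp_cols <- grep("^GP", names(dt), value = TRUE)
m <- as.matrix(dt[, gp_cols, with = FALSE])

# Hypothetical helper: for each row, pick the GP column named by id + suffix.
pick_by_id <- function(suffix) {
  wanted <- paste0("GP", dt$id, suffix)              # e.g. "GPb1", "GPa1", ...
  m[cbind(seq_len(nrow(m)), match(wanted, colnames(m)))]
}

dt[, c("Val1", "Val2") := .(pick_by_id("1"), pick_by_id("2"))]
print(dt)
```

Indexing a matrix with a two-column matrix of (row, column) positions is base R behaviour, so this only touches one cell per row.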

The data is quite ugly this way, so perhaps it's better to just use the melt method.
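For reference, the melt route mentioned here might look roughly like the sketch below. It is not the original poster's code, which was not shown; it assumes every GP column name is "GP" followed by a one-letter group and a one-digit suffix, and the intermediate column names (col, grp, out) are made up for illustration.

```r
library(data.table)

# Fixed copy of the example table from the question (no random sampling).
dt <- data.table(
  N    = 1:5,
  GPa1 = c(4L, 5L, 1L, 0L, 3L),
  GPa2 = c(10L, 15L, 5L, 13L, 7L),
  GPb1 = c(7L, 19L, 20L, 3L, 8L),
  GPb2 = c(0L, 7L, 5L, 4L, 1L),
  id   = c("b", "a", "b", "b", "a")
)

# Long format: one row per (N, GP column) pair.
long <- melt(dt, id.vars = c("N", "id"),
             variable.name = "col", value.name = "value")
long[, col := as.character(col)]
long[, grp := substr(col, 3, 3)]                  # group letter: "a" or "b"
long[, out := paste0("Val", substr(col, 4, 4))]   # target column: "Val1" or "Val2"

# Keep the rows whose group letter matches that row's id, then reshape to wide.
wide <- dcast(long[grp == id], N ~ out, value.var = "value")

# Update join: copy Val1 and Val2 back onto the original table, matched on N.
dt[wide, on = "N", c("Val1", "Val2") := .(i.Val1, i.Val2)]
print(dt)
```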

dt[id == "a", c("Val1", "Val2") := .(GPa1, GPa2)]
dt[id == "b", c("Val1", "Val2") := .(GPb1, GPb2)]
#   N GPa1 GPa2 GPb1 GPb2 id Val1 Val2
#1: 1    2   13    5    8  b    5    8
#2: 2    3    8    7    2  a    3    8
#3: 3    5   11   19    1  b   19    1
#4: 4    4    5    6    9  b    6    9
#5: 5    1   15    1   10  a    1   15
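With more than two id values, writing one line per id gets repetitive. A possible generalisation, sketched under the assumption that every id g has matching columns named GP<g>1 and GP<g>2, loops over the distinct ids and does the same subset assignment each time. The fixed table is the example from the question.

```r
library(data.table)

# Fixed copy of the example table from the question (no random sampling).
dt <- data.table(
  N    = 1:5,
  GPa1 = c(4L, 5L, 1L, 0L, 3L),
  GPa2 = c(10L, 15L, 5L, 13L, 7L),
  GPb1 = c(7L, 19L, 20L, 3L, 8L),
  GPb2 = c(0L, 7L, 5L, 4L, 1L),
  id   = c("b", "a", "b", "b", "a")
)

for (g in unique(dt$id)) {
  src <- paste0("GP", g, 1:2)   # source columns for this id, e.g. "GPa1", "GPa2"
  # Assign by reference on the matching rows only; other rows are left as they are.
  dt[id == g, c("Val1", "Val2") := .SD, .SDcols = src]
}
print(dt)
```

The loop runs once per distinct id rather than once per row, so it stays cheap when the table is long but the number of groups is small.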
