Fill matrix with column values in R using colnames and rownames
I have a very large dataset, so I want to avoid loops.
I have three columns of data:
col1 = time presented as 10000, 10001, 10002, 10100, 10101, 10102, 10200, 10201, 10202, 10300, ... (total 18000 times)
col2 = id number 1 2 3 4 ... (total 500 ids)
col3 = reading associated with a particular id at a particular time: 0.1 0.5 0.6 0.7 ...
Say this is called Data3:
10000 1 0.1
10001 1 0.5
10002 1 0.6
10100 1 0.7
10200 1 0.6 (NOTE - some random entries missing)
I want to present this as a matrix (called DataMatrix), but there is missing data, so a simple reshape will not do. I want the missing data to appear as NA entries.
DataMatrix is currently an NA matrix of 500 columns and 18000 rows, where the row names and column names are the times and ids respectively.
       1  2  3  4 ....
10000 NA NA NA NA ....
10001 NA NA NA NA ....
Is there a way I can get R to go through each row of Data3 and complete DataMatrix by placing the reading Data3[,3] in the row and column of the matrix whose names match Data3[,1] and Data3[,2], but without loops?
Thanks to all you smart people out there.
Here is a solution with possible id values in 1:10 and time values in 1:20. First, create data:
mx <- matrix(c(sample(1:20, 5), sample(1:10, 5), sample(1:50, 5)), ncol=3, dimnames=list(NULL, c("time", "id", "reading")))
times <- 1:20
ids <- 1:10
mx
# time id reading
# [1,] 4 3 25
# [2,] 5 4 9
# [3,] 9 7 45
# [4,] 18 1 40
# [5,] 11 8 28
Now, use outer to pass every possible combination of time/id to a lookup function that returns the corresponding reading value:
outer(times, ids,
function(x, y) {
mapply(function(x.sub, y.sub) {
val <- mx[mx[, 1] == x.sub & mx[, 2] == y.sub, 3]
if(length(val) == 0L) NA_integer_ else val
},
x, y)
} )
This produces the (hopefully) desired answer:
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] NA NA NA NA NA NA NA NA NA NA
# [2,] NA NA NA NA NA NA NA NA NA NA
# [3,] NA NA NA NA NA NA NA NA NA NA
# [4,] NA NA 25 NA NA NA NA NA NA NA
# [5,] NA NA NA 9 NA NA NA NA NA NA
# [6,] NA NA NA NA NA NA NA NA NA NA
# [7,] NA NA NA NA NA NA NA NA NA NA
# [8,] NA NA NA NA NA NA NA NA NA NA
# [9,] NA NA NA NA NA NA 45 NA NA NA
# [10,] NA NA NA NA NA NA NA NA NA NA
# [11,] NA NA NA NA NA NA NA 28 NA NA
# [12,] NA NA NA NA NA NA NA NA NA NA
# [13,] NA NA NA NA NA NA NA NA NA NA
# [14,] NA NA NA NA NA NA NA NA NA NA
# [15,] NA NA NA NA NA NA NA NA NA NA
# [16,] NA NA NA NA NA NA NA NA NA NA
# [17,] NA NA NA NA NA NA NA NA NA NA
# [18,] 40 NA NA NA NA NA NA NA NA NA
# [19,] NA NA NA NA NA NA NA NA NA NA
# [20,] NA NA NA NA NA NA NA NA NA NA
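Note that the outer/mapply lookup above is evaluated once for every time/id pair, which could become slow at 18000 x 500. As a rough sketch of an alternative (not part of the answer above), the readings can be written into a pre-built NA matrix in one step by indexing with a two-column matrix of row and column positions. The values below are fixed copies of the sample mx so the result is reproducible; the name res is just for illustration, and the sketch assumes every time and id in mx appears in times and ids.

```r
# Same layout as mx above, with fixed values instead of sample()
mx <- matrix(c(4, 5, 9, 18, 11,     # time
               3, 4, 7, 1, 8,       # id
               25, 9, 45, 40, 28),  # reading
             ncol = 3, dimnames = list(NULL, c("time", "id", "reading")))
times <- 1:20
ids <- 1:10

# All-NA matrix whose row and column names carry the times and ids
res <- matrix(NA_real_, nrow = length(times), ncol = length(ids),
              dimnames = list(as.character(times), as.character(ids)))

# A two-column index matrix addresses one cell per row:
# column 1 is the row position, column 2 is the column position
idx <- cbind(match(mx[, "time"], times), match(mx[, "id"], ids))
res[idx] <- mx[, "reading"]

res[4, 3]      # reading stored for time 4, id 3
res["18", "1"] # the same cells can be read back by name
```

If a time/id pair occurs more than once in the data, the last reading for that pair is presumably the one that ends up in the matrix, so duplicates would need handling beforehand.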
If I understood you correctly:
Data3 <- data.frame(col1=10000:10499,
col2=1:500,
col3=round(runif(500),1))
library(reshape2)
DataMatrix <- dcast(Data3, col1~col2, value.var="col3")
DataMatrix[1:5, 1:5]
# col1 1 2 3 4
# 1 10000 0.4 NA NA NA
# 2 10001 NA 0.6 NA NA
# 3 10002 NA NA 0.9 NA
# 4 10003 NA NA NA 0.5
# 5 10004 NA NA NA NA
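Note that dcast returns a data frame with col1 as its first column, and it only contains the times and ids that actually occur in Data3. If a true matrix covering every time and id is wanted, one possible base R sketch (a different technique from the dcast call above) is tapply with factors whose levels list the full sets. The small Data3, all_times and all_ids below are made-up stand-ins for illustration, and the sketch assumes at most one reading per time/id pair.

```r
# Hypothetical small stand-in for Data3 (the real data is far larger)
Data3 <- data.frame(col1 = c(10000, 10001, 10002, 10100, 10200),
                    col2 = c(1, 1, 2, 3, 1),
                    col3 = c(0.1, 0.5, 0.6, 0.7, 0.6))

# Assumed full sets of times and ids, including ones with no readings
all_times <- c(10000, 10001, 10002, 10100, 10101, 10200)
all_ids <- 1:4

# Factor levels fix the rows and columns of the result;
# time/id combinations without a reading are left as NA
DataMatrix <- tapply(Data3$col3,
                     list(factor(Data3$col1, levels = all_times),
                          factor(Data3$col2, levels = all_ids)),
                     function(v) v[1])  # keep the first reading per cell

dim(DataMatrix)
DataMatrix["10001", "1"]
```

The result is a numeric matrix whose row names are the times and whose column names are the ids, which matches the layout described in the question.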