R: ddply only returns first column of matrix column
I have a data.frame D where some columns are matrices, e.g.:
> head(round(D$equationRTs, 1))
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 2.9 2.1 3.2 3.5 NA NA NA
[2,] 2.8 NA NA NA NA NA NA
[3,] 3.4 2.4 NA NA NA NA NA
[4,] 2.7 2.9 1.9 NA NA NA NA
[5,] 3.6 2.6 2.4 2.4 3.4 2.8 NA
[6,] 2.4 2.0 3.3 2.8 2.8 2.6 3.6
...
> dim(D$equationRTs)
[1] 11277 7
> typeof(D$equationRTs)
[1] "double"
However, when I use ddply to process D in subsets by id:
library(plyr)
my_function = function(df) {
  # Let's see what ddply passes to this function:
  print(head(round(df$equationRTs, 1)))
  print(dim(df$equationRTs))
  print(typeof(df$equationRTs))
}
D = ddply(D, .(id), my_function)
it appears that only the first column is passed to my_function, as a plain vector:
[1] 2.9 2.8 3.4 2.7 3.6 2.4
NULL
[1] "double"
Columns 2-7 are just gone. What's going on here, and how do I make the matrix column stay intact when subsets are passed to my_function?
Bonus: it seems that ddply is doing something like D$equationRTs[id==x], which indeed returns the first column of the matrix, whereas D$equationRTs[id==x, ] returns the matrix.
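To make the difference between the two indexing forms concrete, here is a minimal sketch on a made-up 3 x 2 matrix. The names m and rows are assumptions for illustration only, and integer row indices are used in place of the id==x comparison.

```r
# Hypothetical 3 x 2 matrix: column 1 holds 1,2,3 and column 2 holds 4,5,6.
m <- matrix(1:6, nrow = 3)
rows <- c(1L, 3L)  # integer row indices standing in for one group

# Without the comma, the matrix is indexed as a plain vector (column-major),
# so indices 1 and 3 pick elements of the first column only.
as_vector <- m[rows]

# With the comma, whole rows are selected and the result stays a matrix.
as_matrix <- m[rows, , drop = FALSE]

print(as_vector)
print(dim(as_vector))  # NULL: the dim attribute is gone
print(dim(as_matrix))
```

The drop = FALSE argument is there so that a group with a single row would still come back as a one-row matrix rather than a vector.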
You can simply code by hand what ddply does (or should do...).
So replace the old call
D = ddply(D, .(id), my_function)
with a loop over the levels of id (check_quality takes the place of my_function here):
for(id in levels(D$id)) {
D[D$id == id, ] = check_quality(D[D$id == id, ])
}
The manual version has the desired behavior: row-indexing the data.frame with D[D$id == id, ] keeps the matrix column intact.
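As a rough check that row-indexing a data.frame preserves a matrix column, here is a minimal base-R sketch on made-up data. The three-row D, the group labels a and b, and the names pieces and idx are assumptions for illustration, not taken from the original data; only the column names id and equationRTs follow the question.

```r
# Hypothetical toy data: a factor id plus a 3 x 2 matrix stored as one column.
D <- data.frame(id = factor(c("a", "a", "b")))
D$equationRTs <- matrix(c(2.9, 2.8, 3.4, 2.1, NA, 2.4), nrow = 3)

# Split the row indices by group, then subset the data.frame by rows.
# Row subsetting of a data.frame keeps matrix columns as matrices.
pieces <- lapply(split(seq_len(nrow(D)), D$id), function(idx) {
  D[idx, , drop = FALSE]
})

# Each piece still carries a matrix column with both columns present.
print(dim(pieces$a$equationRTs))
print(dim(pieces$b$equationRTs))
```

Each element of pieces can then be passed to the per-group function, which is what the loop above does one group at a time.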