Selecting a large subset from a data.frame

Question

I have a huge data set:

library(gtools)
a<-permutations(2,20,v=c(0,1),repeats.allowed=TRUE)
a<-as.data.frame(a)

I have a matrix:

set.seed(123)
b<-replicate(5,sample(1:20,5, replace=T))
b<-t(b)

For each row of 'a', I want to select the columns specified by each column of 'b'.

To do this, I run the following command:

for (i in 1:nrow(a)) sapply(1:ncol(b), function(y) a[i,c(as.vector(b[,y]))])

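As a result, I would like a matrix or data.frame containing the selected columns of 'a' for every row of 'a'.

The problem is that this process is very slow. I wonder if there is a faster way.

The example above really does show how slow the process is. Here is a smaller example: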

library(gtools)
a<-permutations(2,5,v=c(0,1),repeats.allowed=TRUE)
a<-as.data.frame(a)

set.seed(123)
b<-replicate(5,sample(1:5,5, replace=T))
b<-t(b)

Here is a step-by-step description of what I want:

1. Select the i-th row in 'a'.
2. Select the y-th column in 'b'.
3. Select those elements in the i-th row of 'a' that are specified by the y-th column in 'b'.
4. Repeat 2. and 3. until all columns of 'b' have been used.

This is done with the following command:

sapply(1:ncol(b), function(y) a[i,c(as.vector(b[,y]))])

Repeat steps 1-4 for each row of 'a'.

This can be done by adding a for loop:

for (i in 1:nrow(a)) sapply(1:ncol(b), function(y) a[i,c(as.vector(b[,y]))])
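
One thing worth noting about the loop above: the value returned by sapply is never assigned, so nothing is kept once an iteration finishes. The following is only a minimal sketch of how the results could be stored, not a faster method. The names a_small, b_small, row_i and res are placeholders introduced here, and expand.grid is used purely as a base-R stand-in for the gtools call so the snippet runs without extra packages.

```r
# Stand-in for the small example: all 0/1 rows of length 5, base R only
# (assumption: used here instead of gtools::permutations).
a_small <- expand.grid(rep(list(0:1), 5))[, 5:1]

set.seed(123)
b_small <- t(replicate(5, sample(1:5, 5, replace = TRUE)))

# One list element per row of a_small. Each element is a matrix whose
# y-th column holds the entries of row i picked out by b_small[, y].
res <- vector("list", nrow(a_small))
for (i in seq_len(nrow(a_small))) {
  # Turn the one-row data.frame into a plain vector before indexing.
  row_i <- unlist(a_small[i, ], use.names = FALSE)
  res[[i]] <- sapply(seq_len(ncol(b_small)), function(y) row_i[b_small[, y]])
}
```

Preallocating the list keeps the loop from growing an object on every iteration, but the row-by-row access is still what makes this approach slow on the full data set.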

Answer 1

Using a smaller subset of 'a':

 a1 <- a[1:22,]
 a2 <- as.matrix(a1[,c(b)])

 res1 <- lapply(split(a2, row(a2)), function(x) { matrix(x,ncol=ncol(b))})

Or keep it in an array:

 arr1 <- array(t(a2), dim=c(5,5,22))

res1[[22]]
#      [,1] [,2] [,3] [,4] [,5]
#[1,]    0    1    0    1    0
#[2,]    0    0    1    0    0
#[3,]    1    0    0    0    0
#[4,]    1    0    0    0    1
#[5,]    1    0    0    1    0

arr1[,,22]
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    0    1    0    1    0
# [2,]    0    0    1    0    0
# [3,]    1    0    0    0    0
# [4,]    1    0    0    0    1
# [5,]    1    0    0    1    0
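
Why a single indexing step is enough: c(b) flattens 'b' column by column, so a[, c(b)] gathers, for every row of 'a' at once, the entries for the first column of 'b', then the second, and so on. Below is a rough sketch that checks this layout against the loop from the question. The names a_df, a_mat, picked, arr, loop_res and same are illustrative, expand.grid is a base-R stand-in for the gtools call, and converting to a matrix before indexing is an assumption added here rather than part of the answer above.

```r
# Stand-in data: all 0/1 rows of length 5 (assumption: replaces gtools).
a_df  <- expand.grid(rep(list(0:1), 5))[, 5:1]
a_mat <- as.matrix(a_df)   # index a matrix rather than a data.frame

set.seed(123)
b <- t(replicate(5, sample(1:5, 5, replace = TRUE)))

# Vectorised: one column-index operation covers all rows of a_mat.
picked <- a_mat[, c(b), drop = FALSE]   # nrow(a) x (nrow(b) * ncol(b))

# Reshape so that arr[, , i] is the nrow(b) x ncol(b) result for row i.
arr <- array(t(picked), dim = c(nrow(b), ncol(b), nrow(a_mat)))

# Reference: the loop from the question, with its results stored.
loop_res <- lapply(seq_len(nrow(a_mat)), function(i) {
  sapply(seq_len(ncol(b)), function(y) a_mat[i, b[, y]])
})

# TRUE when every slice of arr matches the corresponding loop result.
same <- all(vapply(seq_len(nrow(a_mat)), function(i) {
  all(arr[, , i] == loop_res[[i]])
}, logical(1)))
```

The transpose before array() matters: array fills its first dimension fastest, so each row of picked has to become a column for the slices to line up with the per-row matrices.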
