R subset data frame by two vectors

I have a data frame and two integer vectors named left and right.

I want to create a subset of the data frame in which the numbers in the vectors give the range of columns to keep for each row.

For example, for the nth row in the data frame, I want to keep the values df[n, left[n]:right[n]]. I tried doing so using mapply():

aligned_rows<-apply(df,1,
                function(x) mapply(function(y,z)x[y:z], left, right))

But I got output that didn't make any sense.

This command does the trick:

as.data.frame(t(mapply(function(x,y,z) df[x,y:z],
                       x=seq_len(nrow(df)),y=left,z=right)))

Here is an example:

set.seed(10)
df <- data.frame(replicate(8,runif(4)))

#    X1    X2    X3    X4    X5    X6    X7    X8
#  0.51  0.09  0.62  0.11  0.05  0.86  0.41  0.77
#  0.31  0.23  0.43  0.60  0.26  0.62  0.71  0.36
#  0.43  0.28  0.65  0.36  0.40  0.78  0.84  0.54
#  0.69  0.27  0.57  0.43  0.84  0.36  0.24  0.09

Applying

left <- c(1,3,5,7)
right <- c(2,4,6,8)
as.data.frame(t(mapply(function(x,y,z) df[x,y:z],
                       x=seq_len(nrow(df)),y=left,z=right)))

yields

#    X1    X2
#  0.51  0.09
#  0.43  0.60
#  0.40  0.78
#  0.24  0.09

For this to work, every range defined by left and right must contain the same number of elements. In addition, left and right must each contain as many elements as there are rows in df.
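
Those two conditions can be checked up front. The following is only a minimal sketch, reusing the example data above; the names widths, m and res are introduced here for illustration, and stacking the rows through a matrix with lapply() and rbind() is an alternative I am assuming is acceptable, not the command shown above.

```r
set.seed(10)
df <- data.frame(replicate(8, runif(4)))
left  <- c(1, 3, 5, 7)
right <- c(2, 4, 6, 8)

# One start/end pair per row of df
stopifnot(length(left) == nrow(df), length(right) == nrow(df))

# Every range must have the same width, otherwise the result is not rectangular
widths <- right - left + 1
stopifnot(length(unique(widths)) == 1)

# Slice row i at columns left[i]:right[i], then stack the slices
m <- as.matrix(df)
res <- do.call(rbind,
               lapply(seq_len(nrow(m)), function(i) m[i, left[i]:right[i]]))
dim(res)
```

Because the checks run first, a mismatch in lengths or widths stops with an error instead of producing a silently misshapen result.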

As mentioned, the problem is not clear, but I hope the example below gives some hints:

#dummy data
df <- data.frame(matrix(runif(20,1,50),nrow=4))

#dummy left/right: first and last column of each range
left <- c(1,3,4)
right <- c(5,4,5)

#nth value, also try n <- c(2,4) to get 2nd and 4th rows
n <- 2

#return list of data.frames
lapply(seq_along(left),
       function(x) df[n,left[x]:right[x]])

Without more information, your problem is ill-posed, because there's no guarantee that the number of items you want in each row will be the same. Remember that a data frame is a rectangular object, i.e. all the rows must have the same length.

What would be more reasonable to obtain is a list, which doesn't have this restriction:

# index the rows too, so that row i keeps only columns l:r
mapply(function(i, l, r) unlist(df[i, l:r]), seq_len(nrow(df)), left, right, SIMPLIFY=FALSE)

Assuming this results in the same number of items per row, you can then combine them with rbind:

do.call(rbind, mapply(function(i, l, r) unlist(df[i, l:r]),
                      seq_len(nrow(df)), left, right, SIMPLIFY=FALSE))
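
To make the "same number of items per row" caveat concrete, here is a rough sketch with ranges of different widths. The left and right values are made up for illustration, and pieces and res are hypothetical names; it only falls back to a list when the pieces cannot be stacked.

```r
set.seed(10)
df <- data.frame(replicate(8, runif(4)))
left  <- c(1, 3, 5, 7)
right <- c(3, 4, 8, 8)   # hypothetical ranges of width 3, 2, 4, 2

# One vector per row, holding only that row's columns
pieces <- lapply(seq_len(nrow(df)),
                 function(i) unlist(df[i, left[i]:right[i]]))
lengths(pieces)  # 3 2 4 2

# Stack only when every piece has the same length; otherwise keep the list
if (length(unique(lengths(pieces))) == 1) {
  res <- do.call(rbind, pieces)
} else {
  res <- pieces
}
```

With these ranges the pieces are ragged, so res stays a list.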

There are other issues, e.g. you're potentially combining items from different columns, which would be nonsensical if the columns have different classes. But you haven't mentioned this as a problem, so I'm going to assume that your data frame is really more akin to a matrix, for which this kind of manipulation is more sensible.
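
As a small illustration of the class issue, the sketch below uses a made-up two-column data frame (mixed, row1 and same_class are hypothetical names, not part of the question) and shows what happens when a row is flattened across columns of different classes.

```r
# Hypothetical data frame with one numeric and one character column
mixed <- data.frame(id = c(1, 2), label = c("a", "b"),
                    stringsAsFactors = FALSE)

# Flattening a row across both columns coerces everything to character
row1 <- unlist(mixed[1, ])
class(row1)   # "character"

# Quick check that all columns share one class before slicing row-wise
col_classes <- vapply(mixed, function(col) class(col)[1], character(1))
same_class <- length(unique(col_classes)) == 1
same_class    # FALSE
```

If same_class is FALSE, combining values from different columns into one row will silently change their types.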
