Subset in the data frame rows in R
I have a data frame with 30 rows and 4 columns (namely x, y, z, u). It is given below.
mydata = data.frame(x = rnorm(30,4), y = rnorm(30,2,1), z = rnorm(30,3,1), u = rnorm(30,5))
Further, I have a sequence of values that represent row numbers in my data frame.
myseq = c(seq(1, 30, by = 5))
myseq
[1] 1 6 11 16 21 26
Now I want to compute the prob value for each segment of rows. For the first segment:
filt = subset(mydata[1:6,], mydata[1:6,]$x < mydata[1:6,]$y & mydata[1:6,]$z < mydata[1:6,]$u)
filt
prob = length(filt$x)/30
prob
Then I need to compute the above prob for 1:6, ..., 27:30, and so on. Here I have only 6 prob values, so I can compute them one by one. If I had 100 values, that would be tedious. Is there any way to compute all the prob values at once?
Thank you in advance.
BTW: in subset(DF[1:99,], ...), use DF[1:99,] only in the first argument and do not repeat it in the condition, as in:
subset(DF[1:99,], cumsuml < inchivaluel & cumsumr < inchivaluer)
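To illustrate the comment above with the question's mydata: subset() evaluates its condition inside the data frame given as the first argument, so the columns can be named directly. The sketch below is only an illustration; the seed and the names filt_long and filt_short are assumptions, not part of the original post.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
mydata <- data.frame(x = rnorm(30, 4), y = rnorm(30, 2, 1),
                     z = rnorm(30, 3, 1), u = rnorm(30, 5))

# Verbose form from the question: the row slice is repeated in the condition.
filt_long <- subset(mydata[1:6, ],
                    mydata[1:6, ]$x < mydata[1:6, ]$y &
                    mydata[1:6, ]$z < mydata[1:6, ]$u)

# Shorter form: column names are looked up in the first argument.
filt_short <- subset(mydata[1:6, ], x < y & z < u)

# Both forms should select exactly the same rows.
identical(filt_long, filt_short)
```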
Think about how to do this in a list.
The first step is to break your data at the va starting points. I'll start with a list of the indices to break it into:
# each segment ends one row before the next start, so segments do not overlap
inds <- mapply(seq, va, c(va[-1] - 1, nrow(DF)), SIMPLIFY=FALSE)
This is now a list of sequences, starting with 1:99, then 100:198, etc. See str(inds) to verify.
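As a rough check of the index list on the question's smaller example, here is a minimal sketch. It assumes va holds the start rows (as myseq does in the question), that the data has 30 rows, and that segments should not overlap, so each one ends one row before the next start; the names n and ends are introduced only for illustration.

```r
va   <- seq(1, 30, by = 5)    # start rows, same values as myseq in the question
n    <- 30                    # assumed number of rows in the data frame
ends <- c(va[-1] - 1, n)      # assumed segment ends: one row before the next start

# One index vector per segment: 1:5, 6:10, ..., 26:30
inds <- mapply(seq, va, ends, SIMPLIFY = FALSE)

str(inds)                     # inspect the list of index vectors
```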
Now we can subset a portion of the data based on each element's vector of indices:
filts <- lapply(inds, function(ind) subset(DF[ind,], cumsuml < inchivaluel & cumsumr < inchivaluer))
We now have a list of filtered data frames; let's summarize it:
results <- sapply(filts, function(filt) length(filt$cumsuml)/length(alpha))
Bottom line, it helps to think about how to break this problem into lists; see the examples at http://stackoverflow.com/a/24376207/3358272 .
BTW: instead of initially making a list of indices, we could just break up the data in that first step, like so:
DF2 <- mapply(function(a,b) DF[a:b,], va, c(va[-1] - 1, nrow(DF)), SIMPLIFY=FALSE)
filts <- lapply(DF2, function(x) subset(x, cumsuml < inchivaluel & cumsumr < inchivaluer))
results <- sapply(filts, function(filt) length(filt$cumsuml)/length(alpha))
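Applied to the 30-row mydata from the question, the whole approach might look like the sketch below. It is only an illustration under stated assumptions: the seed, the non-overlapping segment ends, and dividing by nrow(mydata) (the question divides by 30) are choices made here, and the names ends and segs are hypothetical.

```r
set.seed(42)  # assumption: fixed seed so the sketch is reproducible
mydata <- data.frame(x = rnorm(30, 4), y = rnorm(30, 2, 1),
                     z = rnorm(30, 3, 1), u = rnorm(30, 5))
myseq <- seq(1, 30, by = 5)                 # segment start rows: 1, 6, ..., 26

# Assumed segment ends: one row before the next start, last segment ends at row 30.
ends <- c(myseq[-1] - 1, nrow(mydata))

# Step 1: break the data frame into a list of row segments.
segs <- mapply(function(a, b) mydata[a:b, ], myseq, ends, SIMPLIFY = FALSE)

# Step 2: filter each segment with the question's condition.
filts <- lapply(segs, function(d) subset(d, x < y & z < u))

# Step 3: one prob value per segment, divided by the total row count as in the question.
prob <- sapply(filts, nrow) / nrow(mydata)
prob
```

With 100 start points instead of 6, only myseq changes; the three steps stay the same.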